Determine Language of Text
July 14th, 2009
Want to know what language a text is written in? Maybe it's a comment on your blog or an e-mail you received, or maybe you want your CMS to guess the language of an article. Enter the babel gem!
Babel uses an n-gram approach to build reference profiles for languages. It also builds a profile for the input text and then calculates the distance between each reference profile and the input profile. The closest profile is probably the one with the language of the input text. Probably. Chances are that Babel gives you wrong results on short sentences or single words, since they contain too few n-grams to build a reliable profile. You want to know more? Here is a paper that describes and compares different approaches.
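To make the idea concrete, here is a minimal sketch of n-gram profiling in plain Ruby. It is not Babel's code and does not use Babel's API: the method names (`ngram_profile`, `profile_distance`, `guess_language`), the trigram size, the profile size of 300, and the tiny reference sentences are all assumptions made up for this illustration. The distance used is the rank-based "out-of-place" measure that is common in n-gram language guessers; Babel may well compute its distance differently.

```ruby
# Illustrative sketch only. None of these names come from the babel gem.

# Build a ranked profile: the most frequent character n-grams, most frequent first.
def ngram_profile(text, n: 3, size: 300)
  counts = Hash.new(0)
  # Lowercase, collapse everything that is not a letter into single spaces.
  normalized = text.downcase.gsub(/[^[:alpha:]]+/, " ").strip
  # Pad with spaces so word boundaries at both ends produce n-grams too.
  " #{normalized} ".chars.each_cons(n) { |gram| counts[gram.join] += 1 }
  # Sort by descending count; break ties alphabetically to keep ranks stable.
  counts.sort_by { |gram, count| [-count, gram] }.first(size).map(&:first)
end

# "Out-of-place" distance: for every n-gram of the input profile, add how far
# its rank is from its rank in the reference profile. N-grams missing from the
# reference cost a fixed maximum penalty (the reference profile's size).
def profile_distance(input_profile, reference_profile)
  max_penalty = reference_profile.size
  ranks = reference_profile.each_with_index.to_h
  input_profile.each_with_index.sum do |gram, rank|
    ref_rank = ranks[gram]
    ref_rank ? (ref_rank - rank).abs : max_penalty
  end
end

# Pick the language whose reference profile is closest to the input profile.
def guess_language(text, references)
  input = ngram_profile(text)
  references.min_by { |_lang, profile| profile_distance(input, profile) }.first
end

# Toy reference profiles built from one sentence each (assumption: a real
# profile would be trained on a much larger corpus per language).
REFERENCES = {
  en: ngram_profile("the quick brown fox jumps over the lazy dog and the cat sat on the mat"),
  de: ngram_profile("der hund und die katze sitzen auf der matte und der fuchs springt")
}

puts guess_language("the dog and the cat", REFERENCES)
```

With references this small the sketch only works on text that resembles the reference sentences. It also shows why short inputs are risky: a single word produces only a handful of n-grams, so one or two missing n-grams can dominate the distance.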
We’ve set up a demo app so you can try out Babel. And of course you can get the gem from GitHub.


