Tuesday, December 09, 2008

Text Classification: uClassify

I have not looked at this in detail yet, but ReadWriteWeb writes about a public service for doing text classification ... It sounds like a great place to start a mash-up ... It is already used in applications such as trying to determine the gender of a writer. One such system, the Gender analyzer (Beta), determines that this blog is gender neutral, tipping slightly towards male (57%). The site indicates that it uses AI techniques. Depending on how you interpret the correctness figures in the site's post-survey, it gets gender right about 55% of the time, which is only slightly better than a coin flip for a two-way guess! It is still under construction. I would think that there are other AI techniques that could be provided as a public service this way.

uClassify: Ever wanted to know the language of a Web site? Or whether the text within it is considered spam? Well, it's a lot easier since the launch of uClassify, the free Web service and API out of Sweden that lets you create and train your own text classifiers.

According to uClassify's about page, a text classifier answers the question: "To which predefined category is this text most likely to belong?" Text classifiers can be used to create spam filters, categorize Web pages, detect languages, classify a batch of blog posts, and more ...
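To make the "which predefined category is this text most likely to belong to?" question concrete, here is a minimal sketch of a trainable text classifier in Python. It uses a plain naive Bayes model with add-one smoothing, which is an assumption chosen for illustration: the excerpt above does not say which algorithm uClassify uses, and this is not the uClassify API. The class name, method names, and the tiny spam/ham training set are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and split on whitespace; a real service would tokenize more carefully.
    return text.lower().split()


class NaiveBayesClassifier:
    """Hypothetical illustration only; not uClassify's implementation or API."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.doc_counts = Counter()              # category -> number of training texts
        self.vocabulary = set()                  # every word seen during training

    def train(self, text, category):
        words = tokenize(text)
        self.word_counts[category].update(words)
        self.doc_counts[category] += 1
        self.vocabulary.update(words)

    def scores(self, text):
        # Returns {category: probability}; the probabilities sum to 1.
        # Assumes train() has been called at least once.
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        words = tokenize(text)
        log_scores = {}
        for category, n_docs in self.doc_counts.items():
            counts = self.word_counts[category]
            total_words = sum(counts.values())
            log_p = math.log(n_docs / total_docs)  # prior from training frequency
            for word in words:
                # Add-one (Laplace) smoothing so unseen words do not zero out a category.
                log_p += math.log((counts[word] + 1) / (total_words + vocab_size))
            log_scores[category] = log_p
        # Convert log scores to normalized probabilities, subtracting the max for stability.
        top = max(log_scores.values())
        unnormalized = {c: math.exp(s - top) for c, s in log_scores.items()}
        norm = sum(unnormalized.values())
        return {c: v / norm for c, v in unnormalized.items()}

    def classify(self, text):
        # The answer to "which predefined category is most likely?"
        result = self.scores(text)
        return max(result, key=result.get)


# Made-up training data for a two-category spam filter.
clf = NaiveBayesClassifier()
clf.train("win free money now", "spam")
clf.train("free prize claim now", "spam")
clf.train("meeting notes for tuesday", "ham")
clf.train("lunch on tuesday with the team", "ham")

print(clf.classify("claim your free money"))
print(clf.scores("claim your free money"))
```

The per-category probabilities returned by `scores` are the same kind of output as the Gender analyzer's "57% male" figure: a confidence split across predefined categories rather than a bare yes or no. Swapping the training texts and category names turns the same sketch into a language detector or a topic sorter.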

