
Friday, April 25, 2008

Captchas and ALIPR

I wrote recently about hackers cracking Captchas, those distorted-letter puzzles you see when you sign up for new online services.

This is very serious, since it has allowed bots posing as people to release millions of spam messages. I have also been looking at methods for identifying things in arbitrary images and using them to tag pictures with identifying descriptors.

One of the most interesting approaches to this latter problem was developed at Penn State and is called ALIPR (Automatic Linguistic Indexing of Pictures). It's a fascinating mathematical programming method that can assign sets of tags to images. It turns out that this problem is one of the most difficult immediately practical AI problems known. The ALIPR approach to tagging photos is also not perfect: any particular tagging of a photo is likely to include some wrong tags. A recent Cyber Cynic blog post describes how the ALIPR method is being reapplied to the now-broken Captcha method, with links to demonstrations of the method. For those with the math background, this paper gives a perceptive view of the ALIPR method.
