
Tuesday, April 01, 2014

Topic Modeling in Machine Learning

I see that David Blei has been given the ACM-Infosys Foundation Award. In reading about his work in textual analysis, I saw that it is related to work we did years ago in 'content analysis', analyzing text from consumer comments, though his approach is far more sophisticated and useful. It is an extension of what is now called topic modeling. Here is a non-technical piece on topic modeling. Blei et al. wrote a paper on their extensions, which use Bayesian methods. Here is a technical description of that. And advanced topic models in R. All of this is worth understanding to follow the current state of the art. More in this blog on text analytics.

" ... David Blei is the recipient of the 2013 ACM-Infosys Foundation Award in the Computing Sciences. He initiated an approach to analyzing large collections of data using innovative statistical methods, known as "topic modeling," that make it possible to organize and summarize digital archives at a scale that would be impossible by human annotation.  His work is scalable to collections of billions of documents and has inspired new research programs across multiple disciplines, with applications for email archives, natural language processing, information retrieval, computational biology, social networks, and robotics as well as computational social sciences and digital humanities. ... " 

Faculty page at Princeton.

I see that Blei also has an excellent, relatively non-technical introduction to topic modeling, with pointers to additional resources that I am exploring.
