
Wednesday, April 22, 2015

Benford's Law for Twitter Data

Nicely done Tech Review piece on Benford's Law, a practical piece of pattern recognition knowledge every analyst should know. It is used in fraud and risk analysis, where it flags strangeness in the leading digits of certain kinds of numerical data. I learned it early in my career. Turns out it now has value even in the analysis of social media. Its discovery is a fun case study by itself, which makes the point that you should aim to dive into the physical whenever given only abstract data:

" ... The counterintuitive distribution of digits in certain data sets turns out to be a powerful tool for detecting strange behavior on social networks.

Back in the 1880s, the American astronomer Simon Newcomb noticed something strange about the book of logarithmic tables in his library—the earlier pages were much more heavily thumbed than later ones implying that people looked up logarithms beginning with “1” much more often than “9.”
After some investigation, he concluded that in any list of data, numbers beginning with the digit “1” must be much more common than numbers beginning with other digits. He went on to formulate mathematical rationale behind this phenomenon, which later became known as Benford’s law, after the physicist Frank Benford who discovered it independently some 50 years later. ... "

Following the quote is a look at how it applies to social data. The piece includes a link to a technical paper.
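
For anyone who wants to try the idea, here is a minimal sketch in Python. It is not taken from the article or the linked paper. It uses the standard statement of Benford's Law, that a leading digit d shows up with probability log10(1 + 1/d), and compares that against the leading digits in a list of positive integers. The function names are my own, and the sample data (powers of 2) is only a stand-in for real counts such as followers or friends.

```python
import math
from collections import Counter


def benford_expected():
    """Expected share of each leading digit 1..9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(n):
    """Return the first decimal digit of a nonzero integer."""
    n = abs(n)
    while n >= 10:
        n //= 10
    return n


def leading_digit_shares(values):
    """Observed share of each leading digit 1..9; zeros are skipped."""
    digits = [leading_digit(v) for v in values if v != 0]
    if not digits:
        raise ValueError("need at least one nonzero value")
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


# Stand-in data: powers of 2 are a classic sequence that follows the law.
sample = [2 ** k for k in range(1, 201)]

expected = benford_expected()
observed = leading_digit_shares(sample)

for d in range(1, 10):
    print(f"{d}: expected {expected[d]:.3f}, observed {observed[d]:.3f}")
```

With real data you would swap the sample list for your own counts and look for digits whose observed share sits far from the expected one. How far is "far" is a judgment call this sketch does not try to make.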
