Wednesday, June 18, 2014
Sparseness of Linguistic Big Data
An interesting comment from Language Log, and one we can confirm. It is not unique to language, either: statistically rare events are often problematic in any domain. At first I disagreed, since we generate huge amounts of textual content every day. But consider: " ... Big data is at its best when analyzing things that are extremely common, but often falls short when analyzing things that are less common. For instance, programs that use big data to deal with text, such as search engines and translation programs, often rely heavily on something called trigrams: ..... "
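For readers who have not met the term, a trigram is a run of three consecutive words. The short Python sketch below is my own illustration, not anything from the Language Log post: the sample sentence and the helper names `trigrams` and `singleton_share` are made up for the example. It counts trigrams in a tiny text and reports how many of the distinct ones occur only once, which is roughly what sparseness looks like in practice.

```python
from collections import Counter


def trigrams(tokens):
    """Return every run of three consecutive tokens as a tuple."""
    return list(zip(tokens, tokens[1:], tokens[2:]))


def singleton_share(counts):
    """Fraction of distinct trigrams that occur exactly once."""
    if not counts:
        return 0.0
    return sum(1 for c in counts.values() if c == 1) / len(counts)


# Toy text, invented for illustration only.
text = "the cat sat on the mat and the cat sat on the rug"
tokens = text.split()
counts = Counter(trigrams(tokens))

vocabulary = set(tokens)
possible = len(vocabulary) ** 3  # every ordered triple of known words

print("distinct trigrams seen:", len(counts))
print("possible trigrams over this vocabulary:", possible)
print("share seen only once:", singleton_share(counts))
```

Even in this toy case only a small slice of the possible three-word combinations ever shows up, and most of those appear just once. With a real vocabulary the number of possible trigrams grows with the cube of the vocabulary size, so plenty of perfectly sensible phrases will be rare or absent no matter how much text is collected.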