Context always matters. Text mining or elsewhere. Good example here.
Context Matters When Text Mining
Posted by Dalila Benachenhou
The most commonly followed approach can often end in failure. The reason usually has to do with assuming that one approach works in all cases. This is especially true in text mining. For instance, a common approach to clustering documents is to create a tf-idf matrix for all documents, apply SVD or another dimension-reduction algorithm, and then run a clustering algorithm. In most cases this will work; however, as I will show here, there are instances where this process will not provide the intended result. It fails because of the characteristics of the subject, or the context in which the approach is used. Recipe reviews are one of these instances.
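For readers who have not seen the common pipeline written out, here is a minimal sketch of it (tf-idf, then SVD, then clustering). It assumes scikit-learn, which the post does not name, and uses k-means as a stand-in for "a clustering algorithm". The toy sentences, the function name `cluster_documents`, and all parameter values are invented for illustration; they are not the enchilada data or the author's settings.

```python
from sklearn.cluster import KMeans
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import Normalizer

# Hypothetical toy corpus, made up for this sketch.
docs = [
    "the sauce was too spicy for my kids",
    "added extra cheese and the sauce was great",
    "baked it ten minutes longer than the recipe says",
    "baking time in the recipe is too short",
]


def cluster_documents(documents, n_components=2, n_clusters=2, seed=0):
    """Generic tf-idf -> SVD -> k-means pipeline (illustrative defaults)."""
    # Step 1: build the tf-idf matrix (one row per document).
    tfidf = TfidfVectorizer(stop_words="english")
    matrix = tfidf.fit_transform(documents)

    # Step 2: reduce dimensionality with truncated SVD, then
    # length-normalize the rows so k-means behaves like cosine clustering.
    reducer = make_pipeline(
        TruncatedSVD(n_components=n_components, random_state=seed),
        Normalizer(copy=False),
    )
    reduced = reducer.fit_transform(matrix)

    # Step 3: cluster the reduced vectors (k-means is an assumption here).
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return kmeans.fit_predict(reduced)


labels = cluster_documents(docs)
print(labels)  # one cluster id per document
```

Nothing in this sketch knows anything about the subject matter of the documents, which is exactly the property the rest of the post questions.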
To show the importance of context-driven text mining, I will use recipe reviews as an example, more precisely enchilada reviews. You can find the cleaned dataset (tokenized into sentences and words, stop words removed, and words lower-cased) on GitHub, and a full description in my previous blog post. I sampled 239 reviews, or 1616 sentences. ....
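The cleaning steps mentioned above (sentence splitting, word tokenization, lower-casing, stop-word removal) could look roughly like the sketch below. It is only an approximation under stated assumptions: the regex-based sentence splitter is naive, the stop-word list is a tiny hypothetical one, and the function name `preprocess` is made up. The post does not say which tools or lists were actually used to build the dataset.

```python
import re

# Tiny illustrative stop-word list (an assumption, not the list used for the dataset).
STOP_WORDS = {"the", "a", "an", "and", "it", "was", "i", "to", "of", "for", "my"}


def preprocess(review):
    """Split a review into sentences, then into lower-cased tokens without stop words."""
    # Naive sentence split: break on whitespace that follows ., ! or ?
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", review.strip()) if s]
    cleaned = []
    for sentence in sentences:
        # Lower-case first, then keep runs of letters and apostrophes as tokens.
        tokens = re.findall(r"[a-z']+", sentence.lower())
        cleaned.append([t for t in tokens if t not in STOP_WORDS])
    return cleaned


example = "The sauce was great. I added extra cheese!"
print(preprocess(example))
# [['sauce', 'great'], ['added', 'extra', 'cheese']]
```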