
Tuesday, November 19, 2019

Avoiding Statistical Traps

In KDnuggets. Statistical fallacies are one of the first things that practitioners in analytic spaces should study and understand. I will also add that these dangers should always be demonstrated at scale. We typically built examples with realistic scale and data types for the domain being considered; general statements of the dangers, made with small or out-of-domain data sets, usually did not make the dangers clear.

Fallacies are what we call the results of faulty reasoning. By Matthew Mayo, KDnuggets.

Statistical fallacies, a form of misuse of statistics, are poor statistical reasoning; you may have started off with sound data, but your use or interpretation of it, regardless of your possible purity of intent, has gone awry. Therefore, whatever decisions you base on these wrong moves will necessarily be incorrect.

There are infinite ways to incorrectly reason from data, some of which are much more obvious than others. Given that people have been making these mistakes for so long, many statistical fallacies have been identified and can be explained. The good thing is that once they are identified and studied, they can be avoided. Let's have a look at a few of these more common fallacies and see how we can avoid them.

Out of interest, when misuse of statistics is not intentional, the process bears a resemblance to cognitive biases, which Wikipedia defines as "tendencies to think in certain ways that can lead to systematic deviations from a standard of rationality or good judgment." The former builds incorrect reasoning on top of data and its explicit and active analysis, while the latter reaches a similar outcome much more implicitly and passively. That's not hard and fast, however, as there is definitely overlap between these 2 phenomena. The end result is the same, though: plain ol' wrong.

Here are five statistical fallacies — traps — which data scientists should be aware of and definitely avoid. The failure to do so will be catastrophic in terms of both data outcomes and a data scientist's credibility. ... "
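As a quick, self-contained illustration of the kind of trap the article means (my own sketch, not drawn from the KDnuggets piece, and assuming NumPy and SciPy are available), the multiple-comparisons problem shows how "significant" findings appear by chance when many tests are run on pure noise. Running it at a realistic scale, rather than describing it abstractly, makes the danger obvious:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n_samples, n_features, alpha = 1000, 200, 0.05

    outcome = rng.normal(size=n_samples)                 # outcome is pure noise
    features = rng.normal(size=(n_features, n_samples))  # features are pure noise

    # Correlate every feature with the outcome and collect the p-values.
    p_values = []
    for f in features:
        _, p = stats.pearsonr(f, outcome)  # test of a noise feature vs. a noise outcome
        p_values.append(p)
    p_values = np.array(p_values)

    naive_hits = int((p_values < alpha).sum())
    # Bonferroni correction: divide alpha by the number of tests performed.
    corrected_hits = int((p_values < alpha / n_features).sum())

    print(f"'Significant' correlations without correction: {naive_hits}")
    print(f"After Bonferroni correction: {corrected_hits}")

With 200 tests at alpha = 0.05, roughly ten spurious "hits" are expected even though every variable is random noise; a simple Bonferroni correction removes essentially all of them. This is the flavor of fallacy the article catalogs, and the kind of demonstration that, in my experience, lands far better than a general warning.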
