Monday, April 09, 2018

Datasets for Teaching Data Science

These datasets are useful to have for testing and teaching concepts, and the post comes with sample code.

Some datasets for teaching data science
 By Rafael Irizarry  2018/01/22

In this post I describe the dslabs package, which contains some datasets that I use in my data science courses.

A much-discussed idea in stats education is that computing should play a more prominent role in the curriculum. I strongly agree, but I think the main improvement will come from bringing applications to the forefront and mimicking, as best as possible, the challenges applied statisticians face in real life. I therefore try to avoid widely used toy examples, such as the mtcars dataset, when I teach data science. However, my experience has been that finding examples that are realistic, interesting, and appropriate for beginners is not easy. After a few years of teaching I have collected a few datasets that I think fit these criteria. To facilitate their use in introductory classes, I include them in the dslabs package: ....
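For anyone who wants to try the package right away, here is a minimal sketch in R. It is not taken from the excerpt above; it assumes dslabs can be installed from CRAN and uses only base R utilities to list whatever datasets the installed version ships with. The variable name `available` is just an illustrative choice.

```r
# Assumption: dslabs is installable from CRAN; install it once if it is missing.
if (!requireNamespace("dslabs", quietly = TRUE)) {
  install.packages("dslabs")
}

library(dslabs)

# data(package = ...) is a base R utility that reports the datasets a package provides.
# The "Item" column of its results matrix holds the dataset names.
available <- data(package = "dslabs")$results[, "Item"]
print(available)
```

Any name printed here can then be loaded with `data()` and inspected with `head()` or `str()` before using it in a lesson.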

