Review of Statistics Done Wrong by Alex Reinhart
Since the advent of the big data era, organizations have been crying out for data scientists. The initial goal was to find the true data scientist, but because such people were scarce, the search broadened to anyone who could hand-code analytical models in a Hadoop environment. As statistical tools such as R, Alteryx, and RapidMiner augmented Hadoop environments, traditional tools such as SAS and SPSS entered the mix as well. These data scientists, or data scientists in training, were asked to take large amounts of data and divine the nuggets that would create a “cross-sell/up-sell” recommendation engine to launch the next Netflix, or to link two disparate data sets that explain how markets interact and reveal the next groundbreaking investment opportunity.