A Critical Introduction to Machine Learning
Thursday, September 19, 2019 — 1:45PM - 2:45PM
What is machine learning? Textbooks give aspirational, inward-facing definitions; public treatments are vague, hype-filled, and misleading; and critics of machine learning seldom give a sense of its technical content. This introduction gives a practical overview and tutorial of machine learning, focusing on its key strength and limitation: finding robust statistical correlations to use as “predictions.” Machine learning is more flexible than statistics in the correlations it manages to find, but it has limitations relating to the circumstances in which we would want to make predictions, the meanings of prediction, and the ways in which previously observed correlations can fail to generalize. This workshop presents machine learning in the context of a key distinction between modeling whose goal is prediction and modeling whose goal is explanation, including the counterintuitive trade-off between the two goals, and takes participants through an applied case illustrating the difference. Participants will build a decision tree in R for ‘predicting’ survival aboard the Titanic, using the Titanic dataset example from a DataCamp tutorial, which is also the starting point of the critique of machine learning in Meredith Broussard’s “Artificial Unintelligence: How Computers Misunderstand the World” (MIT Press, 2018). This exercise will lead into the core issues of overfitting and of predictive vs. explanatory modeling. Going beyond existing materials on this dataset, the workshop will contrast the machine learning approach with both a social statistical approach that models the relationship between survival and demographics and a humanistic approach that looks at narrative aspects of lives lost.
Momin M. Malik, Data Science Postdoctoral Fellow, Harvard University