Modern Data Mining

STAT 471/571/701


Professor Linda Zhao


Spring 2021

TR 1:30-3:00pm


Statistics has evolved rapidly in the era of big data and provides tools for extracting knowledge from it. Emphasizing methodology and the reasoning behind it, the class introduces a large set of cutting-edge machine learning techniques along with their applications. Hands-on data experience with R runs throughout the semester. The class begins with data acquisition and exploratory data analysis (EDA), along with tools for reproducible reporting, an essential part of data science. We next show how to build, interpret, and adopt basic models, then move on to contemporary methods and techniques for handling large and complex data, with applications in finance, marketing, medicine, social science, entertainment, and more. Although the course uses the statistical programming language R extensively, no programming experience is required. By the end of the semester, students will not only have mastered popular modern statistical methods but will also be equipped with hands-on skills for handling data of essentially any size.

First Course: 2014

Students: 1200+

Projects: 500+

Data Science Live: 4