This course is an introduction to big data analysis, here meaning data sets with more than 1,000 rows (observations). More and more industries are adopting a data-driven approach to make reliable predictions. For example: Which mail recipients are most likely to respond and donate? How can a company predict its employee churn rate? How can a school prevent student attrition? How can credit card fraud be identified? How can a shopping website improve its individualized recommendation lists?
In this class, students will have opportunities to build their skills in the foundations and application of data examination, predictive modeling, and data visualization using large data sets. Specifically, they will:
- Transform and cleanse large data sets into understandable subsets for model development.
- Build multiple predictive models and select the best one for practical use in areas such as finance, retail, medicine, and biology.
- Use statistical tools such as IBM Watson Analytics, Python, RStudio, and Tableau for data analysis.
- Apply statistical techniques to analyze data, then interpret and communicate the results to support decision making, as a consulting firm would.
Because this course relies heavily on software for analysis, no prior statistical experience is required. However, to understand the machine learning process, students must successfully complete Pre-Calculus.
Prerequisites: Algebra II and Geometry
(not available 2020-2021)