MLEARN510-Notes
Course 1: Introduction to Machine Learning
These are my personal study notes for the first course in the Machine Learning Professional Certificate Program from the University of Washington. The lectures are based on the book An Introduction to Statistical Learning by James, Witten, Hastie, and Tibshirani.
Table of Contents
Week | Topic | Concepts |
---|---|---|
1 | Course intro | AI vs. ML, Types of ML, Performance Metrics, Bias-Variance Tradeoffs, Parametric vs. Non-parametric Methods, K-Nearest Neighbors (KNN), Matrix Algebra |
2 | Linear Regression | Maximum Likelihood Estimation (MLE), Simple Linear Regression, Multiple Linear Regression |
3 | Classification | Logistic Regression, Linear Discriminant Analysis (LDA), Quadratic Discriminant Analysis (QDA), K-Nearest Neighbors (KNN) |
4 | Model Building Part 1 | Data Preprocessing, Handling Outliers & Class Imbalance & Missing Data, Feature Engineering, Feature Selection, Data Splitting |
5 | Model Building Part 2 | Feature Engineering, Feature Selection, Resampling |
6 | Resampling Methods | Cross-Validation, Bootstrapping, Leave-One-Out Cross-Validation (LOOCV) |
7 | Linear Model Selection and Regularization | Subset Selection, Shrinkage Methods (Ridge Regression, Lasso Regression) |
8 | Dimension Reduction Methods | Principal Component Analysis (PCA), Principal Components Regression (PCR), t-Distributed Stochastic Neighbor Embedding (t-SNE) |
9 | Forecasting | Time Series Data, Moving Average, Exponential Smoothing, ARIMA |
10 | Frequent Itemset Mining | Association Rules, Apriori Algorithm, FP-Growth Algorithm, Maximal and Closed Frequent Itemsets |
Homework Assignments
Unlike the lectures, which take a theoretical approach to machine learning, the homework assignments are hands-on and practical. The assignments are done in Python using Jupyter Notebooks, with extensive use of the scikit-learn library.
For academic integrity reasons, I won't be posting my code here. However, I will provide a brief overview of the key concepts covered in each homework assignment.
Clarification on Content Creation
While my understanding of the subject matter comes from attending the class, most of my notes were based on the provided slides. In their raw form, those notes are a bit messy, and posting them directly could breach copyright terms. To have a digital, easily accessible version, I worked with ChatGPT to refine the content and present it in a more organized manner. Because these topics are common knowledge and widely discussed online, the information rendered by ChatGPT is generally trustworthy.