Course program
01
Introduction
Welcome to the course!
Course overview - don't miss this lecture!
Downloading slides for presentations (OPTIONAL)
Installing Anaconda, Python, Jupyter Notebook
Read this article - Note on setting up the development environment
Customizing your development environment
Frequently Asked Questions
02
Python Crash Course
Optional: Python Crash Course
Python crash course - Part 1
Python crash course - Part 2
Python crash course - Part 3
Python quizzes
Solutions to the Python quizzes
03
Stages of work in machine learning
Stages of work on machine learning
04
NumPy
Overview of the NumPy section
NumPy Arrays
Indexing and selecting data from NumPy arrays
Operations in NumPy
NumPy quizzes
Solutions to the NumPy quizzes
05
Pandas
Series - Part 1
Series - Part 2
Dataframes - Part 1 - Creating Dataframes
Dataframes - Part 2 - Basic Attributes
Dataframes - Part 3 - Working with Columns
Dataframes - Part 4 - Working with Rows
Conditional Filtering
Useful methods - Apply for one column
Useful methods - Apply for multiple columns
Useful Techniques - Statistical information and data sorting
Missing data - Overview
Missing data - Operations in Pandas
GROUP BY Data Aggregation - Part 1
GROUP BY Data Aggregation - Part 2 - Multiple Indexes
Combining Dataframes - Concatenation
Merging Dataframes - Inner Merge
Merging Dataframes - Left and Right Merge
Merging Dataframes - Outer Merge
Pandas methods for text
Pandas methods for date and time
Input/Output in Pandas - CSV files
Input/Output in Pandas - HTML tables
Input/Output in Pandas - Excel files
Input/Output in Pandas - SQL databases
Pivot tables in Pandas
Pandas quizzes
Solutions to the Pandas quizzes
06
Matplotlib
Overview of the Matplotlib section
Matplotlib Basics
Figure object - working principles
Figure object - code in Python
Figure object - parameters
Subplots - multiple plots next to each other
Matplotlib styling: legends
Matplotlib styling: colors and styles
More on Matplotlib
Matplotlib quizzes
Solutions to the Matplotlib quizzes
07
Seaborn
Seaborn section overview
Scatter plots
Distribution Plots - Part 1 - Types of Plots
Distribution Plots - Part 2 - Code in Python
Categorical Plots - Statistics by Category - Plot Types
Categorical Plots - Statistics by Category - Code in Python
Categorical Plots - Plot Types
Categorical Plots - Code in Python
Comparison Plots - Types of Plots
Comparison Plots - Code in Python
Seaborn Grid
Matrix Plots
Seaborn quizzes
Solutions to the Seaborn quizzes
08
Large Data Visualization Project
Data Visualization Project Overview
Project Solution Breakdown - Part 1
Project Solution Breakdown - Part 2
Project Solution Breakdown - Part 3
09
An overview of machine learning
Section overview
Why you need machine learning
Types of machine learning algorithms
Process for supervised learning
(OPTIONAL) Additional reading - the ISLR book
10
Linear regression
Overview of the linear regression section
Linear regression - history of the algorithm
Least squares
Cost Function
Gradient Descent
Simple Linear Regression
Scikit-Learn Overview
Scikit-Learn - Train Test Split
Scikit-Learn - Model Performance Evaluation
Residual Plots
Model implementation and interpretation of coefficients
Polynomial Regression - Theory
Polynomial regression - creating features
Polynomial regression - model training and estimation
Bias-Variance Trade-Off
Polynomial regression - choosing the degree of the polynomial
Polynomial regression - model implementation
Regularization - overview
Feature scaling
Cross-validation - overview
Regularization - data preparation
L2 Regularization - Ridge regression - theory
L2 Regularization - Ridge regression - code in Python
L1 Regularization - Lasso regression - theory and code in Python
L1 and L2 Regularization - Elastic Net
Data overview for the linear regression test project
11
Feature Engineering and data preparation
Feature Engineering Overview
Working with outliers
Working with missing data - Part 1 - Assessing the situation
Working with missing data - Part 2 - Working with rows
Working with missing data - Part 3 - Working with columns
Working with categorical variables
12
Cross-validation and linear regression test project
Overview of the cross-validation section
Train | Test Split
Train | Validation | Test Split
Cross Validation - cross_val_score
Cross Validation - cross_validate
Grid Search and Random Search
Linear Regression Test Project Review
Solutions to the linear regression test project
13
Logistic regression
Overview of the section about logistic regression
Logistic regression theory - Part 1 - Logistic function
Logistic Regression Theory - Part 2 - Transition from linear to logistic regression
Logistic regression theory - Part 3 - The math of the transition
Logistic regression theory - Part 4 - Finding the best-fit curve
Logistic regression in Scikit-Learn - Part 1 - Exploring the data
Logistic regression in Scikit-Learn - Part 2 - Creating and training the model
Classification Metrics - Confusion Matrix and Accuracy
Classification Metrics - Precision, Recall and F1-Score
Classification Metrics - ROC curves
Logistic Regression in Scikit-Learn - Part 3 - Evaluating Model Performance
Multi-class classification - Logistic regression - Exploring the data
Multi-class classification - Logistic regression - Model
Logistic regression test project
Solutions to the logistic regression test project
14
KNN (K-Nearest Neighbors)
Overview of the K-nearest neighbors section
Theory of the K-nearest neighbors method
KNN: Writing code in Python - Part 1
KNN: Writing Code in Python - Part 2
KNN quizzes
Solutions to the KNN quizzes
15
SVM (Support Vector Machines)
Overview of the support vector machines section
History of support vector machines
Support vector machine theory - Hyperplanes and margins
Support vector machine theory - Kernels
Support vector machine theory - "Kernel trick" and math (optional)
SVM in Scikit-Learn for classification tasks - Part 1
SVM in Scikit-Learn for classification tasks - Part 2
SVM in Scikit-Learn for regression tasks
Support vector machine quizzes
Solutions to the support vector machine quizzes
16
Decision trees
Overview of the decision trees section
Decision Trees - History
Decision Trees - Terminology
Decision Trees - Gini Impurity metric
Building Decision Trees with Gini Impurity - Part 1
Building Decision Trees with Gini Impurity - Part 2
Python code for decision trees - Part 1 - Data
Python Code for Decision Trees - Part 2 - Model
17
Random forests
Overview of the random forests section
History and motivation behind the creation of random forests
Random forest hyperparameters - Overview
Random forest hyperparameters - Number of trees and Number of features
Random forest hyperparameters - Bootstrapping and oob_score
Classifying data with RandomForestClassifier - Part 1
Classifying data with RandomForestClassifier - Part 2
Regression with RandomForestRegressor - Part 1 - Data Overview
Regression with RandomForestRegressor - Part 2 - Basic Models
Regression with RandomForestRegressor - Part 3 - Polynomial Models
Regression with RandomForestRegressor - Part 4 - Other Models
18
Boosted Trees
Overview of the boosting section
History of Boosting
AdaBoost - Theory - How Adaptive Boosting Works
AdaBoost - Code in Python - Data
AdaBoost - Python Code - Model
Gradient Boosting - Theory
Gradient Boosting - Writing code in Python
19
Test project on supervised learning models
Overview of the test project
Solution Walkthrough - Part 1 - Exploratory Data Analysis
Solution Walkthrough - Part 2 - Churn Analysis
Solution Walkthrough - Part 3 - Decision Tree Models
20
NLP (Natural Language Processing) and the Naive Bayes Algorithm
Overview of the NLP and Naive Bayes section
Naive Bayes Algorithm - Part 1 - Bayes' Theorem
Naive Bayes Algorithm - Part 2 - The algorithm itself
Extracting features from text - Theory
Extracting features from text - “Bag of words” - writing code by hand
Extracting features from text with Scikit-Learn
Text Classification - Part 1
Text Classification - Part 2
Text classification quizzes
Solutions to text classification quizzes
21
Unsupervised Learning
Overview of Unsupervised Learning
22
K-Means Clustering
Overview of the K-means clustering section
Principles of data clustering (without being tied to a specific algorithm)
Theory of K-means clustering
K-means clustering - Writing code - Part 1
K-means Clustering - Writing the Code - Part 2
Choosing the number of clusters K - Theory
Choosing the number of clusters K - Writing code in Python
Color Quantization - Theory
Color Quantization - Writing Python Code
K-means clustering quizzes
Solutions to K-means clustering quizzes - Part 1
Solutions to K-means clustering quizzes - Part 2
Solutions to K-means clustering quizzes - Part 3
23
Hierarchical data clustering
Overview of the hierarchical clustering section
Theory and intuition of hierarchical clustering
Hierarchical clustering - Writing code, part 1 - Data
Hierarchical Clustering - Writing Code, Part 2 - Scikit-Learn
24
DBSCAN - clustering based on data density
Overview of the DBSCAN clustering section
DBSCAN algorithm theory
Comparing DBSCAN and K-Means Clustering
DBSCAN Key Hyperparameters - Theory
DBSCAN Key Hyperparameters - Code in Python
DBSCAN quizzes
Solutions to the DBSCAN quizzes
25
PCA (Principal Component Analysis)
Overview of the principal component analysis section
Principal component analysis theory - Part 1 - History and Intuition
Principal component analysis theory - Part 2 - Math
Manual implementation of principal component analysis
Principal component analysis in Scikit-Learn
Principal component analysis quizzes
Solutions to the principal component analysis quizzes
26
Summary
Course Summary
Thank you very much! Please rate this course
27
Bonus module
Bonus Lecture