Python for Data Science Automation (Course 1)
Welcome to Python for Data Science Automation
Python for Data Science Automation: Let's Do This! (2:18)
Course Prerequisites
Software Preview: Python, VSCode, & Conda Information
Private BSU Slack Community: How to Join
Course Certificate: How to Get It
Emoji Guide
[Optional] Video Subtitles (Captions)
Would You Like To Become An Affiliate (And Earn 20% On Your Sales)?
Python Packages Used In This Course
Full Python Package List And Versions For the Course
Getting Help (Important!)
[IMPORTANT] How to Get Help
Part 1 - Foundations of Data Analysis with Python
The Game Plan: Data Analysis Foundations (0:45)
Module 0: Business Project & Course Setup
The Business Case: Building an Automated Forecast System (0:43)
0.1 Course Project Download
Course Project Zip [File Download] (0:44)
0.2 Key Resources
Course Workflow: Tying Specific Actions to the Business Process (2:26)
Ultimate Python Cheat Sheet: Python Ecosystem in 2 Pages (4:02)
The Transactional Database Model [PDF Download] (3:34)
0.3 Software Downloads
Anaconda Installation (2:36)
IDE (Integrated Development Environment) Options (2:18)
VSCode Installation (1:37)
0.4A CONDA ENVIRONMENT SETUP
3 Options For Recreating The Python Environments Used In The Course
[IMPORTANT] A quick word on Python Environments...
All Windows Users DO THIS!!!
Connect VSCode to Your Course Project Files (1:12)
Conda Env Create: Make the Python Course Environment (4:40)
Python Interpreter Selection (Requires VSCode Python Extension) (1:08)
Conda Env Update: Add Python Packages to Your Environment (1:43)
Conda Env Export: Review & Share Your Environment (1:11)
Conda Env List & Remove: List Available Environments & Remove Unnecessary Envs (1:21)
Mac: Troubleshooting Environment Guidance
Windows PowerShell: All Windows Users DO THIS!!!
Windows Users: Troubleshooting Guidance
0.4B VIRTUAL ENVIRONMENT SETUP (Alternative to Conda Environment Setup)
Virtual Environments: Creating And Troubleshooting (62:23)
0.4C EASY INSTALL (Less Reproducible, but Will Get You Moving Forward)
Ok, so none of the environment setups are working for you... Here's what to do
0.5 VSCode Setup (for Data Science)
Getting to Know VSCode (1:14)
VSCode Theme Customization (2:06)
VSCode Icon Themes (0:43)
VSCode User & Workspace Settings (4:15)
VSCode Keyboard Shortcuts (1:16)
VSCode Python Extensions (3:22)
VSCode Jupyter Extension - Jupyter Notebook Support (2:04)
VSCode Jupyter Extension - Interactive Python (3:34)
[Optional VSCode Setting] Jupyter: Send Selection to Interactive Window (2:30)
VSCode Excel Viewer (1:00)
VSCode Markdown & PDF Extensions (2:42)
VSCode Path Intellisense (1:08)
VSCode SQLite Extension (0:40)
[Optional] VSCode Extensions for R Users (1:26)
0.6 Recap & Code Checkpoint
Python Environment Checkpoint [File Download] (3:52)
Module 1: Jumpstart - Sales Analysis (Time Series)
Getting Started [File Download] (4:07)
Using the Cheat Sheet (1:25)
1.1 Importing from Python Packages
Import: pandas, numpy, matplotlib.pyplot (3:33)
Importing From: plotnine, mizani (4:41)
Importing Functions and Submodules: os, rich (2:08)
Setting Up Python Interactive (2:44)
[Reminder | Optional VSCode Setting] Jupyter: Send Selection to Interactive Window (2:30)
1.2 Data Import
Getting Help Documentation (2:46)
IMPORTANT VSCODE SETTING: File Paths | jupyter.notebookFileRoot (6:34)
Reading the Excel Files (6:45)
Checkpoint Link
1.3 Examining the Data
Reviewing the Data Model (5:09)
Exploratory 1: Top 5 Most Frequent Descriptions (3:54)
Exploratory 2: Plotting the Top 5 Bike Descriptions (6:22)
Checkpoint Link
1.4 Joining the Excel Tables
Preparing Orderlines for Merge: Drop Column (3:04)
Merging the Bikes DataFrame (3:33)
Merging the Bikeshops Data Frame (3:26)
Checkpoint Link
1.5 Data Wrangling (Transformation)
Datetime: Converting Order Date | Copy vs No Copy (4:51)
Splitting the Description: Category 1, Category 2, and Frame Material (7:26)
Splitting Location: City, State (3:03)
Create the Total Price Column (2:53)
Reorganizing the Columns (4:43)
Renaming Columns (4:05)
Reviewing the Data Transformations (1:11)
Checkpoint Link
1.6 Time Series Plotting
Save Your Work: Pickle it. (3:49)
Pandas Datetime Accessors (2:43)
1.6.1 Sales Analysis, Part 1 - Sales By Month
Resampling: Working with Pandas Offsets (7:25)
Quick Plot: Plotting Single Time Series w/ Pandas Matplotlib Backend (1:40)
Plotnine Visualization: Sales By Month, Part 1 - Geometries (5:52)
Plotnine Visualization: Sales by Month, Part 2 - Scales & Themes (5:50)
Checkpoint Link
1.6.2 Sales Analysis, Part 2 - Sales by Week & Category 2
Resampling Groups: Combine groupby() and resample() (9:22)
Quick Plot: Plotting Multiple Time Series w/ Pandas Matplotlib Backend (7:23)
Plotnine Visualization, Part 1: Faceted Sales By Date & Category2 (Group) (8:57)
Plotnine Visualization, Part 2: Adding Themes & Scales (8:52)
Checkpoint Link
1.7 Save & Recap
Writing Files: Pickle, CSV, Excel (4:41)
Congrats. That was a fun whirlwind. Let's recap. (2:34)
1.8 Checkpoint: Module 1 - Jumpstart
Code Checkpoint Zip - Module 1 [File Download]
Module 2: SQL Databases & Python Packages
Getting Started [File Download] (1:21)
2.1 Importing Files
Pickle Files (3:40)
CSV Files (3:58)
Excel Files (3:25)
Checkpoint Link
2.2 SQL Alchemy Databases
SQL Databases (1:46)
Pandas I/O & SQL Alchemy Overviews (3:01)
Make Database Directory (1:23)
Checkpoint Link
2.2.1 Creating a Database
Create the SQLite Database (4:19)
Read the Excel Files (3:03)
Create the Database Tables (7:11)
Close the Connection (0:53)
Checkpoint Link
2.2.2 Reconnecting to the Database
Connect to the Database (2:07)
Getting the Database Table Names (2:34)
Reading from the Tables with f-strings (1:47)
[Bonus] VSCode SQLite Extension (3:04)
Checkpoint Link
2.3 Creating a Function: collect_data()
Making collect_data(), Part 1: Function Setup (6:39)
Making collect_data(), Part 2: Read Tables from the Database (8:38)
Making collect_data(), Part 3: Test the Database Import (1:14)
Making collect_data(), Part 4: Joining the Data (8:25)
Making collect_data(), Part 5: Cleaning the Data 1 (7:14)
Making collect_data(), Part 6: Cleaning the Data 2 (6:48)
Making collect_data(), Part 7: VSCode Docstring Generator (3:58)
Checkpoint Link
2.4 Intro to Python Packages & Modules
Making a Package (my_pandas_extensions): Adding the database module (4:41)
Congrats! You're learning really powerful concepts. (1:06)
2.5 Checkpoint: Module 2 - SQL Databases
Code Checkpoint Zip - Module 2 [File Download]
3.0 Pandas Data Analysis Deep Dive
Getting Started [File Download] (2:24)
[VSCode Setting] Jupyter: Send Selection to Interactive Window (1:12)
3.1 Data Structures for Analysis
Package & Function Imports (1:28)
My Pandas Extensions: Fix FutureWarning Message (regex) (1:28)
3.1.1 Python Objects (How Python Works)
How Python Works: Objects (5:29)
3.1.2 Data Structures for Analysis
Pandas DataFrame & Series (2:51)
Numpy Arrays (4:08)
Checkpoint Link
3.1.3 Python's Built-in Data Structures & Data Types
Python Builtin Data Structures: Dictionary, List, Tuple (5:53)
Python Builtin Data Types: Int, Float, Str, Bool (3:41)
Checkpoint Link
3.1.4 Casting (Data Type Conversion)
Casting Basics: Numeric & String Conversions (4:09)
Casting Sequences: To List, Numpy Array, Pandas Series, & DataFrame (2:40)
Pandas Series Dtype Conversion (1:43)
Checkpoint Link
3.2 Pandas Core Functions: Deep-Dive
Pandas Data Wrangling Setup (2:08)
3.2.1 Column-Wise Data Frame Selection
Subsetting Columns by Name (2:16)
Subsetting by Column Index (Position): iloc[] (1:35)
Subsetting Columns with Regex (Regular Expressions) (3:37)
Rearranging a Single Column (Column Subsetting) (2:16)
Rearranging Multiple Columns (Repetitive Way First) (1:43)
Rearranging Multiple Columns (List Comprehension) (2:50)
Data Frame Rearrange: Select Dtypes, Concat, & Drop (6:32)
Checkpoint Link
3.2.2 Arranging Rows
Sort Values (3:06)
Checkpoint Link
3.2.3 Rowwise Filtering (Slicing)
Simple Filters with Boolean Series (3:54)
Query Filters (3:47)
Filtering with isin() and ~ (3:40)
Index slicing with df.iloc[] (2:41)
Getting Distinct Values: Drop duplicates (1:43)
N-Largest and N-Smallest (2:14)
Random Samples (1:52)
Checkpoint Link
3.2.4 Calculated Columns (Mutating / Assigning)
DataFrame Column Assignment: Calculated Columns (2:24)
Assign Basics: Lambda Functions (3:10)
Assign Cookbook: Making a Log Transformation (3:31)
Assign Cookbook: Searching Text (Boolean Flags) (5:26)
Assign Cookbook: Even-Width Binning with pd.cut() (3:45)
Visualizing Binning Strategies with a Pandas Heat Table (2:59)
Assign Cookbook: Quantile Binning with pd.qcut() (2:35)
Checkpoint Link
3.2.5 Grouping & Aggregate / Apply
Aggregation Basics (Summarizations) (5:48)
Common Summary Functions (4:10)
Groupby + Aggregate Basics (Summarizations) (5:26)
Groupby + Agg Cookbook (Summary DF 1): Sum & Median Total Price By Category 1 & 2 (3:13)
Groupby + Agg Cookbook (Summary DF 2): Sum Total Price & Quantity By Category 1 & 2 (3:23)
Groupby + Agg Details: Examining the Multilevel Column Index (2:00)
Groupby + Agg Cookbook (Summary DF 3): Grouping Time Series with Groupby & Resample (4:11)
Groupby + Apply Basics (Transformations) (3:41)
Groupby + Apply Cookbook: Transform All Columns by Group (2:33)
Groupby + Apply Cookbook: Filtering Slices by Group (3:24)
Checkpoint Link
3.2.6 Renaming Columns
Renaming Basics: Renaming All Columns with Lambda (4:27)
Renaming Basics: Targeting Specific Columns (1:20)
Advanced Renaming: Renaming Multi-Index Columns (5:56)
Checkpoint Link
3.2.7 Reshaping (Pivoting)
Set Up Summarized Data: Revenue by Category 1 (4:59)
Pivot: To Wide Format (6:41)
Export a Stylized Pandas Table to Excel (Wide Data) (6:08)
Melt: To Long Format (3:30)
Plotnine - Making a Faceted Horizontal Bar Chart (Tidy Long Data) (4:33)
Intro to Categorical Data: Sorting the Plotnine Plot (6:08)
Pivot Table (An awesome function for BI Tables) (7:41)
Unstack: A programmatic version of pivot() (4:09)
Stack: A programmatic version of melt() (2:24)
Checkpoint Link
3.2.8 Joins
Merge: Data Frame Joins (4:11)
Concat: Binding DataFrames Rowwise & Columnwise (4:26)
Checkpoint Link
3.2.9 Splitting & Uniting Text Columns
Splitting Text Columns (3:07)
Combining Text Columns (1:05)
Checkpoint Link
3.2.10 The Apply Function (In-Depth)
Set Up Summarized Data: Sales by Category 2 Daily (3:01)
Apply: Lambda Aggregations vs Transformations (2:21)
Apply: Broadcasting Aggregations (1:52)
Grouped Apply: Broadcasting (2:23)
Grouped Transform: Alternative to Grouped Apply (Fixes Index Issue) (2:02)
Checkpoint Link
3.2.11 [Advanced] The Pipe: Method Chaining Helper
Making a "Data Frame" Function: add_columns() (6:06)
Pipe: Method chaining our custom function using the pipe (3:11)
Checkpoint Link
3.3 Code Checkpoint: Module 3 - Pandas Core
Code Checkpoint Zip - Module 3 [File Download]
3.4 Challenge #1: Test Your Data Wrangling Skills
Challenge #1: Data Wrangling with Pandas [File Download] (1:15)
3.4.1 Introduction to JupyterLab Notebooks
Method 1: Jupyter VSCode Integration (2:24)
Method 2: Jupyter Notebooks (Legacy Method) (2:06)
Method 3: JupyterLab (Next Generation of Jupyter) (3:15)
3.4.2 Challenge #1 Assignment
Challenge Objectives (3:07)
Getting Started: Syncing Your JupyterLab Current Working Directory (%cd and %pwd) (5:09)
Challenge Tasks (3:17)
Challenge Solution (8:39)
Congrats! You've finished your first challenge. (1:36)
Part 2: Time Series Forecasting Automation
Automating Time Series Forecasting (1:39)
4.0 Data Exploration & Time Series Fundamentals
Getting Started [File Download] (1:48)
VSCode Extension: Browser Preview (1:39)
4.1 Pandas Profiling
Package Imports (1:39)
The ProfileReport() Class (1:09)
4.1.1 Profile Report Components
Section 1: Profile Overview (3:18)
Section 2A: Numeric & Date Variables (6:02)
Section 2B: Categorical (Text) Variables (5:01)
Sections 3-6: Interactions, Correlations, Missing Values, & Sample (2:51)
Checkpoint Link - Module 4
4.1.2 Profiling Workflow
Pandas Extension: df.profile_report() (3:08)
Exporting the Profile Report as HTML (1:35)
Checkpoint Link - Module 4
4.2 Time Series Fundamentals
Getting Started (0:49)
4.2.1 Pandas Datetime Basics
TimeStamp & Period Conversions (2:58)
Pandas Datetime Accessors (1:55)
Date Math: Offsetting Time with TimeDeltas (2:38)
Date Math: Getting Duration between Two TimeStamps (3:27)
Creating Date Sequences: pd.date_range() (3:08)
Checkpoint Link - Module 4
4.2.2 Periods & Time-Based Groupings (Resampling)
Periods (In-Depth) (7:57)
Resampling (In-Depth): bike_sales_m_df (6:23)
Grouped Resampling (In-Depth): bike_sales_cat2_m_wide_df (6:37)
Reorganizing: Adding Comments (1:30)
Checkpoint Link - Module 4
4.2.3 Measuring Change
Differencing with Lags (Single Time Series) (5:38)
Differencing with Lags (Multiple Time Series) (2:04)
Difference from First (Single Time Series) (1:42)
Difference From First (Multiple Time Series) (0:57)
Checkpoint Link - Module 4
4.2.4 Cumulative Calculations (Expanding Windows)
Cumulative Expanding Windows (Single Time Series) (3:20)
Cumulative Expanding Windows (Multiple Time Series) (1:37)
Checkpoint Link - Module 4
4.2.5 Rolling Window Calculations
Moving Average (Single Time Series) (8:14)
Moving Average (Multiple Time Series) (4:36)
Checkpoint Link - Module 4
4.4 Code Checkpoint: Module 4 - Time Series Fundamentals
Next Steps (Where we are headed) (1:16)
Code Checkpoint Zip - Module 4 [File Download]
5.0 Functional Programming
Getting Started [File Download] (1:36)
5.1 Functional Programming with Pandas [Outlier Detection Function]
Setup: Python Imports & Data (0:45)
5.1.1 Examining Pandas Functions
Function Anatomy: pd.Series.max() (3:52)
Errors (Exceptions) (1:02)
Function Names (1:16)
Function Anatomy: **kwargs (5:11)
Checkpoint Link - Module 5
5.1.2 Detect Outliers Function
Detect Outliers: Function Setup (2:18)
5.1.2.1 Building an Outlier Detection Function
IQR Outlier Method, Part 1 (3:36)
IQR Method, Part 2 (4:06)
New Argument: IQR Multiplier (1:46)
New Argument: How? (Both, Upper, Lower) (2:35)
Checkpoint Link - Module 5
5.1.2.2 Adding Checks: Raising Informative Exceptions
Checking for Pandas Series Input (2:10)
Checking IQR Multiplier for Int or Float Type (2:53)
Checking that IQR Multiplier is a Positive Value (1:09)
Checking that How is a Valid Option: both, lower, upper (2:18)
Checkpoint Link - Module 5
5.1.2.3 Finalizing the Function
Informative Help Documentation: Adding a Docstring (7:10)
Testing Our Function: Detecting Outliers within Groups (3:04)
Extending the Pandas Series Class (2:11)
Checkpoint Link - Module 5
5.2 Summarize By Time Function
Summarize By Time: A handy function for time series wrangling (3:59)
5.2.1 Beginning the Summarize by Time Function
Setting Up the "Summarize By Time" Function (4:55)
Handling the Date Column Input (1:30)
Handling Groups Input (2:02)
Handling the Time Series Resample (4:13)
Handling the Aggregation Function Input (3:15)
Handling the Value Column Input (1:39)
Checkpoint Link - Module 5
5.2.2 Debugging the Value Column
Forcing the Value Column Input to a List (to generate a data frame) (2:43)
Bug! Thinking through a solution (2:25)
Solution: Converting to a Function Dictionary with Zip + Dict (3:51)
Checkpoint Link - Module 5
5.2.3 Handling the Output Formatting (Wide Data)
Handling the Unstack (2:01)
Handling the Period Conversion (2:50)
Add Fill Missing Capability (2:24)
Checkpoint Link - Module 5
5.2.4 Finalizing the Function & Adding to Our Extensions Package
Review the Core Functionality (1:24)
Check Incoming Data: Raising a TypeError (1:49)
Adding the Docstring (7:27)
Pandas Flavor: Extending Pandas DataFrame Class (6:22)
Checkpoint Link - Module 5
5.3 Code Checkpoint: Module 5 - Functional Programming
Code Checkpoint Zip - Module 5 [File Download]
6.0 Forecasting with Sktime
Getting Started [File Download] (3:02)
Sktime Documentation (4:35)
How to Google Search like a Pro (1:34)
6.1 Introduction to Sktime Forecasting
Set Up & Imports (2:40)
6.1.1 Data Summarizations (Always Needed Prior to Forecasting)
Summarizing to get Total Revenue by Month (4:59)
Summarizing to get Total Revenue by Category 2 & Month (2:40)
Checkpoint Link - Module 6
6.1.2 Single Time Series Forecasting: AutoARIMA()
What is AutoARIMA? (4:58)
AutoARIMA Applied: Forecaster, Fit, Predict (8:24)
Adding Confidence Intervals (Prediction Intervals) (2:40)
Tuple Unpacking (Predictions, Confidence Intervals) (2:38)
Forecast Visualization (5:27)
Code Housekeeping (0:23)
Checkpoint Link - Module 6
6.1.3 Multiple Time Series Forecasting (Modeling within For-Loops)
Multiple Time Series Forecasting: AutoARIMA() (3:09)
For Loop: Iterate Across the DataFrame Columns (2:19)
For Loop: Modeling AutoARIMA() (5:22)
For-Loop: Getting the Confidence Intervals (1:31)
For-Loop: Combine with DataFrame | Actual Values, Predictions, & CIs (4:11)
For-Loop: Storing the Results (as a Dictionary) (3:35)
Housekeeping: Appending Variable Types to Variable Names (1:52)
Visual Forecast Assessment (2:42)
TQDM: Progress Bars (3:40)
Checkpoint Link - Module 6
6.2 The ARIMA Forecast Automation
Setting up the ARIMA Automation Function (3:44)
6.2.1 Building the ARIMA Forecasting Automation
Making arima_forecast() | Function Definition (3:18)
Function Body | Setting Up the Iteration (4:40)
Training the AutoARIMA() Models (3:01)
Controlling Progress Bars: tqdm(min_interval) (1:11)
Making Predictions and Confidence Intervals (2:08)
Combine Results into a DataFrame (2:23)
Compose a Prediction Dictionary (1:49)
Return Results as a Single DataFrame | Rowwise Concatenation (2:36)
Setting the Column Names of the Output (9:14)
Drop remaining columns beginning with "level_" (2:50)
Testing the arima_forecast() function (2:04)
Checkpoint Link - Module 6
6.2.2 The forecasting.py Module
Creating the forecasting.py module (3:43)
Docstring: arima_forecast() (1:31)
Adding Checks: arima_forecast() (6:34)
Finally - Check Your Forecasts with Grouped Pandas Plotting (2:28)
Recap: You've just made an ARIMA Forecast Automation! (1:09)
Checkpoint Link - Module 6
6.3 Checkpoint
Code Checkpoint Zip - Module 6 [File Download]
Challenge 2: Make an AutoETS() Forecast Automation
Introduction to ETS Forecasting (Exponential Smoothing) (2:06)
Challenge 2 [File Download] (6:06)
Solution (5:17)
Part 3: Visualization & Report Automation
Part 3: Visualization & Reporting (1:23)
7.0 Visualization with Plotnine
Getting Started [File Download] (0:31)