SKawsar/Data-Preprocessing-for-ML

I taught Data Preprocessing with Python to 200+ students through an online platform. These are the course materials I built for them (mostly graduate-level students from non-CS backgrounds).

Video Lectures on YouTube: https://lnkd.in/gK5epasB

Medium blogs for pandas: https://kawsar34.medium.com/list/learn-pandas-from-leetcode-a06903853aed

Data Preprocessing with Python

Lecture 01: Importing Data with Pandas

  • Challenges of reading a CSV file
  • Understanding the data
  • Finding summary statistics, data types, and missing-value information
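
As a rough illustration of the Lecture 01 topics, the sketch below reads a small CSV and inspects its shape, data types, summary statistics, and missing values. The CSV text and column names are hypothetical and are not taken from the course materials; an in-memory string stands in for a real file.

```python
import io

import pandas as pd

# Hypothetical CSV content (not from the course); a string buffer replaces a real file.
csv_text = """name,age,city
Alice,30,Paris
Bob,,London
Carol,25,
"""

df = pd.read_csv(io.StringIO(csv_text))

# First look at the data: top rows, shape, and inferred column types.
print(df.head())
print(df.shape)
print(df.dtypes)

# Summary statistics for the numeric columns.
summary = df.describe()
print(summary)

# Number of missing values in each column.
missing_per_column = df.isna().sum()
print(missing_per_column)
```

Note that the `age` column is read as a float because it contains a missing value, which is one of the common surprises when importing a CSV.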

Lecture 02: Data Preprocessing with Pandas (Part 1)

  • Challenges of reading a CSV or Excel file
  • Selecting columns by name when reading a CSV file
  • Selecting columns by position when reading a CSV file
  • Reading only the first n rows
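
One possible way to do the column and row selection listed above is with the `usecols` and `nrows` parameters of `pandas.read_csv`. The data and column names below are made up for illustration, and a string buffer stands in for a file on disk.

```python
import io

import pandas as pd

# Hypothetical CSV content (not from the course).
csv_text = "id,name,score,grade\n1,A,90,x\n2,B,80,y\n3,C,70,z\n4,D,60,w\n"

# Keep only the named columns while reading.
by_name = pd.read_csv(io.StringIO(csv_text), usecols=["id", "score"])

# Keep columns by zero-based position while reading.
by_position = pd.read_csv(io.StringIO(csv_text), usecols=[0, 2])

# Read only the first two data rows.
first_rows = pd.read_csv(io.StringIO(csv_text), nrows=2)

print(by_name)
print(by_position)
print(first_rows)
```

Selecting columns at read time avoids loading data that will be discarded anyway, which matters most for large files.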

Lecture 03: Data Preprocessing with Pandas (Part 2)

  • How to extract new information from a column?
  • How to create a column based on a condition or function?
  • Removing a string from a column
  • Checking the unique values for each column
  • Performing calculations on DataFrame columns
  • Sorting a DataFrame
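
The Lecture 03 operations might look like the following sketch. The DataFrame, its column names, and the threshold of 30 are all hypothetical choices made for this example, not the course's own data.

```python
import pandas as pd

# Hypothetical data (not from the course).
df = pd.DataFrame({
    "full_name": ["Ada Lovelace", "Alan Turing", "Grace Hopper"],
    "price": ["$10", "$25", "$5"],
    "quantity": [3, 1, 8],
})

# Extract new information from a column: the first word of the full name.
df["first_name"] = df["full_name"].str.split(" ").str[0]

# Remove a string from a column, then convert the result to integers.
df["price"] = df["price"].str.replace("$", "", regex=False).astype(int)

# Perform a calculation across columns.
df["total"] = df["price"] * df["quantity"]

# Create a column based on a condition, applied through a function.
df["order_size"] = df["total"].apply(lambda t: "large" if t >= 30 else "small")

# Count the unique values in each column.
unique_counts = df.nunique()

# Sort by the computed column, largest first.
df_sorted = df.sort_values("total", ascending=False)

print(unique_counts)
print(df_sorted)
```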

Lecture 04: HW review

Lecture 05: Handling Missing values

  • Performing data cleaning
  • Visualizing missing values
  • Converting strings to datetime
  • Removing missing values
  • Replacing missing values by: 1. mean, 2. median, 3. constant, 4. interpolation, 5. forward imputation, 6. backward imputation
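
A minimal sketch of the missing-value strategies listed above, using a hypothetical temperature series (the data and column names are assumptions for illustration). Plotting is left out so the example runs with pandas and NumPy alone.

```python
import numpy as np
import pandas as pd

# Hypothetical data (not from the course) with two missing readings.
df = pd.DataFrame({
    "date": ["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04", "2024-01-05"],
    "temp": [10.0, np.nan, 14.0, np.nan, 20.0],
})

# Convert the date strings to datetime values.
df["date"] = pd.to_datetime(df["date"])

# Remove rows where the temperature is missing.
dropped = df.dropna(subset=["temp"])

# Replace missing values with the mean, the median, or a constant.
mean_filled = df["temp"].fillna(df["temp"].mean())
median_filled = df["temp"].fillna(df["temp"].median())
constant_filled = df["temp"].fillna(0)

# Linear interpolation between neighboring known values.
interpolated = df["temp"].interpolate()

# Forward and backward imputation: copy the previous or the next known value.
forward_filled = df["temp"].ffill()
backward_filled = df["temp"].bfill()

print(pd.DataFrame({
    "original": df["temp"],
    "interpolated": interpolated,
    "ffill": forward_filled,
    "bfill": backward_filled,
}))
```

Which strategy fits depends on the data: interpolation and forward or backward filling assume the rows are ordered meaningfully (for example, by time), while the mean and median ignore row order.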

Lecture 06: Data Joining using Pandas

  • inner join, outer join, left join, right join
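
The four join types can be compared side by side with `DataFrame.merge` and its `how` parameter. The two tables below are hypothetical and chosen so that each join keeps a different set of keys.

```python
import pandas as pd

# Hypothetical tables (not from the course) sharing the key "student_id".
students = pd.DataFrame({"student_id": [1, 2, 3], "name": ["A", "B", "C"]})
scores = pd.DataFrame({"student_id": [2, 3, 4], "score": [85, 90, 75]})

# Inner join: only keys present in both tables.
inner = students.merge(scores, on="student_id", how="inner")

# Outer join: all keys from either table, with NaN where data is missing.
outer = students.merge(scores, on="student_id", how="outer")

# Left join: every row of students, matched scores where available.
left = students.merge(scores, on="student_id", how="left")

# Right join: every row of scores, matched names where available.
right = students.merge(scores, on="student_id", how="right")

for label, result in [("inner", inner), ("outer", outer), ("left", left), ("right", right)]:
    print(label)
    print(result)
```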

Lecture 07: Data Aggregation/grouping and Pivot table using Pandas

  • Data filtering
  • Data preprocessing
  • Data Aggregation/grouping
  • Pivot table
  • Data Visualization: Barplot
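
Filtering, grouping, and pivoting could be sketched as follows. The sales table is hypothetical, and the bar plot is shown only as a comment because it needs matplotlib, which this example does not assume is installed.

```python
import pandas as pd

# Hypothetical sales data (not from the course).
sales = pd.DataFrame({
    "region": ["East", "East", "West", "West", "West"],
    "product": ["A", "B", "A", "B", "A"],
    "amount": [100, 150, 200, 50, 300],
})

# Filtering: keep rows that satisfy a condition.
large_sales = sales[sales["amount"] >= 100]

# Aggregation: total amount per region.
totals = sales.groupby("region")["amount"].sum()

# Pivot table: regions as rows, products as columns, summed amounts as values.
pivot = sales.pivot_table(index="region", columns="product", values="amount", aggfunc="sum")

print(large_sales)
print(totals)
print(pivot)

# With matplotlib installed, the grouped totals could be drawn as a bar plot:
# totals.plot(kind="bar")
```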

Lecture 08: Data Correlation and Categorical Variable Encoding

  • Dealing with categorical variables
  • Label encoding
  • One-hot encoding
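
The sketch below shows one way to compute a correlation matrix and encode a categorical column using pandas alone. It uses category codes for label encoding, which is an assumption made here; the course may use a different tool, such as scikit-learn's `LabelEncoder`. The data is hypothetical.

```python
import pandas as pd

# Hypothetical data (not from the course).
df = pd.DataFrame({
    "size": ["small", "large", "medium", "small"],
    "height": [1.0, 3.0, 2.0, 1.0],
    "weight": [2.0, 6.0, 4.0, 2.0],
})

# Pearson correlation between the numeric columns.
corr = df[["height", "weight"]].corr()

# Label encoding: map each category to an integer code.
# Categories are ordered alphabetically, so large=0, medium=1, small=2.
df["size_label"] = df["size"].astype("category").cat.codes

# One-hot encoding: one indicator column per category.
one_hot = pd.get_dummies(df["size"], prefix="size")

print(corr)
print(df)
print(one_hot)
```

Label encoding implies an order among the categories, which may mislead some models when no such order exists; one-hot encoding avoids that at the cost of extra columns.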
