This repository contains the code for building Logistic Regression from scratch.
Section 1: Understanding - Logistic Regression
1.1 Introduction to Machine Learning
1.2 Intuition behind the Logistic Regression (LR)
1.3 Math behind the LR
1.4 Summary
Section 2: Building - Logistic Regression from scratch
Section 2.1: Apply LR using API of Scikit-Learn
Step 2.1.1: Import dependencies
Step 2.1.2: Load data
Step 2.1.3: Preprocessing
Step 2.1.4: Split data set into training and testing
Step 2.1.5: Train model using Scikit-learn's Logistic Regression
Step 2.1.6: Visualize the Dataset
Section 2.2: Apply LR from scratch
Step 2.2.1: Define sigmoid or logistic function
Step 2.2.2: Define hypothesis function
Step 2.2.3: Define cost function
Step 2.2.4: Define cost function derivative as well as calculating error
Step 2.2.5: Get the updated theta value by calculating gradient descent
Step 2.2.6: Define helper function to run Logistic Regression
Step 2.2.7: Compare our Implementation with Scikit-Learn API
Step 2.2.8: Run our own LR implementation
Section 3: Classify emails into Spam and Ham
Step 1: Import dependencies
Step 2: Load data
Step 3: Exploratory Data Analysis
Step 4: Transforming labels
Step 5: Split data set into training and testing
Step 6: Generating Features using CountVectorizer
Step 7: Train model
Step 8: Test model
Step 8.1: Measure Accuracy
Step 8.2: Confusion Matrix
Step 8.3: Area Under Curve
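The Section 2.2 steps in the outline above can be sketched in a few lines of plain Python. This is a minimal illustration that uses only the math module, not the notebook's actual code; the function names, the toy data, the learning rate, and the iteration count are all assumptions made for this example.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


def hypothesis(theta, x):
    """Predicted probability that sample x belongs to class 1."""
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))


def cost_function(X, y, theta, eps=1e-12):
    """Average cross-entropy (log loss) over the data set."""
    total = 0.0
    for xi, yi in zip(X, y):
        # Clamp the probability so that log() stays finite.
        h = min(max(hypothesis(theta, xi), eps), 1.0 - eps)
        total += yi * math.log(h) + (1 - yi) * math.log(1.0 - h)
    return -total / len(y)


def cost_derivative(X, y, theta, j):
    """Partial derivative of the cost with respect to theta[j]."""
    error_sum = sum((hypothesis(theta, xi) - yi) * xi[j] for xi, yi in zip(X, y))
    return error_sum / len(y)


def gradient_descent(X, y, theta, alpha):
    """One simultaneous update of every theta[j]."""
    return [t - alpha * cost_derivative(X, y, theta, j) for j, t in enumerate(theta)]


def logistic_regression(X, y, alpha, theta, num_iters):
    """Helper that repeats the gradient descent step num_iters times."""
    for _ in range(num_iters):
        theta = gradient_descent(X, y, theta, alpha)
    return theta


# Made-up toy data: each row is [bias term, feature value].
X = [[1.0, -2.0], [1.0, -1.0], [1.0, -0.5], [1.0, 0.5], [1.0, 1.5], [1.0, 2.5]]
y = [0, 0, 0, 1, 1, 1]

initial_theta = [0.0, 0.0]
initial_cost = cost_function(X, y, initial_theta)
theta = logistic_regression(X, y, alpha=0.1, theta=initial_theta, num_iters=1000)
final_cost = cost_function(X, y, theta)
predictions = [1 if hypothesis(theta, xi) >= 0.5 else 0 for xi in X]

print("cost before:", initial_cost)
print("cost after:", final_cost)
print("predictions:", predictions)
```

The bias term is stored as a constant 1.0 in each row, so theta[0] acts as the intercept and the same update rule covers every parameter.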
- Python 3.3+
- math
- numpy
- pandas
- scikit-learn
- matplotlib
- seaborn
- jupyter notebook
Install Python3 (install python 3.6.4)
- Step 1: Download Python from this link
- Step 2: Refer to this link or this link in order to install Python
Install anaconda
- Step 1: Download Anaconda 5.1 (python 3.6 version) using this link
- Step 2: See the installation instructions given on this link
Note: If you have a different version of Python, install the Anaconda release that supports that version
Install dependencies using conda
numpy: Preinstalled with anaconda
scipy: Preinstalled with anaconda
scikit-learn: Preinstalled with anaconda
Pandas: Preinstalled with anaconda
matplotlib: Preinstalled with anaconda
seaborn: Preinstalled with anaconda
jupyter notebook: Preinstalled with anaconda
jupyter lab: Preinstalled with anaconda
- To start jupyter notebook, run the following command in cmd/terminal
$ jupyter notebook
Python and pip setup
- Python 3 is preinstalled on most Linux distributions
- Install pip for Linux from here
Command for installing dependencies
numpy: $ sudo pip install numpy
scipy: $ sudo pip install scipy
scikit-learn: $ sudo pip install -U scikit-learn
Pandas: $ sudo pip install pandas
matplotlib:
$ sudo apt-get install libfreetype6-dev libpng-dev
$ sudo pip install matplotlib
seaborn: $ sudo pip install seaborn
jupyter notebook: $ sudo apt-get -y install ipython ipython-notebook
$ sudo -H pip install jupyter
jupyter lab: $ sudo pip install jupyterlab
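After installing, it can help to confirm that the dependencies are visible to the Python interpreter you plan to use. The following is a small sketch, not part of the repository: the helper name find_missing and the list of import names are assumptions (scikit-learn is imported as sklearn, and Jupyter Notebook is assumed to be importable as notebook).

```python
import importlib.util

# Import names for the dependencies listed in this README (assumed mapping).
REQUIRED_MODULES = [
    "math",
    "numpy",
    "scipy",
    "pandas",
    "sklearn",      # scikit-learn
    "matplotlib",
    "seaborn",
    "notebook",     # Jupyter Notebook
]


def find_missing(modules):
    """Return the module names that cannot be located, without importing them."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All listed dependencies were found.")
```

Any name reported as missing can then be installed with the matching conda or pip command from the lists above.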
- For section 1: use the Understand_LR.ipynb ipython notebook
- For section 2: use the LR_from_Scratch.ipynb ipython notebook
- For section 3: use the Spam_ham_Classifier.ipynb ipython notebook
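For orientation, the Section 3 steps from the outline (transform labels, split, CountVectorizer features, train, then measure accuracy, confusion matrix, and area under curve) can be sketched with scikit-learn as follows. This is an illustrative sketch rather than the notebook's code: the messages are made up for this example, and the split size and random seed are assumptions.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split

# Made-up toy messages for illustration only (not the repository's data set).
messages = [
    "win a free prize now", "free cash offer click now",
    "claim your free reward today", "urgent offer win money",
    "cheap loans apply now", "you won a free ticket",
    "exclusive prize claim now", "free money waiting for you",
    "are we still meeting for lunch", "please review the attached report",
    "see you at the gym tomorrow", "can you call me after class",
    "the meeting moved to friday", "thanks for sending the notes",
    "dinner at my place tonight", "let me know when you arrive",
]
labels = ["spam"] * 8 + ["ham"] * 8

# Transform labels: spam -> 1, ham -> 0.
y = [1 if label == "spam" else 0 for label in labels]

# Split into training and testing sets, keeping both classes in each part.
X_train, X_test, y_train, y_test = train_test_split(
    messages, y, test_size=0.25, random_state=0, stratify=y
)

# Generate word-count features; fit the vocabulary on training text only.
vectorizer = CountVectorizer()
X_train_counts = vectorizer.fit_transform(X_train)
X_test_counts = vectorizer.transform(X_test)

# Train the model.
model = LogisticRegression()
model.fit(X_train_counts, y_train)

# Test the model: accuracy, confusion matrix, and area under the ROC curve.
y_pred = model.predict(X_test_counts)
y_score = model.predict_proba(X_test_counts)[:, 1]
accuracy = accuracy_score(y_test, y_pred)
matrix = confusion_matrix(y_test, y_pred, labels=[0, 1])
auc = roc_auc_score(y_test, y_score)

print("accuracy:", accuracy)
print("confusion matrix:")
print(matrix)
print("AUC:", auc)
```

The vectorizer is fitted on the training messages only, so words that appear only in the test set do not leak into the vocabulary.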
Thanks to the IIT-Bombay analytics club for hosting this event.