jalajthanaki/Simplify_Logistic_Regression

Simplify - Logistic Regression

This repository contains the code for building Logistic Regression from scratch.

Workshop Outline

Section 1: Understanding - Logistic Regression

        1.1 Introduction to Machine Learning

        1.2 Intuition behind the Logistic Regression (LR)

        1.3 Math behind the LR

        1.4 Summary
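For orientation, the math covered in Section 1.3 can be summarized with the standard textbook formulation of binary logistic regression. The notation here (theta for the parameters, alpha for the learning rate, m for the number of training examples) is an assumption and may differ from what the notebook uses.

```latex
\begin{align}
\sigma(z) &= \frac{1}{1 + e^{-z}} \\
h_\theta(x) &= \sigma(\theta^{\top} x) \\
J(\theta) &= -\frac{1}{m} \sum_{i=1}^{m} \Big[ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log\big(1 - h_\theta(x^{(i)})\big) \Big] \\
\frac{\partial J}{\partial \theta_j} &= \frac{1}{m} \sum_{i=1}^{m} \big( h_\theta(x^{(i)}) - y^{(i)} \big)\, x_j^{(i)} \\
\theta_j &\leftarrow \theta_j - \alpha \, \frac{\partial J}{\partial \theta_j}
\end{align}
```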

Section 2: Building - Logistic Regression from scratch

 Section 2.1: Apply LR using the Scikit-Learn API

              Step 2.1.1: Import dependencies

              Step 2.1.2: Load data

              Step 2.1.3: Preprocessing

              Step 2.1.4: Split data set into training and testing
        
              Step 2.1.5: Train model using Scikit-learn's Logistic Regression
        
              Step 2.1.6: Visualize the Dataset
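The steps of Section 2.1 might look roughly like the sketch below. It is not the notebook's code: the data set is a synthetic stand-in generated with make_classification, the preprocessing is assumed to be feature scaling, and the visualization step is left out so the example has no plotting dependency.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Load data: a synthetic two-feature binary data set (an assumption; the
# workshop notebook loads its own data).
X, y = make_classification(n_samples=200, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)

# Split into training and testing sets (75% / 25%).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Preprocessing: standardize features, fitting the scaler on the training
# split only so that no test-set statistics leak into training.
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Train scikit-learn's logistic regression and evaluate on held-out data.
model = LogisticRegression()
model.fit(X_train_scaled, y_train)
test_accuracy = model.score(X_test_scaled, y_test)
print("test accuracy:", test_accuracy)
```

The sketch splits before scaling, which differs from the order of the outline; that is a choice made here to keep the test split untouched by preprocessing.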


 Section 2.2: Apply LR from scratch

              Step 2.2.1: Define sigmoid or logistic function
        
              Step 2.2.2: Define hypothesis function
        
              Step 2.2.3: Define cost function 
        
              Step 2.2.4: Define the cost function derivative and calculate the error
        
              Step 2.2.5: Update the theta values using gradient descent
        
              Step 2.2.6: Define helper function to run Logistic Regression
        
              Step 2.2.7: Compare our Implementation with Scikit-Learn API
        
              Step 2.2.8: Run our own LR implementation
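A minimal NumPy sketch of Steps 2.2.1 through 2.2.6 follows. The function names, the learning rate, the iteration count, and the toy data are all assumptions chosen for illustration; the notebook's own implementation may be structured differently. The comparison with scikit-learn (Step 2.2.7) is omitted.

```python
import numpy as np


def sigmoid(z):
    """Logistic function: maps any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def hypothesis(theta, X):
    """Predicted probability of class 1 for every row of X."""
    return sigmoid(X @ theta)


def cost(theta, X, y):
    """Average cross-entropy (log loss) over the data set."""
    h = hypothesis(theta, X)
    eps = 1e-12  # guards against log(0)
    return -np.mean(y * np.log(h + eps) + (1 - y) * np.log(1 - h + eps))


def cost_gradient(theta, X, y):
    """Gradient of the cost: X^T (h - y) / m, where (h - y) is the error."""
    error = hypothesis(theta, X) - y
    return X.T @ error / len(y)


def gradient_descent(X, y, alpha=0.1, iterations=1000):
    """Start from theta = 0 and repeatedly step against the gradient."""
    theta = np.zeros(X.shape[1])
    for _ in range(iterations):
        theta = theta - alpha * cost_gradient(theta, X, y)
    return theta


# Toy data (illustrative only): a bias column plus one feature,
# labelled 1 whenever the feature is positive.
rng = np.random.default_rng(0)
feature = rng.normal(size=100)
X = np.column_stack([np.ones(100), feature])
y = (feature > 0).astype(float)

theta = gradient_descent(X, y)
predictions = (hypothesis(theta, X) >= 0.5).astype(float)
print("final cost:", cost(theta, X, y))
print("training accuracy:", np.mean(predictions == y))
```

The bias term is handled by adding a column of ones to X, so theta[0] plays the role of the intercept.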

Section 3: Classify emails into Spam and Ham

 Step 1: Import dependencies

 Step 2: Load data

 Step 3: Exploratory Data Analysis

 Step 4: Transforming labels

 Step 5: Split data set into training and testing

 Step 6: Generating Features using CountVectorizer

 Step 7: Train model

 Step 8: Test model 

        Step 8.1: Measure Accuracy

        Step 8.2: Confusion Matrix 

        Step 8.3: Area Under Curve
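The Section 3 pipeline can be sketched as below. The twelve messages are invented purely for illustration and are not the workshop data set; the variable names and the split size are likewise assumptions. Exploratory data analysis is skipped.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split

# Invented toy corpus (illustrative only, not the workshop data).
messages = [
    "win a free prize now", "claim your free cash reward",
    "urgent offer win money today", "free entry to win a holiday",
    "cheap loans approved instantly", "you have won a lottery prize",
    "are we still meeting for lunch", "please send me the report",
    "see you at the gym tonight", "can you call me after work",
    "the meeting is moved to monday", "thanks for your help yesterday",
]
labels = ["spam"] * 6 + ["ham"] * 6

# Transform labels: spam -> 1, ham -> 0.
y = [1 if label == "spam" else 0 for label in labels]

# Split: hold out 4 messages, stratified so both classes are present.
msg_train, msg_test, y_train, y_test = train_test_split(
    messages, y, test_size=4, stratify=y, random_state=0)

# Generate bag-of-words features; the vocabulary is learned from the
# training messages only and then reused for the test messages.
vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(msg_train)
X_test = vectorizer.transform(msg_test)

# Train the model.
model = LogisticRegression()
model.fit(X_train, y_train)

# Test the model: accuracy, confusion matrix, and area under the ROC curve.
y_pred = model.predict(X_test)
spam_probability = model.predict_proba(X_test)[:, 1]
accuracy = accuracy_score(y_test, y_pred)
matrix = confusion_matrix(y_test, y_pred, labels=[0, 1])
auc = roc_auc_score(y_test, spam_probability)

print("accuracy:", accuracy)
print("confusion matrix (rows = true ham/spam):")
print(matrix)
print("AUC:", auc)
```

With so few messages the scores themselves mean little; the point is only the order of the steps and which scikit-learn calls they map to.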

Dependencies

  • Python 3.3+
  • math (part of the Python standard library)
  • numpy
  • pandas
  • scikit-learn
  • matplotlib
  • seaborn
  • jupyter notebook
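To confirm that the packages above are importable, a small check like the following may help. The helper name check_dependencies is hypothetical and not part of this repository; note that scikit-learn is imported under the name sklearn.

```python
import importlib

# Import names for the third-party packages listed above.
REQUIRED = ["numpy", "pandas", "sklearn", "matplotlib", "seaborn"]


def check_dependencies(module_names):
    """Return {module name: version string, or None if the import fails}."""
    report = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = None
            continue
        # Not every module exposes __version__ (the stdlib "math" does not).
        report[name] = getattr(module, "__version__", "unknown")
    return report


if __name__ == "__main__":
    for name, version in check_dependencies(REQUIRED).items():
        print(f"{name:12s} {version if version else 'MISSING'}")
```

The f-string in the last line requires Python 3.6 or newer, which matches the version used in the installation instructions.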

Installation Instructions

Windows OS

Install Python 3 (version 3.6.4)

Install anaconda

  • Step 1: Download Anaconda 5.1 (python 3.6 version) using this link

  • Step 2: See the installation instruction given on this link

Note: If you have a different version of Python, install the Anaconda release that supports that version.

Install dependencies using conda

    numpy:            Included with Anaconda
    scipy:            Included with Anaconda
    scikit-learn:     Included with Anaconda
    Pandas:           Included with Anaconda
    matplotlib:       Included with Anaconda
    seaborn:          Included with Anaconda
    jupyter notebook: Included with Anaconda
    jupyter lab:      Included with Anaconda
  • To start Jupyter Notebook, run the following command in cmd/terminal:

    $ jupyter notebook

Linux OS

Python and pip setup

  • Python 3 is preinstalled on most Linux distributions
  • Install pip for Linux from here

Command for installing dependencies

numpy:            $ sudo pip install numpy
scipy:            $ sudo pip install scipy
scikit-learn:     $ sudo pip install -U scikit-learn
Pandas:           $ sudo pip install pandas
matplotlib: 
                  $ sudo apt-get install libfreetype6-dev libpng-dev
                  $ sudo pip install matplotlib 
seaborn:          $ sudo pip install seaborn
jupyter notebook: $ sudo apt-get -y install ipython ipython-notebook
                  $ sudo -H pip install jupyter
jupyter lab:      $ sudo pip install jupyterlab

Usage

  • For section 1: use Understand_LR.ipynb ipython notebook

  • For section 2: use LR_from_Scratch.ipynb ipython notebook

  • For section 3: use Spam_ham_Classifier.ipynb ipython notebook

Special Thanks

Thanks to the IIT Bombay Analytics Club for hosting this event.
