MAT-494-Deep-Learning-SNU

Summaries of Lab sessions conducted by me as a Teaching Assistant for MAT-494-Deep Learning (Monsoon-2020) at Shiv Nadar University, India. Please click on the respective lab title to access the code and datasets used.

Demonstrated solving a real-world Business problem by constructing different types of Linear Models, starting from Univariate Linear, Quadratic and Cubic models, then adding interaction terms before finally settling on a Multiple Linear Regression Model. Commonly used metrics to judge a Regression model's performance were also discussed.
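
A minimal Python sketch of this workflow, using synthetic data and placeholder column names rather than the lab's actual business dataset:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Placeholder data standing in for the lab's business dataset.
rng = np.random.default_rng(0)
df = pd.DataFrame({"x1": rng.uniform(0, 10, 200), "x2": rng.uniform(0, 5, 200)})
df["y"] = 3 + 2 * df.x1 - 0.4 * df.x1**2 + 1.5 * df.x2 + rng.normal(0, 1, 200)

# Progressively richer design matrices: univariate, quadratic, with an interaction term.
designs = {
    "univariate": df[["x1"]],
    "quadratic": df.assign(x1_sq=df.x1**2)[["x1", "x1_sq"]],
    "interaction": df.assign(x1_sq=df.x1**2, x1_x2=df.x1 * df.x2)[["x1", "x1_sq", "x2", "x1_x2"]],
}
for name, X in designs.items():
    pred = LinearRegression().fit(X, df.y).predict(X)
    print(f"{name}: RMSE={mean_squared_error(df.y, pred) ** 0.5:.3f}, R^2={r2_score(df.y, pred):.3f}")
```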

Did a quick recap of the most popular and commonly used Machine Learning Classifiers and covered the metrics used to judge a Classifier's performance for both balanced and imbalanced classification. These Classifiers were then used to solve an imbalanced-class problem of predicting a Wine's Quality from a real-world dataset containing details about Red Wines from northwest Portugal.
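
A rough sketch of this benchmarking step; the UCI download URL and the "quality >= 7" cutoff are assumptions, not necessarily the exact choices made in the lab:

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumed UCI mirror of the red-wine quality data (semicolon separated).
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-red.csv"
wine = pd.read_csv(url, sep=";")
X = wine.drop(columns="quality")
y = (wine.quality >= 7).astype(int)  # "good" wines vs the rest: an imbalanced split

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=42)
models = {
    "LogisticRegression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "RandomForest": RandomForestClassifier(random_state=42),
}
for name, clf in models.items():
    pred = clf.fit(X_tr, y_tr).predict(X_te)
    print(name, "balanced accuracy:", round(balanced_accuracy_score(y_te, pred), 3))
    print(classification_report(y_te, pred))  # per-class precision/recall/F1
```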

Building on the learnings from the previous lab on the Red Wine dataset, this time a dataset comprising White Wines was used. The procedure of constructing a basic Neural Network architecture (ANN Classifier) was demonstrated using Keras in Python. Finally, its performance was compared with other popular tree-based ensemble classifiers like Random Forests and Gradient Boosting (XGBoost).
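
A small Keras sketch of the kind of ANN classifier built here; the layer sizes, training settings and placeholder arrays are illustrative only:

```python
import numpy as np
from tensorflow import keras

def build_ann(n_features: int) -> keras.Model:
    """A small feed-forward binary classifier of the kind built in this lab."""
    model = keras.Sequential([
        keras.layers.Input(shape=(n_features,)),
        keras.layers.Dense(32, activation="relu"),
        keras.layers.Dense(16, activation="relu"),
        keras.layers.Dense(1, activation="sigmoid"),  # probability of a "good" wine
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model

# Placeholder arrays with the white-wine feature count (11 physico-chemical inputs).
X = np.random.rand(500, 11).astype("float32")
y = np.random.randint(0, 2, 500)

ann = build_ann(X.shape[1])
ann.fit(X, y, epochs=5, batch_size=32, validation_split=0.2, verbose=0)
# The same X, y can be fed to RandomForestClassifier / XGBClassifier
# to reproduce the tree-based comparison from the lab.
```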

Important aspects of building good predictive models, such as Exploratory Data Analysis (EDA) and Feature Engineering, were demonstrated using two Hotel Booking datasets in Python. The focus of this lab was on improving model accuracy through dimensionality reduction using EDA and Feature Engineering. In the end, some basic Neural Network classifiers (ANNs) were created to predict whether a user would cancel their booking or not.
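
An illustrative feature-engineering snippet in the spirit of this lab; the column names follow the widely used hotel-booking-demand data and the tiny inline frame is only a placeholder for the real files:

```python
import pandas as pd

# Placeholder rows with (assumed) hotel-booking-demand column names.
bookings = pd.DataFrame({
    "stays_in_weekend_nights": [1, 0, 2],
    "stays_in_week_nights": [2, 5, 0],
    "adults": [2, 2, 1],
    "children": [1, 0, 0],
    "is_canceled": [0, 1, 0],
})
bookings["total_nights"] = bookings.stays_in_weekend_nights + bookings.stays_in_week_nights
bookings["total_guests"] = bookings.adults + bookings.children
bookings["is_family"] = (bookings.total_guests > 2).astype(int)

# Quick EDA: cancellation rate split by the newly engineered family flag.
print(bookings.groupby("is_family").is_canceled.mean())
```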

The two Hotel Booking datasets of Resort and City Hotels were merged into a single dataset. It was then used to demonstrate two tasks:-

  • Classification: To predict whether a new user would cancel his booking or not
  • Regression: To predict the price of a booked Hotel room on a particular day.

Findings from EDA and newly engineered features from the previous lab were used to build complicated Neural Network architectures using the neuralnet package in R. The focus of this lab was more on building better ANNs with correct choices of layers, neurons, activation functions and other hyperparameters.
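
The lab itself used R's neuralnet package; the following is only a rough Keras analogue of the regression task (predicting the room price), with illustrative layer, neuron and activation choices:

```python
import numpy as np
from tensorflow import keras

def build_price_regressor(n_features: int) -> keras.Model:
    """Feed-forward regressor for the room-price task; sizes are illustrative."""
    model = keras.Sequential([
        keras.layers.Input(shape=(n_features,)),
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dense(32, activation="relu"),
        keras.layers.Dense(1, activation="linear"),  # continuous price output
    ])
    model.compile(optimizer="adam", loss="mse", metrics=["mae"])
    return model

# Placeholder features standing in for the engineered hotel-booking features.
X = np.random.rand(1000, 20).astype("float32")
y = 100 * X[:, 0] + 20 * X[:, 1] + np.random.normal(0, 5, 1000).astype("float32")

model = build_price_regressor(X.shape[1])
model.fit(X, y, epochs=5, batch_size=64, validation_split=0.2, verbose=0)
```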

The original research paper on General Regression Neural Networks (GRNNs), "a one-shot learning algorithm which does not assume any functional form for the underlying regression surface and solves the problem of getting stuck at local minima", was explained, followed by its implementation in R. The use of GRNNs was motivated with a very small dataset of merely 71 observations, used to predict the Body Fat of a person. It was shown that in scenarios where traditional ANNs and ML methods don't perform well, GRNNs can be used to perform regression tasks even with extremely small sample sizes.
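
A minimal NumPy sketch of the GRNN estimator from the paper: the prediction is a Gaussian-kernel-weighted average of the training targets, controlled by a single smoothing parameter sigma and requiring no iterative training (the toy data below is not the body-fat dataset):

```python
import numpy as np

def grnn_predict(X_train, y_train, X_new, sigma=0.5):
    """GRNN prediction: kernel-weighted average of training targets, no iterative training."""
    # Squared Euclidean distance between every query point and every training point.
    d2 = ((X_new[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-d2 / (2 * sigma**2))        # Gaussian kernel weights
    return (w @ y_train) / w.sum(axis=1)    # weighted average of the targets

# Toy usage mimicking the small-sample (n = 71) regression setting.
rng = np.random.default_rng(1)
X = rng.uniform(0, 1, (71, 3))
y = 10 * X[:, 0] + rng.normal(0, 0.1, 71)
print(grnn_predict(X, y, X[:5], sigma=0.3))
```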

Gave a summary of the original paper on Probabilistic Neural Networks (PNNs). The implementation of PNNs was covered in R using the Pima Indians Diabetes dataset to build a binary classifier for predicting whether a pregnant woman is diabetic or not.
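
A NumPy sketch of a PNN in the spirit of the paper: each class gets a Parzen-window (Gaussian kernel) density estimate and the predicted class is the one with the highest estimated density. The lab's implementation was in R, and the toy data below merely stands in for the diabetes task:

```python
import numpy as np

def pnn_predict(X_train, y_train, X_new, sigma=1.0):
    """Predict the class whose Parzen-window (Gaussian kernel) density estimate is highest."""
    classes = np.unique(y_train)
    scores = []
    for c in classes:
        Xc = X_train[y_train == c]
        d2 = ((X_new[:, None, :] - Xc[None, :, :]) ** 2).sum(axis=2)
        scores.append(np.exp(-d2 / (2 * sigma**2)).mean(axis=1))  # estimated class density
    return classes[np.argmax(np.stack(scores, axis=1), axis=1)]

# Toy binary example standing in for the diabetes-prediction task.
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 1, (50, 4)), rng.normal(2, 1, (50, 4))])
y = np.array([0] * 50 + [1] * 50)
print(pnn_predict(X, y, X[:3], sigma=0.8))
```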

  • Gave a quick summary of the theory behind Time Series & Forecasting along with its real-world applications (see presentation). Traditional forecasting methods like AR, MA, ARMA, ARIMA, ARCH & GARCH were explained with their implementations on a univariate time series in Python.
  • Explained how to engineer new features and convert traditional time series forecasting problems into supervised Machine Learning problems. The results obtained by the models built in the previous step were then compared with ML models like Linear Regression, XGBoost, Random Forests etc. (a short sketch covering both steps follows this list).
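
A short Python sketch covering both steps above, with an illustrative ARIMA order and a placeholder series rather than the lab's data:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from statsmodels.tsa.arima.model import ARIMA

# Placeholder univariate series (a random walk) standing in for the lab's data.
rng = np.random.default_rng(3)
y = pd.Series(np.cumsum(rng.normal(0, 1, 300)))

# Step 1: classical forecasting with an (illustrative) ARIMA(1, 1, 1).
arima_forecast = ARIMA(y.iloc[:-10], order=(1, 1, 1)).fit().forecast(steps=10)

# Step 2: recast the series as a supervised problem with lag features.
frame = pd.DataFrame({f"lag_{k}": y.shift(k) for k in (1, 2, 3)}).assign(target=y).dropna()
X, t = frame.drop(columns="target"), frame["target"]
ml_forecast = LinearRegression().fit(X.iloc[:-10], t.iloc[:-10]).predict(X.iloc[-10:])

print(arima_forecast.values)
print(ml_forecast)
```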

Two types of Recurrent Neural Networks that work well even with little data, viz. Elman RNNs & Jordan RNNs, were implemented in R to forecast the daily number of deaths in the UK from lung diseases.
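
The lab used R; as a rough Python analogue, a Keras SimpleRNN layer is essentially an Elman-style recurrent network. The windowing length, layer size and random series below are illustrative only:

```python
import numpy as np
from tensorflow import keras

def make_windows(series, window=12):
    """Turn a univariate series into (samples, window, 1) inputs and next-step targets."""
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    return X[..., None].astype("float32"), series[window:].astype("float32")

series = np.random.rand(200)          # placeholder for the death-count series
X, y = make_windows(series)

model = keras.Sequential([
    keras.layers.Input(shape=(X.shape[1], 1)),
    keras.layers.SimpleRNN(16, activation="tanh"),  # Elman-style recurrent layer
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, verbose=0)
```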

Limitations of Linear Dimensionality Reduction methods such as PCA were discussed and the use of Non-Linear Dimensionality Reduction was motivated. The theory behind two such methods, Autoencoders & Stacked Autoencoders, was discussed and their implementation in R was demonstrated using the Abalone dataset.
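
The lab's implementation was in R; this Keras sketch shows the same idea in Python: compress the inputs to a low-dimensional bottleneck, train the network to reconstruct them, then reuse the encoder as a non-linear dimensionality reducer (placeholder data, with an assumed 8 numeric input features for an Abalone-like table):

```python
import numpy as np
from tensorflow import keras

n_features, code_dim = 8, 2                      # assumes 8 numeric inputs (Abalone-like)
inputs = keras.Input(shape=(n_features,))
encoded = keras.layers.Dense(4, activation="relu")(inputs)
code = keras.layers.Dense(code_dim, activation="relu")(encoded)   # bottleneck layer
decoded = keras.layers.Dense(4, activation="relu")(code)
outputs = keras.layers.Dense(n_features, activation="linear")(decoded)

autoencoder = keras.Model(inputs, outputs)
encoder = keras.Model(inputs, code)              # reusable non-linear projection
autoencoder.compile(optimizer="adam", loss="mse")

X = np.random.rand(500, n_features).astype("float32")   # placeholder data
autoencoder.fit(X, X, epochs=5, batch_size=32, verbose=0)
low_dim = encoder.predict(X)                     # 2-D representation of each row
```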

Student Projects:-
