This is a collection of artefacts demonstrating my skills in data analysis and machine learning. I used various tools within Python, R and Excel; the .pdf reports were compiled from LaTeX.
Some of the source code is included within the reports, but most of it is kept in private repositories to prevent plagiarism, which UNSW prohibits. I will eventually remove all references to UNSW course names so that I can publish my work in its entirety. In the meantime, if you would like access to any source code, please email me at [email protected].
Python, Pandas, NumPy, Matplotlib
- Cleaned and combined datasets from Victorian government, Queensland government and ABS.
R, RMarkdown
- Exploratory analysis and data correction
- Classification using discriminant analysis and support vector machines
- Multiple linear regression
MS Excel
- Decision tree
- SMART analysis
- MCDM (multi-criteria decision making) for infrastructure project prioritisation
R
- Exploratory analysis
- Linear regression with confirmatory hypothesis test
MS Excel with Solver add-on
- Optimisation of manufacturing cost
- Optimisation of staff roster
- Prescriptive analysis for government school funding allocations
Python, Scikit-learn, Matplotlib
- Exploration and cleaning
- Classification with neural networks, comparing different solvers and optimisation strategies
- Model assessment and selection using confidence intervals, ROC (receiver operating characteristic) curve and confusion matrix
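
To give a flavour of the model assessment step, here is a minimal sketch of a binary confusion matrix and a confidence interval for accuracy. It is not the code from the private repository: the function names, the use of a normal-approximation interval and the 95% level (`z=1.96`) are assumptions made for this example, and the labels are made up.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for 0/1 labels."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            tp += 1
        elif actual == 0 and predicted == 1:
            fp += 1
        elif actual == 0 and predicted == 0:
            tn += 1
        else:
            fn += 1
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn}


def accuracy_confidence_interval(n_correct, n_total, z=1.96):
    """Normal-approximation interval for accuracy, clipped to [0, 1].

    z=1.96 corresponds to roughly a 95% interval (an assumed default).
    """
    accuracy = n_correct / n_total
    half_width = z * math.sqrt(accuracy * (1 - accuracy) / n_total)
    return max(0.0, accuracy - half_width), min(1.0, accuracy + half_width)


# Hypothetical labels, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

counts = confusion_counts(y_true, y_pred)
n_correct = counts["tp"] + counts["tn"]
lower, upper = accuracy_confidence_interval(n_correct, len(y_true))
print(counts, lower, upper)
```

With so few samples the interval is very wide, which is the reason to report it alongside the point estimate when comparing models.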
Python, Pandas, NumPy, Matplotlib, Scikit-learn
- Exploratory analysis and cleaning
- Linear regression by gradient descent, implemented from scratch, including normalisation strategy
- Model assessment with error metrics (RMSE, R-squared, etc.)
- Feature selection
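
As an illustration of the regression item in the list above, here is a minimal sketch of linear regression fitted by batch gradient descent with feature normalisation. It is not the code from the private repository: the function names, the learning rate, the iteration count and the choice of z-score scaling as the normalisation strategy are all assumptions made for this example.

```python
import numpy as np


def fit_linear_regression_gd(X, y, learning_rate=0.1, n_iterations=2000):
    """Fit y ~ intercept + X @ coefficients by batch gradient descent.

    Features are z-score normalised first (an assumed strategy) so that
    a single learning rate works for features on different scales.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # avoid dividing by zero for constant features
    X_scaled = (X - mean) / std

    # Prepend a column of ones so weights[0] acts as the intercept.
    X_design = np.column_stack([np.ones(len(X_scaled)), X_scaled])
    weights = np.zeros(X_design.shape[1])

    for _ in range(n_iterations):
        residuals = X_design @ weights - y
        # Gradient of the mean squared error with respect to the weights.
        gradient = (2.0 / len(y)) * (X_design.T @ residuals)
        weights -= learning_rate * gradient

    return weights, mean, std


def predict(X, weights, mean, std):
    """Apply the training-set scaling, then the fitted linear model."""
    X_scaled = (np.asarray(X, dtype=float) - mean) / std
    return weights[0] + X_scaled @ weights[1:]


def rmse(y_true, y_pred):
    """Root mean squared error."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))
```

Returning `mean` and `std` alongside the weights means new data can be scaled with the training statistics rather than its own, which keeps predictions consistent with the fitted model.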