This repository contains a series of projects exploring the applications of machine learning techniques in financial data analysis and investment strategy development. The projects involve predictive modeling, classification, time series forecasting, and portfolio optimization using real-world and simulated financial data.
- Problem Set 1 (PS1)
- Problem Set 2 (PS2)
- Problem Set 3 (PS3)
- Final Project
- Machine Learning Methods Used
- Libraries Used
- Running the Code
Problem Set 1 (PS1)
Objective:
Predict stock returns using regression methods and evaluate model performance.
Techniques:
- Linear Regression: Basic modeling of returns.
- Ridge Regression: Regularization to prevent overfitting.
- Lasso Regression: Variable selection and regularization.
Key Concepts:
- Model selection based on predictive accuracy.
- Bias-variance tradeoff analysis.
📂 See folder: ps1
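As a rough illustration of the regularization and bias-variance ideas in this problem set, the sketch below fits ordinary least squares and ridge regression in closed form with NumPy on simulated data. It is not the code in ps1: the data-generating process, the penalty value `alpha=10.0`, and the helper names `fit_ridge` and `mse` are assumptions made for this example. Lasso is left out because it has no closed-form solution.

```python
import numpy as np


def fit_ridge(X, y, alpha):
    """Closed-form ridge: solve (X'X + alpha*I) beta = X'y. alpha=0 gives OLS."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)


def mse(X, y, beta):
    """Mean squared error of the linear prediction X @ beta."""
    residuals = y - X @ beta
    return float(np.mean(residuals ** 2))


# Simulated data (an assumption): 20 candidate predictors, only 3 matter.
# Features and noise are zero-mean, so no intercept is fitted here.
rng = np.random.default_rng(0)
n_obs, n_features = 60, 20
X = rng.normal(size=(n_obs, n_features))
true_beta = np.zeros(n_features)
true_beta[:3] = [0.5, -0.3, 0.2]
y = X @ true_beta + rng.normal(scale=1.0, size=n_obs)

# Hold out the last 20 observations for out-of-sample evaluation.
X_train, X_test = X[:40], X[40:]
y_train, y_test = y[:40], y[40:]

beta_ols = fit_ridge(X_train, y_train, alpha=0.0)
beta_ridge = fit_ridge(X_train, y_train, alpha=10.0)

for name, beta in [("OLS", beta_ols), ("Ridge", beta_ridge)]:
    print(
        f"{name:5s} train MSE={mse(X_train, y_train, beta):.3f} "
        f"test MSE={mse(X_test, y_test, beta):.3f} "
        f"coef norm={np.linalg.norm(beta):.3f}"
    )
```

OLS always attains the lowest training error, while ridge shrinks the coefficient vector toward zero. Whether that shrinkage improves the test error depends on the data and on the chosen penalty, which is why model selection is done on held-out data.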
Problem Set 2 (PS2)
Objective:
Build classifiers to predict whether a stock's future return will be positive or negative.
Techniques:
- Logistic Regression: Baseline classification method.
- k-Nearest Neighbors (k-NN): Non-parametric classification.
- Decision Trees: Simple tree-based modeling.
- Random Forests: Ensemble learning for better generalization.
Key Concepts:
- Classification accuracy, confusion matrices.
- Model complexity vs performance.
📂 See folder: ps2
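The evaluation concepts above (classification accuracy and confusion matrices) can be sketched without any library. The returns and predictions below are made-up numbers for illustration, and the function names `confusion_matrix` and `accuracy` are assumptions of this example, not names taken from ps2.

```python
def confusion_matrix(y_true, y_pred):
    """Return counts (tn, fp, fn, tp) for binary labels, where 1 = positive return."""
    tn = fp = fn = tp = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            tp += 1
        elif actual == 0 and predicted == 1:
            fp += 1
        elif actual == 1 and predicted == 0:
            fn += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def accuracy(y_true, y_pred):
    """Share of observations whose predicted direction matches the realized one."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
    return (tp + tn) / (tn + fp + fn + tp)


# Hypothetical future returns, converted to up (1) / down (0) labels.
future_returns = [0.012, -0.004, 0.007, -0.010, 0.003, -0.002]
y_true = [1 if r > 0 else 0 for r in future_returns]

# Hypothetical classifier output for the same six observations.
y_pred = [1, 1, 1, 0, 0, 0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
print(f"accuracy={accuracy(y_true, y_pred):.3f}")
```

The confusion matrix separates the two kinds of error (predicting a gain that turns into a loss, and the reverse), which a single accuracy figure hides.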
Problem Set 3 (PS3)
Objective:
Model and forecast stock return volatility using time series methods.
Techniques:
- Autoregressive Models (AR)
- Moving Average Models (MA)
- ARMA / ARIMA Models: For capturing complex dependencies.
- GARCH Models: For volatility clustering and dynamic variance modeling.
Key Concepts:
- Time series stationarity and differencing.
- Volatility forecasting.
📂 See folder: ps3
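To make the idea of dynamic variance modeling concrete, here is a minimal sketch of the GARCH(1,1) conditional variance recursion, sigma2[t] = omega + alpha * r[t-1]^2 + beta * sigma2[t-1]. The parameter values and returns are assumptions chosen for illustration, the returns are treated as zero-mean, and nothing is estimated here. In practice the parameters would be fitted by maximum likelihood, for example with the `arch` package listed among the libraries.

```python
def garch11_variance(returns, omega, alpha, beta):
    """Conditional variance path of a GARCH(1,1) model with known parameters.

    sigma2[t] = omega + alpha * returns[t-1]**2 + beta * sigma2[t-1],
    started at the unconditional variance omega / (1 - alpha - beta).
    """
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        raise ValueError("need omega > 0, alpha, beta >= 0 and alpha + beta < 1")
    sigma2 = [omega / (1.0 - alpha - beta)]
    for r in returns[:-1]:
        sigma2.append(omega + alpha * r ** 2 + beta * sigma2[-1])
    return sigma2


def garch11_forecast(returns, omega, alpha, beta):
    """One-step-ahead variance forecast given the full return history."""
    sigma2 = garch11_variance(returns, omega, alpha, beta)
    return omega + alpha * returns[-1] ** 2 + beta * sigma2[-1]


# Hypothetical daily returns and parameters (assumptions, not fitted values).
daily_returns = [0.010, -0.020, 0.015, -0.030, 0.005]
omega, alpha, beta = 1e-5, 0.10, 0.85

variance_path = garch11_variance(daily_returns, omega, alpha, beta)
next_variance = garch11_forecast(daily_returns, omega, alpha, beta)

for t, s2 in enumerate(variance_path):
    print(f"t={t} conditional volatility={s2 ** 0.5:.4f}")
print(f"forecast volatility for next period={next_variance ** 0.5:.4f}")
```

A large squared return raises the next period's variance through `alpha`, and `beta` makes that increase decay slowly. This is the mechanism behind volatility clustering.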
Final Project
Objective:
Develop a factor-based investment strategy using advanced machine learning techniques and evaluate its performance on real-world data.
Techniques:
- Linear Regression with Feature Engineering: Create predictors based on financial factors.
- Ensemble Methods (Bagging, Boosting): Improve predictive performance.
- Sharpe Ratio and Return Metrics: Evaluate investment profitability and risk-adjusted returns.
Key Concepts:
- Model interpretability.
- Backtesting and out-of-sample evaluation.
📂 See folder: project
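The return and risk-adjusted metrics mentioned above can be computed as in the sketch below. It assumes daily strategy returns, 252 trading days per year, and a constant per-period risk-free rate. These conventions, the sample numbers, and the function names are assumptions of this example and may differ from what the project code uses.

```python
import math
import statistics


def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualized Sharpe ratio: mean excess return over its sample std deviation."""
    excess = [r - risk_free_rate for r in returns]
    volatility = statistics.stdev(excess)  # sample standard deviation (n - 1)
    if volatility == 0:
        raise ValueError("Sharpe ratio is undefined for zero-volatility returns")
    return statistics.mean(excess) / volatility * math.sqrt(periods_per_year)


def cumulative_return(returns):
    """Total compounded return over the whole period."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0


# Hypothetical out-of-sample daily returns of a backtested strategy.
strategy_returns = [0.010, -0.005, 0.007, 0.002, -0.003]

print(f"cumulative return = {cumulative_return(strategy_returns):.4%}")
print(f"annualized Sharpe = {sharpe_ratio(strategy_returns):.2f}")
```

For a backtest to be informative, these metrics should be computed only on returns from periods the model did not see during fitting.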
Machine Learning Methods Used
- Regression Models: Linear Regression, Ridge Regression, Lasso Regression
- Classification Models: Logistic Regression, k-Nearest Neighbors (k-NN), Decision Trees, Random Forests
- Time Series Models: AR, MA, ARMA, ARIMA, GARCH
- Ensemble Methods: Bagging, Boosting
- Evaluation Metrics: Mean Squared Error (MSE), Accuracy, Sharpe Ratio, Return Analysis
Libraries Used
- numpy
- pandas
- matplotlib
- seaborn
- scikit-learn
- statsmodels
- arch (for GARCH models)
- scipy
Running the Code
Each subproject (ps1, ps2, ps3, project) contains its own Python scripts or Jupyter notebooks.
To run a project:
- Clone the repository:
  git clone https://github.com/damlakayikci/Financial-Applications-of-Machine-Learning.git
  cd Financial-Applications-of-Machine-Learning
- Install the necessary packages:
  pip install numpy pandas matplotlib seaborn scikit-learn statsmodels arch scipy
- Navigate to the desired subproject folder and run its script, for example:
  cd ps1
  python ps1_solution.py
Note: Some subprojects are implemented as Jupyter notebooks (.ipynb). You can open them using:
  jupyter notebook