Varshith-Yadav/StockMarket_Forecasting


📈 Stock Market Time Series Forecasting with LSTM & Transformer (PyTorch)

Overview

This project implements an end-to-end deep learning pipeline for stock market time series forecasting using LSTM and Transformer models in PyTorch.

The goal is not trading or profit prediction, but to:

  • Model temporal dependencies in financial time series
  • Compare deep learning models against strong baselines
  • Analyze model behavior, limitations, and failure modes

This project follows standard ML engineering practices for time series work: time-aware splits, baselines, consistent evaluation, and error analysis.


Problem Statement

Given historical stock price data (OHLCV), predict the next-day closing price using past observations.

Key constraints:

  • Time-aware training (no data leakage)
  • Fair comparison with naïve baselines
  • Focus on model reliability and interpretability

Dataset

  • Source: Yahoo Finance (yfinance)
  • Stock: AAPL (Apple Inc.)
  • Frequency: Daily
  • Time Range: 2015-01-01 → 2024-12-31

Features Used

  • Open
  • High
  • Low
  • Close
  • Volume
  • Daily return
  • Log return

Target

  • Next-day closing price
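The README does not give the exact formulas for the return features, so the following is a minimal sketch assuming the usual definitions: simple return `C_t / C_{t-1} - 1` and log return `log(C_t / C_{t-1})`. The function name `build_features_and_target` is hypothetical, and only the close series is used here to keep the alignment logic visible.

```python
import numpy as np

def build_features_and_target(close):
    """Return (close_t, daily_return_t, log_return_t, next_day_close) on aligned rows.

    Assumed definitions (not confirmed by the repository):
      daily_return_t = C_t / C_{t-1} - 1
      log_return_t   = log(C_t / C_{t-1})
      target_t       = C_{t+1}
    """
    close = np.asarray(close, dtype=float)
    ratio = close[1:] / close[:-1]          # defined for days 1 .. n-1
    daily_return = ratio - 1.0
    log_return = np.log(ratio)

    # Returns need a previous day and the target needs a next day,
    # so only days 1 .. n-2 have both.
    return close[1:-1], daily_return[:-1], log_return[:-1], close[2:]
```

The first and last days are dropped because they lack either a previous close (for the returns) or a next close (for the target).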

Methodology

1. Exploratory Data Analysis (EDA)

  • Verified time continuity and data integrity
  • Identified non-stationarity and volatility regimes
  • Observed trend, seasonality, and regime shifts

2. Preprocessing

  • Engineered returns and log-returns to stabilize variance
  • Time-aware train/validation/test split:
    • Train: 2015–2020
    • Validation: 2021–2022
    • Test: 2023–2024
  • Feature scaling using StandardScaler (fit on train only)
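A minimal sketch of the split and scaling steps above, assuming cut-off dates of 2020-12-31 and 2022-12-31 to match the listed year ranges. The project uses scikit-learn's `StandardScaler`; the sketch below reproduces the same mean/standard-deviation scaling with plain NumPy so the "fit on train only" step is explicit. All function names are hypothetical.

```python
import numpy as np

def chronological_split(dates, features, train_end="2020-12-31", val_end="2022-12-31"):
    """Split rows by date, never by shuffling, so later data cannot leak into training."""
    dates = np.asarray(dates, dtype="datetime64[D]")
    features = np.asarray(features)
    train_end, val_end = np.datetime64(train_end), np.datetime64(val_end)
    train = features[dates <= train_end]
    val = features[(dates > train_end) & (dates <= val_end)]
    test = features[dates > val_end]
    return train, val, test

def fit_standardizer(train):
    """Compute per-feature statistics on the training rows only."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)        # population std (ddof=0), as StandardScaler uses
    std[std == 0] = 1.0            # avoid division by zero for constant features
    return mean, std

def standardize(x, mean, std):
    """Apply training-set statistics to any split."""
    return (x - mean) / std
```

Validation and test rows are transformed with the training statistics, so their scaled values need not have zero mean or unit variance.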

3. Sliding Window Formulation

Converted the time series into supervised learning format:

  • Input window: 60 trading days
  • Forecast horizon: 1 day
  • X → (batch, 60, num_features)
  • y → (batch, 1)
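The windowing step can be sketched as follows. This is an illustration under one assumption: `target[t]` already holds the next-day close for row `t`, so each window is paired with the target of its last row. The name `make_windows` is hypothetical.

```python
import numpy as np

def make_windows(features, target, window=60):
    """Turn a (T, F) feature matrix into X of shape (N, window, F) and y of shape (N, 1).

    Window i covers rows i .. i+window-1 and is paired with target[i+window-1],
    i.e. the value to predict from the last row of that window.
    """
    features = np.asarray(features)
    target = np.asarray(target)
    n = len(features) - window + 1
    if n <= 0:
        raise ValueError("series is shorter than the window length")
    X = np.stack([features[i:i + window] for i in range(n)])
    y = target[window - 1:].reshape(-1, 1)
    return X, y
```

Windows should be built within each split (or the split applied by the date of the last row in each window) so that no training window reaches into validation or test data.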

Baseline Models

Strong baselines were implemented to justify model complexity:

  1. Naive Forecast
    • Predicts last observed value
  2. Moving Average Forecast
    • Mean of last 5 timesteps

These baselines set the reference error that the deep learning models must beat to justify their added complexity.
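Both baselines can be computed directly from the input windows. The sketch below assumes windows of shape `(N, window, F)` holding unscaled prices, with the close price at column `close_idx`; the function names and the `close_idx` parameter are hypothetical.

```python
import numpy as np

def naive_forecast(X, close_idx):
    """Predict the next close as the last observed close in each window."""
    return X[:, -1, close_idx]

def moving_average_forecast(X, close_idx, k=5):
    """Predict the next close as the mean of the last k closes in each window."""
    return X[:, -k:, close_idx].mean(axis=1)
```

If the windows hold standardized features, the predictions need to be mapped back to price units before comparing errors with the deep learning models.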


Deep Learning Models

LSTM

  • 2-layer LSTM
  • Hidden size: 64
  • Dropout for regularization
  • Uses final hidden state for prediction

Motivation:
LSTMs handle medium-range temporal dependencies and mitigate vanishing gradients.
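A minimal sketch of an LSTM matching the bullets above (2 layers, hidden size 64, dropout, final hidden state). The class name `LSTMForecaster`, the dropout rate of 0.2, and the single linear output head are assumptions; the repository's actual implementation may differ.

```python
import torch
from torch import nn

class LSTMForecaster(nn.Module):
    def __init__(self, num_features, hidden_size=64, num_layers=2, dropout=0.2):
        super().__init__()
        # In nn.LSTM, dropout is applied between stacked layers (not after the last one).
        self.lstm = nn.LSTM(
            input_size=num_features,
            hidden_size=hidden_size,
            num_layers=num_layers,
            dropout=dropout,
            batch_first=True,
        )
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):
        # x: (batch, seq_len, num_features)
        _, (h_n, _) = self.lstm(x)      # h_n: (num_layers, batch, hidden_size)
        return self.head(h_n[-1])       # final hidden state of the top layer -> (batch, 1)
```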


Transformer

  • Encoder-only Transformer
  • Positional encoding
  • Multi-head self-attention
  • Causal sequence modeling

Motivation:
Transformers capture long-range dependencies via attention, but require more data and regularization.
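The components listed above can be sketched as below. This is one plausible arrangement, not the repository's confirmed architecture: the sinusoidal form of the positional encoding, the values `d_model=64`, `nhead=4`, `num_layers=2`, and reading the prediction from the last time step are all assumptions, as are the class names.

```python
import math
import torch
from torch import nn

class SinusoidalPositionalEncoding(nn.Module):
    def __init__(self, d_model, max_len=512):
        super().__init__()
        # Assumes d_model is even.
        position = torch.arange(max_len).unsqueeze(1)
        div_term = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2] = torch.sin(position * div_term)
        pe[:, 1::2] = torch.cos(position * div_term)
        self.register_buffer("pe", pe.unsqueeze(0))   # (1, max_len, d_model)

    def forward(self, x):
        return x + self.pe[:, : x.size(1)]

class TransformerForecaster(nn.Module):
    def __init__(self, num_features, d_model=64, nhead=4, num_layers=2, dropout=0.1):
        super().__init__()
        self.input_proj = nn.Linear(num_features, d_model)
        self.pos_enc = SinusoidalPositionalEncoding(d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=nhead, dim_feedforward=4 * d_model,
            dropout=dropout, batch_first=True,
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.head = nn.Linear(d_model, 1)

    def forward(self, x):
        # x: (batch, seq_len, num_features)
        seq_len = x.size(1)
        # True above the diagonal = position i may not attend to positions j > i.
        causal_mask = torch.triu(
            torch.ones(seq_len, seq_len, dtype=torch.bool, device=x.device), diagonal=1
        )
        h = self.encoder(self.pos_enc(self.input_proj(x)), mask=causal_mask)
        return self.head(h[:, -1])      # representation of the last day -> (batch, 1)
```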


Training Setup

  • Framework: PyTorch
  • Optimizer: Adam
  • Loss: Mean Squared Error (MSE)
  • Early stopping via validation monitoring
  • GPU support when available
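A sketch of a training loop consistent with the setup above: Adam, MSE loss, early stopping on validation loss, and GPU use when available. The function name, the learning rate, the patience of 5 epochs, and restoring the best weights at the end are assumptions rather than details taken from the repository.

```python
import copy
import torch
from torch import nn

def train_with_early_stopping(model, train_loader, val_loader,
                              max_epochs=50, patience=5, lr=1e-3):
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    best_val, best_state, bad_epochs = float("inf"), None, 0

    for _ in range(max_epochs):
        model.train()
        for xb, yb in train_loader:
            xb, yb = xb.to(device), yb.to(device)
            optimizer.zero_grad()
            loss_fn(model(xb), yb).backward()
            optimizer.step()

        # Sample-weighted mean validation loss for this epoch.
        model.eval()
        total, count = 0.0, 0
        with torch.no_grad():
            for xb, yb in val_loader:
                xb, yb = xb.to(device), yb.to(device)
                total += loss_fn(model(xb), yb).item() * len(xb)
                count += len(xb)
        val_loss = total / count

        if val_loss < best_val:
            best_val, bad_epochs = val_loss, 0
            best_state = copy.deepcopy(model.state_dict())
        else:
            bad_epochs += 1
            if bad_epochs >= patience:   # no improvement for `patience` epochs
                break

    if best_state is not None:
        model.load_state_dict(best_state)  # restore the best checkpoint
    return best_val
```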

Evaluation Metrics

  • Mean Absolute Error (MAE)
  • Root Mean Squared Error (RMSE)
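For reference, both metrics in a few lines of NumPy. MAE weights all errors equally, while RMSE penalizes large errors more heavily, which is why the two can rank models differently in volatile periods.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean Absolute Error."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))

def rmse(y_true, y_pred):
    """Root Mean Squared Error."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
```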

Evaluation Strategy

  • Same test set for baselines and DL models
  • Visual comparison of predictions vs actuals
  • Residual analysis during high-volatility periods
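One possible way to carry out the residual analysis is to compare absolute errors on high-volatility days against all other days. The sketch below is an assumption about how such a comparison could be set up, not the repository's method: volatility is taken as a trailing 20-day standard deviation of returns, and "high" means at or above the 90th percentile. The function name and both thresholds are hypothetical.

```python
import numpy as np

def mae_by_volatility(y_true, y_pred, returns, vol_window=20, quantile=0.9):
    """Return (MAE on high-volatility days, MAE on remaining days)."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    returns = np.asarray(returns, dtype=float)
    abs_err = np.abs(y_true - y_pred)

    # Trailing rolling std of returns; the first vol_window-1 days have no estimate.
    vol = np.full(len(returns), np.nan)
    for t in range(vol_window - 1, len(returns)):
        vol[t] = returns[t - vol_window + 1 : t + 1].std()

    valid = ~np.isnan(vol)
    threshold = np.quantile(vol[valid], quantile)
    high = np.zeros(len(returns), dtype=bool)
    high[valid] = vol[valid] >= threshold
    calm = valid & ~high
    return float(abs_err[high].mean()), float(abs_err[calm].mean())
```

The three input arrays must be aligned on the same test days.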

Key Results (Typical Observation)

  • Naive baseline performs strongly during stable periods
  • LSTM improves consistency during trending regimes
  • Transformer shows potential for longer horizons but is data-hungry
  • All models struggle during extreme volatility (earnings, macro events)

Error Analysis

  • Large errors correlate with:
    • Earnings announcements
    • Market regime shifts
    • Sudden volatility spikes
  • Highlights inherent uncertainty in financial time series

Limitations

  • Stock prices are inherently noisy and non-stationary
  • No external features (news, macro indicators)
  • Single-stock modeling (extendable to panel data)

Future Work

  • Multi-step forecasting
  • Multi-stock panel forecasting with symbol embeddings
  • Probabilistic forecasting (prediction intervals)
  • Volatility-aware loss functions

Disclaimer

This project is for educational and research purposes only.
It is not intended for financial trading or investment decisions.
