Wundernn Challenge Solution

Overview

This repository contains my solution for the Wundernn Challenge, a machine learning competition focused on predicting future market states from historical sequences. The challenge involves building a model that can forecast the next state vector in a sequence based on past observations.

Challenge Details

Objective

Predict the next market state vector based on a sequence of historical states using sequence modeling techniques.

Key Constraints

  • Sequence Length: Each sequence is exactly 1000 steps
  • Warm-up Period: First 100 steps (0-99) are for context building only
  • Scoring Range: Predictions are evaluated on steps 100-998
  • Evaluation Metric: R² (coefficient of determination) score
  • Data Structure: N anonymized numeric features describing market states
  • Independence: Each sequence is independent, so the model's internal state must be reset between sequences (see the scoring sketch below)
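
To make these constraints concrete, here is a hedged sketch of how the scoring loop could look. The reset_state method, the shape of the sequence arrays, and the exact prediction timing are assumptions for illustration, not the official harness.

import numpy as np
from sklearn.metrics import r2_score

SEQ_LEN, WARMUP = 1000, 100

def evaluate(model, sequences):
    # sequences: iterable of (SEQ_LEN, n_features) arrays, one per sequence
    preds, targets = [], []
    for seq in sequences:
        model.reset_state()                # hypothetical: sequences are independent
        for t in range(SEQ_LEN - 1):
            p = model.predict(seq[t])      # forecast state t+1 from states 0..t
            if t >= WARMUP:                # only steps 100-998 are scored
                preds.append(p)
                targets.append(seq[t + 1])
    per_feature = r2_score(np.array(targets), np.array(preds),
                           multioutput="raw_values")
    return per_feature, per_feature.mean()  # per-feature and mean R²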

Solution Architecture

Model: Transformer-based Sequence Predictor

The solution uses a Transformer architecture built from the following components:

Key Features:

  • Hybrid Positional Encoding: Combines fixed sinusoidal and learnable positional embeddings (see the sketch after this list)
  • Multi-Head Self-Attention: Captures complex dependencies in market sequences
  • Position-wise Attention: Adaptively weights position importance
  • Residual Connections: Improves gradient flow during training
  • Layer Normalization: Stabilizes training process
  • Dropout Regularization: Prevents overfitting
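
To make the hybrid encoding concrete, below is a minimal PyTorch sketch of a module that adds a fixed sinusoidal table and a learnable per-position embedding to its input. The class name and dimensions are illustrative, not taken from solution.py.

import math
import torch
import torch.nn as nn

class HybridPositionalEncoding(nn.Module):
    def __init__(self, d_model: int, max_len: int = 1000):
        super().__init__()
        pe = torch.zeros(max_len, d_model)
        pos = torch.arange(max_len).unsqueeze(1).float()
        div = torch.exp(torch.arange(0, d_model, 2).float()
                        * (-math.log(10000.0) / d_model))
        pe[:, 0::2] = torch.sin(pos * div)
        pe[:, 1::2] = torch.cos(pos * div)
        self.register_buffer("sinusoidal", pe)         # fixed sinusoidal table
        self.learned = nn.Embedding(max_len, d_model)  # learnable per-position offset

    def forward(self, x):                              # x: (batch, seq_len, d_model)
        t = torch.arange(x.size(1), device=x.device)
        return x + self.sinusoidal[: x.size(1)] + self.learned(t)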

Architecture Highlights:

  • Handles growing prediction contexts up to the full 1000-step sequence length
  • Efficient attention mechanisms for long sequence processing
  • Learnable position encodings that adapt to the data during training (a compact predictor sketch follows this list)
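
A compact sketch of how the full predictor could be assembled around that encoding, continuing the imports from the sketch above and using PyTorch's built-in encoder stack with a causal mask so each position attends only to its past. Layer sizes and counts are placeholders rather than the actual configuration.

class MarketTransformer(nn.Module):
    def __init__(self, n_features: int = 32, d_model: int = 128,
                 n_heads: int = 4, n_layers: int = 3, dropout: float = 0.1):
        super().__init__()
        self.input_proj = nn.Linear(n_features, d_model)
        self.pos_enc = HybridPositionalEncoding(d_model)    # from the sketch above
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           dropout=dropout, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, n_features)          # next-state regression

    def forward(self, x):                                   # x: (batch, seq, n_features)
        h = self.pos_enc(self.input_proj(x))
        mask = nn.Transformer.generate_square_subsequent_mask(x.size(1)).to(x.device)
        h = self.encoder(h, mask=mask)                      # causal: no peeking ahead
        return self.head(h)                                 # prediction for t+1 at each t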

Results

Overall Performance

Metric                     Value
Mean R² Score              0.396
Best Performing Feature    Feature 7 (R² = 0.536)
Worst Performing Feature   Feature 21 (R² = 0.259)
Total Features             32
Model Type                 Transformer

Top Performing Features

Feature    R² Score
7          0.536
14         0.521
6          0.519
13         0.502
11         0.501

Training Progress

[Figure: training and validation loss curves (results/loss_curve.png)]

The loss curves show steady improvement and well-behaved convergence, with minimal overfitting: validation loss stabilizes after approximately 20 epochs.

Predictions Analysis

[Figure: predictions visualization (results/predictions.png)]

The predictions visualization demonstrates the model's ability to capture market state trends and variations.

Residuals Analysis

[Figure: residuals distribution (results/residuals.png)]

The residuals plot shows the model's prediction errors across the validation set, indicating relatively balanced performance across different market states.

Model Advantages

  1. Scalability: Transformer architecture scales well with sequence length
  2. Parallelization: Self-attention allows for efficient parallel processing
  3. Long-Range Dependencies: Multi-head attention captures complex patterns across the entire sequence
  4. Adaptability: Learnable positional encodings adjust to specific market dynamics
  5. Regularization: Built-in mechanisms prevent overfitting on training data

Data Handling Strategy

Validation Set Creation

  • Sequences split by seq_ix, so train and validation contain disjoint sequences (see the sketch below)
  • 80/20 train/validation split
  • Whole sequences kept on one side of the split to prevent data leakage
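
An illustrative version of such a split, assuming the raw data arrives as a pandas DataFrame with a seq_ix column; the column name comes from the challenge, everything else here is an assumption.

import numpy as np
import pandas as pd

def split_by_sequence(df: pd.DataFrame, val_frac: float = 0.2, seed: int = 0):
    rng = np.random.default_rng(seed)
    seq_ids = df["seq_ix"].unique()
    rng.shuffle(seq_ids)                        # shuffle whole sequences, not rows
    n_val = int(len(seq_ids) * val_frac)        # 20% of sequences for validation
    val_ids = set(seq_ids[:n_val])
    val_mask = df["seq_ix"].isin(val_ids)
    return df[~val_mask], df[val_mask]          # every sequence stays on one side

train_df, val_df = split_by_sequence(raw_df)    # raw_df: hypothetical loaded data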

Preprocessing

  • Standardization of market state features (see the scaler sketch below)
  • Proper handling of sequence boundaries
  • Internal state reset for each new sequence
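
For the standardization step, a hedged sketch that fits the scaler on training sequences only, so validation statistics never leak in. feature_cols is a hypothetical list of the anonymized feature columns; train_df and val_df come from the split sketch above.

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
train_vals = scaler.fit_transform(train_df[feature_cols])  # fit on train only
val_vals = scaler.transform(val_df[feature_cols])          # reuse train statistics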

Training Configuration

  • Optimizer: Adam with adaptive learning rate
  • Loss Function: Mean Squared Error (MSE)
  • Batch Size: Optimized for GPU memory utilization
  • Epochs: Trained until convergence with early stopping (see the training sketch below)
  • Device: GPU acceleration enabled
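
Putting the configuration together, a minimal training-loop sketch with Adam, MSE, and patience-based early stopping, reusing the imports and hypothetical MarketTransformer from the sketches above. train_loader, val_loader, and every hyperparameter value are placeholders, not the notebook's exact settings.

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = MarketTransformer().to(device)          # hypothetical model from above
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()

best_val, bad_epochs, patience = float("inf"), 0, 5
for epoch in range(100):
    model.train()
    for x, y in train_loader:                   # y: next states, shifted by one step
        optimizer.zero_grad()
        loss = criterion(model(x.to(device)), y.to(device))
        loss.backward()
        optimizer.step()

    model.eval()
    with torch.no_grad():
        val_loss = sum(criterion(model(x.to(device)), y.to(device)).item()
                       for x, y in val_loader) / len(val_loader)
    if val_loss < best_val:                     # keep the best checkpoint
        best_val, bad_epochs = val_loss, 0
        torch.save(model.state_dict(), "best_model.pt")
    else:
        bad_epochs += 1
        if bad_epochs >= patience:              # early stopping
            break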

Files Structure

├── src/
│   └── solution.py           # Main model implementation
├── results/
│   ├── loss_curve.png        # Training/validation loss history
│   ├── predictions.png       # Prediction visualization
│   ├── residuals.png         # Residuals analysis
│   ├── feature_scores.csv    # Per-feature R² scores
│   ├── loss_history.csv      # Epoch-by-epoch loss history
│   ├── summary.csv           # Overall performance summary
│   └── solution_submission.zip
├── utils.py                  # Utility functions and DataPoint class
├── model_training.ipynb      # Training notebook and analysis
└── README.md                 # This file

Getting Started

Requirements

  • Python 3.8+
  • PyTorch
  • NumPy
  • Pandas
  • scikit-learn

Running the Solution

from src.solution import PredictionModel
from utils import DataPoint  # DataPoint is defined in utils.py

# Initialize the model
model = PredictionModel()

# data_point: a DataPoint instance supplied by the evaluation harness
prediction = model.predict(data_point)

Key Insights

  1. Feature Heterogeneity: Different features show varied predictability, with feature 7 being most predictable (R² = 0.536) and feature 21 being most challenging (R² = 0.259)

  2. Model Convergence: The training curve shows stable convergence without significant overfitting, indicating good generalization

  3. Sequence Context: The model effectively uses the warm-up period (steps 0-99) to build contextual representations for accurate future predictions

Submission

The solution is packaged as solution_submission.zip containing:

  • solution.py with the PredictionModel class
  • Proper implementation of the predict(data_point: DataPoint) method
  • All required dependencies properly imported

Future Improvements

  1. Ensemble Methods: Combine multiple model architectures
  2. Feature Engineering: Derive additional features from raw sequences
  3. Hyperparameter Tuning: Further optimize learning rates and architecture parameters
  4. Advanced Architectures: Explore state-of-the-art models like Mamba-2
  5. Temporal Augmentation: Apply data augmentation techniques for sequence data

Conclusion

This solution demonstrates the effectiveness of Transformer-based architectures for market state prediction tasks. The model successfully captures temporal dependencies in market sequences and achieves meaningful predictive performance across multiple features.
