An end-to-end algorithmic trading data pipeline for the Steam Community Market. This system ingests live order book depth, fetches historical price trends, resolves mixed-granularity time-series data, and engineers mathematical features for machine learning forecasting.
- `/scraper` (Data Engineering): bypasses Akamai protections to continuously extract real-time market depth and securely fetch authenticated historical USD baselines.
- `/ml` (Feature Engineering & Modeling): transforms raw datasets into time-aware ML features (Hype Decay, Volume Momentum, Temporal Seasonality) and prepares them for tree-based forecasting algorithms (XGBoost).
- `/data` (Storage): local storage for raw Bronze-layer CSVs and engineered Silver-layer master datasets (Git-ignored).
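
The feature names above are not defined in this README, so the sketch below is only one plausible reading of them. Every formula, function name, and default (`half_life_days`, the 3/7 window sizes, the sin/cos encoding) is an assumption for illustration and may differ from what `preprocess.py` actually computes.

```python
import math
from datetime import datetime


def hype_decay(days_since_event: float, half_life_days: float = 7.0) -> float:
    """Exponential decay weight: 1.0 at the event, 0.5 after one half-life (assumed form)."""
    return 0.5 ** (days_since_event / half_life_days)


def volume_momentum(volumes: list[float], short: int = 3, long: int = 7) -> float:
    """Ratio of the recent mean volume to the longer-window mean volume (assumed form)."""
    if len(volumes) < long:
        raise ValueError("need at least `long` observations")
    short_mean = sum(volumes[-short:]) / short
    long_mean = sum(volumes[-long:]) / long
    return short_mean / long_mean if long_mean else 0.0


def seasonality(ts: datetime) -> dict[str, float]:
    """Cyclical sin/cos encoding of hour-of-day and day-of-week."""
    hour_angle = 2 * math.pi * ts.hour / 24
    dow_angle = 2 * math.pi * ts.weekday() / 7
    return {
        "hour_sin": math.sin(hour_angle),
        "hour_cos": math.cos(hour_angle),
        "dow_sin": math.sin(dow_angle),
        "dow_cos": math.cos(dow_angle),
    }
```

Encoding time as sin/cos pairs keeps 23:00 and 00:00 numerically close. Tree-based models such as XGBoost can split on raw hour values, but they cannot see that adjacency without the encoding.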
- Live Order Book Scraping: real-time extraction of market depth (top 5 levels of supply and demand).
- Historical Data Fetching: downloads hourly/daily price and sales volume history.
- Dynamic Currency Conversion: automatically calculates the internal Steam cross-rate (e.g., UAH/USD) using anchor items to isolate data from macroeconomic fluctuations.
- Anti-Ban Architecture: bypasses Akamai protection using `curl_cffi` and implements randomized request delays (jitter).
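
The cross-rate idea from the currency conversion feature can be sketched as follows. This is a minimal illustration, not the project's code: it assumes each anchor item has been observed in both the local currency and USD at roughly the same time, and it takes the median ratio to damp outliers. The function names `cross_rate` and `to_usd` and the use of a median are hypothetical.

```python
from statistics import median


def cross_rate(anchor_prices: list[tuple[float, float]]) -> float:
    """Estimate local-currency units per USD from anchor items.

    Each tuple is (price_in_local_currency, price_in_usd) for the same item.
    """
    ratios = [local / usd for local, usd in anchor_prices if usd > 0]
    if not ratios:
        raise ValueError("no usable anchor prices")
    return median(ratios)


def to_usd(local_price: float, rate: float) -> float:
    """Convert a local-currency price to USD using the derived cross-rate."""
    return round(local_price / rate, 4)
```

For example, anchors priced at (410.0, 10.0), (82.0, 2.0), and (205.0, 5.0) all imply a rate of 41.0, so a 123.0 listing converts to 3.0 USD.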
- `scraper.py` - The main daemon for continuous live data collection (defaults to 10-minute intervals).
- `fetch_history.py` - A utility for one-time extraction of historical data into CSV format.
- `preprocess.py` - The feature engineering script that transforms raw CSVs into an ML-ready master dataset by calculating predictive indicators.
- `requirements.txt` - Python dependencies.
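
A polling loop with a 10-minute base interval and randomized jitter might look like the sketch below. It is an assumption-laden illustration rather than the contents of `scraper.py`: the names `jittered_delay` and `run_polling_loop`, the 20% jitter fraction, and the injectable `sleep` parameter are all hypothetical.

```python
import random
import time
from typing import Callable


def jittered_delay(base_seconds: float = 600.0, jitter_fraction: float = 0.2) -> float:
    """Return the base interval shifted by a random offset of up to +/- jitter_fraction."""
    offset = random.uniform(-jitter_fraction, jitter_fraction) * base_seconds
    return base_seconds + offset


def run_polling_loop(
    collect: Callable[[], None],
    cycles: int,
    base_seconds: float = 600.0,
    jitter_fraction: float = 0.2,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Call `collect` `cycles` times, sleeping a jittered interval between calls."""
    for i in range(cycles):
        collect()
        if i < cycles - 1:  # no need to wait after the final cycle
            sleep(jittered_delay(base_seconds, jitter_fraction))
```

Passing `sleep` as a parameter lets the loop be exercised in tests without actually waiting. A real daemon would loop indefinitely instead of for a fixed number of cycles.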
- Clone the repository: `git clone https://github.com/seanans/SteamQuant.git`, then `cd SteamQuant`.
- Install dependencies: `pip install -r requirements.txt`
- Create a `.env` file and add your secrets:
  - `STEAM_LOGIN_SECURE`
  - `STEAM_SESSION_ID`
  - `STEAM_COUNTRY`
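
As a rough sketch of how these settings could be read and validated at startup, the standard-library example below parses simple `KEY=VALUE` lines and fails fast when a key is missing. How the project itself loads `.env` is not stated here, so `parse_env_text` and `load_secrets` are hypothetical helpers, and the supported `.env` syntax is deliberately minimal.

```python
import os
from typing import Mapping

REQUIRED_KEYS = ("STEAM_LOGIN_SECURE", "STEAM_SESSION_ID", "STEAM_COUNTRY")


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_secrets(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Return the required secrets, raising if any are missing or empty."""
    missing = [k for k in REQUIRED_KEYS if not env.get(k)]
    if missing:
        raise RuntimeError(f"missing settings: {', '.join(missing)}")
    return {k: env[k] for k in REQUIRED_KEYS}
```

Failing at startup with the names of the missing keys is usually easier to debug than an authentication error surfacing on the first request.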
- Data Ingestion (Bronze Layer)
  - To run the live order book daemon: `cd scraper`, then `python scraper.py`
  - To fetch historical baselines: `cd scraper`, then `python fetch_history.py`
- Machine Learning Preprocessing (Silver Layer)
  - To resolve time granularity and generate the master ML dataset: `cd ml`, then `python preprocess.py`
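
One way to resolve mixed hourly and daily history into a single granularity is to collapse everything to calendar days. The sketch below assumes rows of `(timestamp, price, volume)` and uses a volume-weighted average price per day. The helper name `to_daily`, the row layout, and the weighting choice are all assumptions, and `preprocess.py` may take a different approach.

```python
from collections import defaultdict
from datetime import datetime


def to_daily(rows: list[tuple[datetime, float, int]]) -> list[tuple[str, float, int]]:
    """Collapse (timestamp, price, volume) rows into one row per calendar day.

    Price is the volume-weighted average and volume is the daily total.
    Days whose total volume is zero get a price of 0.0.
    """
    value_sum: dict[str, float] = defaultdict(float)
    volume_sum: dict[str, int] = defaultdict(int)
    for ts, price, volume in rows:
        day = ts.date().isoformat()
        value_sum[day] += price * volume
        volume_sum[day] += volume
    return [
        (day, value_sum[day] / volume_sum[day] if volume_sum[day] else 0.0, volume_sum[day])
        for day in sorted(volume_sum)
    ]
```

Weighting by volume keeps a thinly traded hour from shifting the daily price as much as a busy one would.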
This project interacts with undocumented Steam API endpoints. It is strongly recommended to use a secondary (smurf) account to avoid a Community Ban on your primary profile.