Train RL agents to trade. Can they beat Buy-and-Hold?
TensorTrade is an open-source Python framework for building, training, and evaluating reinforcement learning agents for algorithmic trading. The framework provides composable components for environments, action schemes, reward functions, and data feeds that can be combined to create custom trading systems.
```bash
# Requires Python 3.11 or 3.12
python3.12 -m venv tensortrade-env && source tensortrade-env/bin/activate
pip install -e .

# For training with Ray/RLlib (recommended)
pip install -r examples/requirements.txt

# Run training
python examples/training/train_simple.py
```

Tutorial Index – Start here for the complete learning curriculum.
- The Three Pillars – RL + Trading + Data concepts
- Architecture – How components work together
- Your First Run – Run and understand output
- Trading for RL Practitioners
- RL for Traders
- Common Failures – Critical pitfalls to avoid
- Full Introduction – New to both domains
- Action Schemes – BSH and order execution
- Reward Schemes – Why PBR works
- Observers & Feeds – Feature engineering
- First Training – Train with Ray RLlib
- Ray RLlib Deep Dive – Configuration options
- Optuna Optimization – Hyperparameter tuning
- Overfitting – Detection and prevention
- Commission Analysis – Key research findings
- Walk-Forward Validation – Proper evaluation
- Experiments Log – Full research documentation
- Environment Setup – Detailed installation guide
- API Reference
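The walk-forward validation tutorial above concerns evaluating on rolling out-of-sample windows rather than a single train/test split. As a minimal illustration of that idea (the function below is ours, not part of TensorTrade's API):

```python
def walk_forward_splits(n_samples, train_size, test_size, step=None):
    """Yield (train, test) index ranges that roll forward through the data."""
    step = step or test_size
    start = 0
    while start + train_size + test_size <= n_samples:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += step

# Example: 1000 candles, train on 600, test on the next 200, slide by 200.
for train, test in walk_forward_splits(1000, 600, 200):
    print(train.start, train.stop, "->", test.start, test.stop)
```

Each test window is strictly after its training window, so the agent is always evaluated on data it has never seen.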
We conducted extensive experiments training PPO agents on BTC/USD. Key results:
| Configuration | Test P&L | vs Buy-and-Hold |
|---|---|---|
| Agent (0% commission) | +$239 | +$594 |
| Agent (0.1% commission) | -$650 | -$295 |
| Buy-and-Hold | -$355 | – |
The agent demonstrates directional prediction capability at zero commission. The primary challenge is trading frequency: commission costs currently exceed prediction profits. See EXPERIMENTS.md for methodology and detailed analysis.
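The commission effect is simple arithmetic: at 0.1% per side, each round trip costs about 0.2% of notional, so frequent trading erodes any predictive edge. A back-of-the-envelope sketch (the position size and trade count below are illustrative, not taken from the experiments):

```python
def commission_drag(notional, round_trips, rate_per_side=0.001):
    """Total commission paid: two sides per round trip at the given rate."""
    return notional * rate_per_side * 2 * round_trips

# With a $10,000 position, 100 round trips at 0.1% per side cost
# $2,000 in commission -- 20% of the account, before any P&L.
print(commission_drag(10_000, 100))
```

This is why the 0% and 0.1% commission rows above diverge so sharply: the prediction edge is real but small relative to per-trade costs.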
```
┌───────────────────────────────────────────────────────────────────┐
│                            TradingEnv                             │
│                                                                   │
│  Observer ──────> Agent ──────> ActionScheme ──────> Portfolio    │
│  (features)      (policy)      (BSH/Orders)         (wallets)     │
│      ^                                                   │        │
│      └─────────── RewardScheme <─────────────────────────┘        │
│                      (PBR)                                        │
│                                                                   │
│  DataFeed ──────> Exchange ──────> Broker ──────> Trades          │
└───────────────────────────────────────────────────────────────────┘
```
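The ActionScheme box is where a discrete agent output becomes an order. BSH's premise is a single binary target: be fully in the asset or fully in cash, trading only when the target differs from the current position. A stripped-down sketch of that decision rule (our illustration, not the library's implementation):

```python
def bsh_order(action, position):
    """Map a binary BSH action to an order, trading only on a position flip.

    action: 0 = hold cash, 1 = hold asset; position: current state, same encoding.
    Returns "buy", "sell", or None (no trade).
    """
    if action == position:
        return None  # target matches current position: hold
    return "buy" if action == 1 else "sell"

print(bsh_order(1, 0))  # buy
print(bsh_order(1, 1))  # None -- already long, no order
print(bsh_order(0, 1))  # sell
```

The flip-only rule is what keeps BSH's order stream sparse; most steps produce no trade at all.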
| Component | Purpose | Default |
|---|---|---|
| ActionScheme | Converts agent output to orders | BSH (Buy/Sell/Hold) |
| RewardScheme | Computes learning signal | PBR (Position-Based Returns) |
| Observer | Generates observations | Windowed features |
| Portfolio | Manages wallets and positions | USD + BTC |
| Exchange | Simulates execution | Configurable commission |
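PBR's core idea is that the reward each step is the price change signed by the agent's current position: being long into a rising price is rewarded, being long into a falling one is penalized. A minimal sketch of that logic, independent of the library's actual implementation:

```python
def pbr_reward(position, prev_price, price):
    """Position-Based Returns: price change signed by position (+1 long, -1 out)."""
    return position * (price - prev_price)

print(pbr_reward(+1, 100.0, 101.0))  # long, price up: positive reward
print(pbr_reward(+1, 101.0, 100.0))  # long, price down: negative reward
print(pbr_reward(-1, 101.0, 100.0))  # out, price down: positive reward
```

Because the signal arrives every step rather than only at trade close, it gives the policy gradient far more to work with than sparse realized-P&L rewards.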
| Script | Description |
|---|---|
| `examples/training/train_simple.py` | Basic demo with wallet tracking |
| `examples/training/train_ray_long.py` | Distributed training with Ray RLlib |
| `examples/training/train_optuna.py` | Hyperparameter optimization |
| `examples/training/train_best.py` | Best configuration from experiments |
Requirements: Python 3.11 or 3.12
```bash
# Create environment
python3.12 -m venv tensortrade-env
source tensortrade-env/bin/activate  # Windows: tensortrade-env\Scripts\activate

# Install
pip install --upgrade pip
pip install -r requirements.txt
pip install -e .

# Verify
pytest tests/tensortrade/unit -v

# Training dependencies (optional)
pip install -r examples/requirements.txt
```

See ENVIRONMENT_SETUP.md for platform-specific instructions and troubleshooting.
```bash
make run-notebook  # Jupyter
make run-docs      # Documentation
make run-tests     # Test suite
```

```
tensortrade/
├── tensortrade/        # Core library
│   ├── env/            # Trading environments
│   ├── feed/           # Data pipeline
│   ├── oms/            # Order management
│   └── data/           # Data fetching
├── examples/
│   ├── training/       # Training scripts
│   └── notebooks/      # Jupyter tutorials
├── docs/
│   ├── tutorials/      # Learning curriculum
│   └── EXPERIMENTS.md  # Research log
└── tests/
```
| Issue | Solution |
|---|---|
| "No stream satisfies selector" | Update to v1.0.4-dev1+ |
| Ray installation fails | Run `pip install --upgrade pip` first |
| NumPy version conflict | `pip install "numpy>=1.26.4,<2.0"` |
| TensorFlow CUDA issues | `pip install "tensorflow[and-cuda]>=2.15.1"` |
See CONTRIBUTING.md for guidelines.
Priority areas:
- Trading frequency reduction (position sizing, holding periods)
- Commission-aware reward schemes
- Alternative action spaces
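One straightforward shape for a commission-aware reward scheme, sketched here as a proposal consistent with the priority list above (not an existing scheme in the repo): subtract the commission actually paid each step from the PBR signal, so the agent internalizes its own trading costs.

```python
def commission_aware_reward(position, prev_price, price,
                            traded_notional, rate=0.001):
    """PBR-style reward minus the commission paid this step.

    traded_notional is 0 when the agent held its position.
    """
    pbr = position * (price - prev_price)
    return pbr - traded_notional * rate

# A profitable move that required a trade is worth less
# than the same move with no trade.
print(commission_aware_reward(+1, 100.0, 102.0, traded_notional=100.0))
print(commission_aware_reward(+1, 100.0, 102.0, traded_notional=0.0))
```

Since the penalty appears only on steps where an order fills, the gradient directly discourages unnecessary position flips, which targets the trading-frequency problem identified in the experiments.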
