Collect historical price data for all 118 GMX V2 tokens with smart incremental updates.
- ✅ Smart Data Range Checking - Only fetches data you don't have
- ✅ Incremental Updates - 10-30x faster than full collection for daily updates
- ✅ Dual Data Sources - GMX API (recent ~6 months) + Chainlink (historical)
- ✅ 118 GMX V2 Tokens - 34 Chainlink markets + 84 non-Chainlink markets
- ✅ 6 Timeframes - 1min, 5min, 15min, 1h, 4h, 1d
- ✅ Freqtrade Compatible - Direct export to Feather format (OHLCV + funding rate + mark price)
- ✅ Gap Detection - Identifies data loss from GMX API's sliding window
- ✅ Concurrent Collection - Parallel symbol processing and RPC batch requests
# Install
poetry install
# Set environment variables
export JSON_RPC_ARBITRUM="https://arb-mainnet.g.alchemy.com/v2/YOUR_KEY"
export HYPERSYNC_API_TOKEN="your_token_here" # Free from https://envio.dev
# Initial collection (takes 4-5 hours for all tokens)
gmx_historical_data collect --default --output-dir ./data --concurrency 5
# Chainlink-only collection (34 markets, no HyperSync needed)
gmx_historical_data collect --full --chainlink-only --output-dir ./data --concurrency 5
# Daily updates (takes 10-30 minutes for all tokens - 10-30x faster!)
gmx_historical_data collect --update --output-dir ./data --concurrency 10

The collection system intelligently checks your existing data before fetching:
graph TD
A[Run --update] --> B{Check existing data}
B -->|No data| C[Fall back to --full mode]
B -->|Data is current| D[Skip all fetching ✅]
B -->|Gap exists| E[Fetch only the gap]
E --> F[Merge with existing data]
F --> G[Deduplicate by timestamp]
G --> H[Save to storage]
Example: Daily Update for ETH
# Check existing data
✓ 1h timeframe: Latest timestamp 2026-01-27 15:00:00
# Calculate gap
✓ Need data from 2026-01-27 16:00:00 to now (24 candles)
# Fetch incrementally
✓ 1h: Fetching from GMX API (incremental)
✓ 1h: 24 candles from GMX
# Merge and save
✓ 1h: Merged 8,760 total candles (added 24 new)
✓ Chainlink backfill not needed - data is complete
# Total time: 15 seconds (vs 5+ minutes for --full)
Key Optimizations:
- Skip current data: If your data is up-to-date, fetching is skipped entirely
- Fetch only gaps: Only requests data from your latest timestamp to now
- Smart Chainlink backfill: Only backfills if you need older historical data
- Efficient merging: Concatenates and deduplicates in-memory before saving
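The merge step above can be sketched with pandas. This is an illustrative sketch, not the collector's actual implementation; the real code persists the result to Parquet afterwards:

```python
import pandas as pd

def merge_incremental(existing: pd.DataFrame, new: pd.DataFrame) -> pd.DataFrame:
    """Concatenate newly fetched candles onto existing ones, drop duplicate
    timestamps (keeping the freshly fetched row), and re-sort by time."""
    merged = pd.concat([existing, new], ignore_index=True)
    merged = merged.drop_duplicates(subset="timestamp", keep="last")
    return merged.sort_values("timestamp").reset_index(drop=True)

existing = pd.DataFrame({"timestamp": [1, 2, 3], "close": [10.0, 11.0, 12.0]})
new = pd.DataFrame({"timestamp": [3, 4], "close": [12.5, 13.0]})
merged = merge_incremental(existing, new)
# The overlapping candle at timestamp 3 is replaced by the newer fetch.
```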
For the 84 non-Chainlink tokens, the collector now uses incremental oracle events collection to dramatically reduce bandwidth and collection time.
- Data Coverage Analysis - Analyzes existing parquet files to determine what oracle events are missing
- Block-Timestamp Cache - Efficiently converts timestamps to block numbers without RPC calls
- Gap-Only Fetching - Only fetches oracle events for missing block ranges (50-99% bandwidth savings)
- HyperSync Key Rotation - Automatically rotates between multiple API keys to handle rate limits
- Progressive Retry - Exponential backoff for transient errors (2s → 60s delays)
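A block-timestamp cache of this kind can be modelled as a binary search plus linear interpolation over sampled (timestamp, block) pairs. The sketch below is illustrative only; the collector's exact sampling and lookup code may differ:

```python
import bisect

def timestamp_to_block(samples: list[tuple[int, int]], target_ts: int) -> int:
    """samples: sorted (unix_timestamp, block_number) pairs taken once at
    cache-build time. Later lookups interpolate between the two nearest
    samples, so no further RPC calls are needed."""
    timestamps = [ts for ts, _ in samples]
    i = max(bisect.bisect_right(timestamps, target_ts) - 1, 0)
    ts0, b0 = samples[i]
    if i + 1 < len(samples):
        ts1, b1 = samples[i + 1]
        if ts1 != ts0:
            # Interpolate block number assuming a locally constant block time.
            frac = (target_ts - ts0) / (ts1 - ts0)
            return int(b0 + frac * (b1 - b0))
    return b0

# Two samples one hour apart; a timestamp halfway between maps halfway in blocks.
samples = [(1_700_000_000, 150_000_000), (1_700_003_600, 150_014_400)]
blk = timestamp_to_block(samples, 1_700_001_800)
```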
# Set up multiple HyperSync keys for rate limit protection
export HYPERSYNC_API_TOKEN="key1 key2 key3" # Space-separated, free from https://envio.dev
# First run: Full collection from genesis
gmx_historical_data collect --symbol SUI --output-dir ./data
# Output:
# ✓ Building block-timestamp cache...
# ✓ Cache built with 30,000 samples
# ✓ No existing data found for SUI
# ✓ Fetching oracle events: blocks 180,000,000 to 210,000,000 (30M blocks)
# ✓ Collected 50,000 oracle events (~10 minutes)
# Second run: Incremental update (only new data)
gmx_historical_data collect --symbol SUI --output-dir ./data
# Output:
# ✓ Using cached block-timestamp data
# ✓ Analyzing existing coverage...
# ✓ Found data covering 6 timeframes (earliest: 2024-10-01)
# ✓ Fetching oracle events: blocks 180,000,000 to 195,001,000 (15M blocks)
# ✓ Collected 25,000 new oracle events (~5 minutes - 50% faster!)

# Configure multiple keys
export HYPERSYNC_API_TOKEN="key1 key2 key3"
# Automatic rotation on rate limits
gmx_historical_data collect --default --output-dir ./data --concurrency 10
# Logs show rotation:
# [INFO] Initialized HyperSyncKeyRotator with 3 API key(s)
# [WARN] Rate limit detected on key #1 - rotating to key #2
# [SUCCESS] Collection succeeded after key rotation

| Scenario | Before (Full) | After (Incremental) | Improvement |
|---|---|---|---|
| Daily update (1 symbol) | ~10 min, 30M blocks | ~5 min, 15M blocks | 50% faster, 50% less bandwidth |
| Hourly update (1 symbol) | ~10 min, 30M blocks | ~1 min, 500k blocks | 90% faster, 98% less bandwidth |
| Daily update (84 symbols) | ~14 hours | ~1-2 hours | 85-90% faster |
| Bandwidth (daily) | ~500 MB/symbol | ~8 MB/symbol | 98% reduction |
For a detailed guide covering troubleshooting, advanced configuration, and FAQs, see the
Incremental Collection Guide.
Extract historical funding rates for all GMX V2 perpetual markets. Funding rates
use 30-decimal fixed-point precision (fundingFactorPerSecond / 10^30).
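As a quick illustration of the fixed-point conversion (the raw value below is made up for the example):

```python
RATE_PRECISION = 10**30  # GMX V2 stores rates as 30-decimal fixed-point integers

def to_decimal_rate(funding_factor_per_second: int) -> float:
    """Convert a raw fundingFactorPerSecond integer to a decimal per-second rate."""
    return funding_factor_per_second / RATE_PRECISION

raw = 1_200_000_000_000_000_000_000   # 1.2e21 raw units (illustrative value)
per_second = to_decimal_rate(raw)     # 1.2e-9 per second
hourly = per_second * 3600            # per-hour rate
annualized = hourly * 8760            # per-year rate, ~3.78%
```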
GMX V2 changed how it exposes funding rates in the V2.2 upgrade (August 2025):

- Pre-V2.2 (Nov 2023 - Aug 2025): GMX stored `savedFundingFactorPerSecond` as a signed integer in the DataStore contract (state, not events). No on-chain event ever emitted the rate for this period. The only way to read historical values is via `eth_call` to the DataStore at specific past blocks, which requires an archive node. HyperSync cannot help here because it only indexes event logs.
- V2.2+ (Aug 2025 - present): GMX introduced the `Funding` event on the `EventEmitter` contract, which emits `fundingFactorPerSecond` on every update. HyperSync can stream these events all the way back to V2.2 genesis, with no RPC needed.
- `FundingFeeAmountPerSizeUpdated` events exist across the full V2 history, but they carry cumulative fee accumulators, not the rate itself. They include open-interest amplification (~13×) and token pricing, making direct rate extraction impractical. We use them solely to detect direction (which side pays).
| Script | Period | Data | Source |
|---|---|---|---|
| `extract_funding_datastore.py` | Nov 2023 - Aug 2025 | Signed rate (correct direction) | Archive RPC + batched bytecode `eth_call` |
| `extract_funding_factor.py` | Aug 2025 - present | Rate magnitude (unsigned) | HyperSync events |
| `extract_funding_fee_per_size.py` | Aug 2023 - present | Direction (who pays) | HyperSync events |
| `extract_unified_funding.py` | Combined | Direction-corrected rate | All sources |
The unified script orchestrates all sources and merges the results into a single
direction-corrected `1h.parquet` per symbol.
# Install dependencies
poetry install
# Default: HyperSync phases + merge (fast, no archive RPC needed)
poetry run python scripts/extract_unified_funding.py
# Include pre-V2.2 DataStore phase (~200 batched HTTP requests, requires archive node)
export JSON_RPC_ARBITRUM=<archive-node-url>
poetry run python scripts/extract_unified_funding.py --include-datastore
# Single market
poetry run python scripts/extract_unified_funding.py --market ETH/USD
# Incremental update (resumes from per-phase checkpoints)
poetry run python scripts/extract_unified_funding.py --resume
# Output as feather instead of parquet
poetry run python scripts/extract_unified_funding.py --output feather
# Export to FreqTrade feather format (OHLCV with rate in 'open')
poetry run python scripts/extract_unified_funding.py --feather-dir ./user_data/data
# Both: feather merge + FreqTrade export
poetry run python scripts/extract_unified_funding.py --output feather --feather-dir ./user_data/data
# Merge only (skip extraction, combine existing data)
poetry run python scripts/extract_unified_funding.py --merge-only

# Full extraction (HyperSync + merge)
make funding-unified
# With DataStore phase (slow)
make funding-unified INCLUDE_DATASTORE=1
# Incremental update
make funding-unified-resume
# Single market
make funding-unified MARKET="ETH/USD"
# Merge only
make funding-unified-merge

- Phase 2 - Funding Factor (HyperSync): Streams `Funding` events for the `fundingFactorPerSecond` magnitude (V2.2+, Aug 2025 onwards)
- Phase 3 - Direction (HyperSync): Streams `FundingFeeAmountPerSizeUpdated` events, compares long vs short delta sums to determine who pays
- Phase 1 - DataStore (opt-in): Reads signed `savedFundingFactorPerSecond` from the DataStore contract via batched `eth_call` at hourly intervals (pre-V2.2)
- Merge: Combines all sources, applies direction correction to V2.2+ data, deduplicates, and writes a unified `1h.parquet` (or `1h.feather` with `--output feather`)
The DataStore phase uses fully-batched JSON-RPC requests across two prefetch phases, so all network I/O finishes before any record is written:
| Phase | Method | Batch size | Requests for full run |
|---|---|---|---|
| Timestamps | `eth_getBlockByNumber` batch | 200/request | ~87 HTTP requests |
| DataStore reads | `eth_call` batch | 150/request | ~116 HTTP requests |
| Total | | | ~203 HTTP requests (vs. ~17,000 sequential) |
How the `eth_call` batching works:
Each HTTP request contains up to 150 JSON-RPC `eth_call` items. Each `eth_call` executes
GMXFundingRateBatchRequest, a never-deployed Solidity contract whose constructor
reads all ~124 market rates from the DataStore in a single EVM execution (computes
keccak keys on-chain, calls getInt for every market, returns int256[]). The contract
bytecode is embedded directly in the Python script; no deployment or Multicall3 dependency.

1 HTTP POST → 150 × eth_call → each call reads all ~124 markets in one EVM run
   └─ JSON-RPC batch        └─ GMXFundingRateBatchRequest bytecode
Requests are distributed round-robin across all providers in JSON_RPC_ARBITRUM.
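Building such a batch can be sketched as below. This is a hypothetical illustration, not the script's actual code: the `BYTECODE` value is a placeholder (the real constructor bytecode is embedded in the script), and sending `data` with no `to` address makes most nodes execute the payload as a contract creation and return the constructor's return data:

```python
import json

def build_eth_call_batch(bytecode_hex: str, block_numbers: list[int]) -> list[dict]:
    """One JSON-RPC batch: each item is an eth_call whose `data` is the
    constructor bytecode of the never-deployed batch-reader contract,
    executed against a specific historical block."""
    return [
        {
            "jsonrpc": "2.0",
            "id": i,
            "method": "eth_call",
            "params": [{"data": bytecode_hex}, hex(block)],
        }
        for i, block in enumerate(block_numbers)
    ]

BYTECODE = "0x"  # placeholder; the real bytecode is embedded in the Python script
batch = build_eth_call_batch(BYTECODE, [180_000_000, 180_003_600])
payload = json.dumps(batch)  # one HTTP POST body covering two historical reads
```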
# Full run: ~203 HTTP requests, typically 2-5 minutes on an archive node
export JSON_RPC_ARBITRUM="https://rpc1.example.com https://rpc2.example.com"
poetry run python scripts/extract_funding_datastore.py
# Expected output:
# HTTP batches: 87 timestamp + 116 DataStore = 203 total
# Timestamps ready in ~65s
# DataStore reads ready in ~90s

Archive node required. Public RPC endpoints (Alchemy free tier, Infura) usually rate-limit
`eth_call` at historical blocks. A dedicated Arbitrum archive node (e.g. Alchemy Growth, QuickNode) handles sustained parallel load without 429s.
data/funding/arbitrum/
├── raw/
│   ├── funding/{SYMBOL}/partition=0/data.parquet   # Raw Funding events
│   └── fee_per_size/{SYMBOL}/data.parquet          # Raw fee-per-size events
├── rates/
│   ├── {SYMBOL}/1h.parquet             # Unified hourly rates (merged)
│   ├── {SYMBOL}/1h_factor.parquet      # HyperSync-only rates
│   └── {SYMBOL}/1h_datastore.parquet   # DataStore-only rates
├── direction/
│   └── {SYMBOL}/1h.parquet             # Hourly direction (who pays)
└── checkpoints/
    ├── funding_factor_checkpoint.json
    ├── fee_per_size_checkpoint.json
    └── funding_datastore_checkpoint.json
| Column | Type | Description |
|---|---|---|
| `timestamp` | datetime[ms, UTC] | Hour start |
| `funding_rate` | float64 | Mean per-second rate |
| `funding_rate_hourly` | float64 | rate * 3600 (used by FreqTrade) |
| `funding_rate_annualized` | float64 | rate * 3600 * 8760 |
| `longs_pay_shorts` | bool | True = longs pay (direction-corrected) |
| `funding_fee_long` | float64 | Signed hourly rate for longs |
| `funding_fee_short` | float64 | Signed hourly rate for shorts |
| `update_count` | uint32 | Events in hour |
| `source` | string | "datastore" or "hypersync" |
| Metric | Value |
|---|---|
| Per-second rate | ~1e-10 to ~1e-9 |
| Hourly rate | ~3.6e-7 to ~3.6e-6 |
| Annualized | ~0.5% to ~10% |
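These magnitudes follow directly from the fixed-point scale; a quick sanity check using an illustrative mid-range value:

```python
# A per-second rate in the middle of the typical band
per_second = 5e-10
hourly = per_second * 3600      # 1.8e-6, inside the typical hourly band
annualized = hourly * 8760      # ~0.0158, i.e. ~1.6% per year
```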
0 2 * * * cd /path/to/gmx_historical_data && \
poetry run python scripts/extract_unified_funding.py --resume >> logs/unified_funding.log 2>&1

The three source scripts can still be run independently:
# Funding Factor only (V2.2+)
poetry run python scripts/extract_funding_factor.py --resume
# Direction only
poetry run python scripts/extract_funding_fee_per_size.py --resume
# DataStore only (pre-V2.2, requires archive RPC)
poetry run python scripts/extract_funding_datastore.py --resume

HyperSync 500 errors: Both HyperSync scripts retry up to 15 times with exponential backoff (capped at 2 minutes). The unified script additionally retries each phase up to 3 times if it fails entirely.
Unknown markets: Market info is fetched dynamically from the GMX REST API and
cached for 24 hours in ~/.cache/gmx_historical_data/markets_arbitrum.json. If a
newly listed market is missing, run with --refresh-markets to force a cache refresh.
Rates look wrong: Verify the raw funding_factor_per_second value.
Divide by 10^30 for the per-second decimal rate. Multiply by 3600 * 8760
for annualized. Typical ETH/USD annual rate is ~2-5%.
Extract the market-level borrowingFactorPerSecond from GMX V2 Borrowing events
using HyperSync. This is the pre-computed on-chain borrowing rate: the rate
charged to the dominant side (larger OI) and paid to LPs.
Why this matters: The Dune Analytics query for GMX borrowing rates uses complex Method 1/Method 2 calculations from pool parameters. The on-chain
`Borrowing` event gives the actual rate the protocol uses, which is simpler and more accurate.
# Full historical extraction
poetry run python scripts/extract_borrowing_factor.py --from-block 120000000
# Quick smoke test (recent blocks, JSON output)
poetry run python scripts/extract_borrowing_factor.py \
--from-block 433300000 --to-block 433400000 --output json
# Incremental mode (resumes from checkpoint)
poetry run python scripts/extract_borrowing_factor.py --resume
# Background daemon
poetry run python scripts/extract_borrowing_factor.py --resume --background
# Filter by market
poetry run python scripts/extract_borrowing_factor.py --resume --market "ETH/USD"

- Queries HyperSync for `Borrowing` events (topic1 = `keccak("Borrowing")`) from the GMX V2 EventEmitter contract
- Decodes `borrowingFactorPerSecond` as a 30-decimal fixed-point integer (divide by 10^30 for the per-second decimal rate)
- Writes raw events to per-symbol Parquet files
- Aggregates to hourly rates with derived columns (hourly, annualized)
data/borrowing/arbitrum/
├── raw/borrowing/              # Raw Borrowing events
│   ├── ETH/partition=0/data.parquet
│   ├── BTC/partition=0/data.parquet
│   └── .../
├── rates/                      # Hourly aggregated rates
│   ├── ETH/1h.parquet
│   ├── BTC/1h.parquet
│   └── .../
└── checkpoints/
    └── borrowing_factor_checkpoint.json   # Resume state
Hourly rates schema (rates/{SYMBOL}/1h.parquet):
| Column | Type | Description |
|---|---|---|
| `timestamp` | datetime[ms, UTC] | Hour start |
| `borrowing_rate` | float64 | Mean per-second rate |
| `borrowing_rate_hourly` | float64 | rate * 3600 |
| `borrowing_rate_annualized` | float64 | rate * 3600 * 8760 |
| `update_count` | uint32 | Events in hour |
| `symbol` | string | Human-readable symbol |
| `market` | string | Market contract address |
Typical borrowingFactorPerSecond values (annualized):
| Market | Typical Annual Rate |
|---|---|
| BTC/USD | ~1-2% |
| ETH/USD | ~4-7% |
| SOL/USD | ~1-3% |
| Illiquid tokens | ~10-60%+ |
To compute the net cost of holding a position:
net_rate = funding_rate + borrowing_rate
Both scripts output hourly rates in the same format, so they can be joined on
(timestamp, symbol) in pandas/polars.
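With toy single-symbol data, the join looks like this (column names follow the hourly-rate schemas above; the numbers are illustrative):

```python
import pandas as pd

def net_holding_cost(funding: pd.DataFrame, borrowing: pd.DataFrame) -> pd.DataFrame:
    """Join the two hourly series on timestamp and sum the rates to get
    the net hourly cost of holding a position."""
    df = funding.merge(borrowing, on="timestamp", how="inner")
    df["net_rate_hourly"] = df["funding_rate_hourly"] + df["borrowing_rate_hourly"]
    return df

funding = pd.DataFrame({"timestamp": [0, 3600], "funding_rate_hourly": [1e-6, -2e-6]})
borrowing = pd.DataFrame({"timestamp": [0, 3600], "borrowing_rate_hourly": [3e-6, 3e-6]})
out = net_holding_cost(funding, borrowing)
# Hour 0: 1e-6 + 3e-6 = 4e-6; hour 1: -2e-6 + 3e-6 = 1e-6
```

For multi-symbol frames, add `"symbol"` to the merge keys.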
The easiest way to collect GMX data is using Docker Compose with 3 pre-configured options:
# 1. Copy environment template
cp .env.example .env
# 2. Edit .env with your API keys
nano .env
# 3. Choose your collection option:
# Option 1: Collect everything (all 118 markets)
docker-compose --profile all up gmx-collect-all
# Option 2: Chainlink feed tokens only (34 markets)
docker-compose --profile chainlink up gmx-collect-chainlink-only
# Option 3: Custom symbols (user configurable)
SYMBOLS=ETH,BTC,SUI docker-compose --profile custom up gmx-collect-custom

- ✅ No Python/Rust installation needed - Everything runs in containers
- ✅ 3 pre-configured profiles - All markets, Chainlink-only, or custom
- ✅ Incremental updates - `--profile update` for daily updates
- ✅ Data verification - `--profile verify` to check data quality
- ✅ Persistent storage - Data and logs saved to `./data` and `./logs`
- ✅ HyperSync key rotation - Automatic rotation for rate limit protection
| Profile | Markets | Duration (First Run) | Use Case |
|---|---|---|---|
| `all` | 118 | 6-8 hours | Complete dataset |
| `chainlink` | 34 | 2-3 hours | Major tokens only |
| `custom` | User defined | Varies | Specific tokens |
| `update` | All existing | 10-30 minutes | Daily updates |
| `verify` | N/A | 1-2 minutes | Quality check |
See Docker Usage Guide for detailed documentation.
Prerequisites: Python 3.11 or 3.12 (recommended), Rust toolchain (for hypersync)
# Install Rust (if needed)
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
# Install dependencies
poetry install

# Initial collection: All 118 tokens (ETA 4-5 hours)
gmx_historical_data collect --default --output-dir ./data --concurrency 5
# Initial collection: Single token (ETA 2-3 minutes)
gmx_historical_data collect --default --symbol ETH --output-dir ./data
# Daily incremental update: Fast! (ETA 10-30 seconds per token)
gmx_historical_data collect --update --output-dir ./data --concurrency 10
# Verify data quality
gmx_historical_data verify --output-dir ./data

The --default flag uses --full mode to fetch recent data from the GMX API (~6 months) and backfill historical data from Chainlink oracles where available. Note: only around 34 tokens have Chainlink price feeds as of this writing.
Tip: After initial collection with --default, use --update mode for daily/hourly updates. It's 10-30x faster because it only fetches new data.
Exports OHLCV candles, funding rates, and mark prices in FreqTrade's feather format.
The output format is CCXT-compatible (datetime64[ms, UTC] timestamps, same column
layout as Binance/Hyperliquid).
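A quick way to sanity-check an exported frame against that layout. This assumes FreqTrade's standard `date` + OHLCV column names; the frame below is constructed by hand for illustration rather than read from an exported file:

```python
import pandas as pd

EXPECTED_COLUMNS = ["date", "open", "high", "low", "close", "volume"]

def check_freqtrade_frame(df: pd.DataFrame) -> None:
    """Assert a frame matches the CCXT/FreqTrade layout: millisecond-precision
    UTC timestamps and the standard OHLCV columns, in order."""
    assert list(df.columns) == EXPECTED_COLUMNS
    assert str(df["date"].dtype) == "datetime64[ms, UTC]"

dates = pd.to_datetime([1_700_000_000_000], unit="ms", utc=True).astype("datetime64[ms, UTC]")
df = pd.DataFrame({
    "date": dates,
    "open": [2000.0], "high": [2010.0], "low": [1990.0],
    "close": [2005.0], "volume": [123.4],
})
check_freqtrade_frame(df)  # raises AssertionError on a malformed export
```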
# Export all data types (OHLCV + funding rate + mark price)
gmx_historical_data export-freqtrade --data-dir ./data --output-dir ./freqtrade_data
# Export specific symbols/timeframes
gmx_historical_data export-freqtrade --data-dir ./data --symbol ETH --symbol BTC --timeframe 1h

Output structure:
freqtrade_data/gmx/futures/
├── ETH_USDC_USDC-1h-futures.feather        # OHLCV candles
├── ETH_USDC_USDC-1h-funding_rate.feather   # Funding rate (open=rate, others=0)
├── ETH_USDC_USDC-1h-mark.feather           # Mark price (OHLCV proxy)
├── BTC_USDC_USDC-1h-futures.feather
├── BTC_USDC_USDC-1h-funding_rate.feather
├── BTC_USDC_USDC-1h-mark.feather
└── ...
Funding rate data is read from data/funding/arbitrum/rates/{SYMBOL}/{timeframe}.parquet
(generated by collect-funding).
GMX requires the gmx-ccxt-freqtrade monkeypatch; use the included freqtrade-gmx wrapper.
Important: Install `freqtrade` and `web3-ethereum-defi` in a separate isolated environment. Freqtrade requires `pandas<3.0`, which conflicts with this project's `pandas>=3.0`. Even though `web3-ethereum-defi` is a project dependency, a clean install in an isolated venv avoids dependency conflicts.
# Create isolated environment for freqtrade (not .venv - that's for poetry)
python -m venv freqtrade-venv && source freqtrade-venv/bin/activate
pip install "freqtrade>=2025.11" "web3-ethereum-defi[web3v7,ccxt]>=0.38" plotly

# Copy example config and customize
cp configs/adxmomentum_gmx.example.json configs/adxmomentum_gmx.json
# Edit configs/adxmomentum_gmx.json with your settings

N.B.: This step is essential: the collector stores data as Parquet, but Freqtrade expects Feather. The export below writes OHLCV candles, funding rates, and mark prices.
# Export data to freqtrade format (OHLCV + funding rate + mark price)
gmx_historical_data export-freqtrade --data-dir ./data --output-dir ./user_data/data \
--symbol BTC --symbol ETH --timeframe 1h

# Run backtest with GMX support
./freqtrade-gmx backtesting --config configs/adxmomentum_gmx.json \
--strategy ADXMomentum --timerange 20210713-

You can leave the timerange blank; Freqtrade will then backtest over the full range of available data.
Result for strategy ADXMomentum
BACKTESTING REPORT
| Pair          | Trades | Avg Profit % | Tot Profit USDC | Tot Profit % | Avg Duration     | Win Draw Loss Win%  |
|---------------|--------|--------------|-----------------|--------------|------------------|---------------------|
| BTC/USDC:USDC |    366 |         0.16 |           8.590 |         8.59 | 2 days, 4:10:00  | 145   0   221  39.6 |
| ETH/USDC:USDC |    394 |         0.07 |           4.103 |          4.1 | 1 day, 22:11:00  | 170   0   224  43.1 |
| TOTAL         |    760 |         0.11 |          12.693 |        12.69 | 2 days, 1:04:00  | 315   0   445  41.4 |
LEFT OPEN TRADES REPORT
| Pair  | Trades | Avg Profit % | Tot Profit USDC | Tot Profit % | Avg Duration | Win Draw Loss Win% |
|-------|--------|--------------|-----------------|--------------|--------------|--------------------|
| TOTAL |      0 |          0.0 |           0.000 |          0.0 | 0:00         | 0    0    0    0   |
ENTER TAG STATS
| Enter Tag | Entries | Avg Profit % | Tot Profit USDC | Tot Profit % | Avg Duration     | Win Draw Loss Win%  |
|-----------|---------|--------------|-----------------|--------------|------------------|---------------------|
| OTHER     |     760 |         0.11 |          12.693 |        12.69 | 2 days, 1:04:00  | 315   0   445  41.4 |
| TOTAL     |     760 |         0.11 |          12.693 |        12.69 | 2 days, 1:04:00  | 315   0   445  41.4 |
EXIT REASON STATS
| Exit Reason | Exits | Avg Profit % | Tot Profit USDC | Tot Profit % | Avg Duration     | Win Draw Loss Win%  |
|-------------|-------|--------------|-----------------|--------------|------------------|---------------------|
| roi         |   270 |         5.02 |         203.248 |       203.25 | 2 days, 2:09:00  | 270   0     0   100 |
| exit_signal |   490 |        -2.59 |        -190.555 |      -190.56 | 2 days, 0:28:00  |  45   0   445   9.2 |
| TOTAL       |   760 |         0.11 |          12.693 |        12.69 | 2 days, 1:04:00  | 315   0   445  41.4 |
MIXED TAG STATS
| Enter Tag | Exit Reason | Trades | Avg Profit % | Tot Profit USDC | Tot Profit % | Avg Duration     | Win Draw Loss Win%  |
|-----------|-------------|--------|--------------|-----------------|--------------|------------------|---------------------|
|           | roi         |    270 |         5.02 |         203.248 |       203.25 | 2 days, 2:09:00  | 270   0     0   100 |
|           | exit_signal |    490 |        -2.59 |        -190.555 |      -190.56 | 2 days, 0:28:00  |  45   0   445   9.2 |
| TOTAL     |             |    760 |         0.11 |          12.693 |        12.69 | 2 days, 1:04:00  | 315   0   445  41.4 |
SUMMARY METRICS
| Metric                          | Value                            |
|---------------------------------|----------------------------------|
| Backtesting from                | 2021-07-14 11:00:00              |
| Backtesting to                  | 2026-01-27 15:00:00              |
| Trading Mode                    | Isolated Futures                 |
| Max open trades                 | 2                                |
| Total/Daily Avg Trades          | 760 / 0.46                       |
| Starting balance                | 100 USDC                         |
| Final balance                   | 112.693 USDC                     |
| Absolute profit                 | 12.693 USDC                      |
| Total profit %                  | 12.69%                           |
| CAGR %                          | 2.67%                            |
| Sortino                         | 0.49                             |
| Sharpe                          | 0.24                             |
| Calmar                          | 0.85                             |
| SQN                             | 0.76                             |
| Profit factor                   | 1.06                             |
| Expectancy (Ratio)              | 0.02 (0.04)                      |
| Avg. daily profit               | 0.008 USDC                       |
| Avg. stake amount               | 15 USDC                          |
| Total trade volume              | 22839.886 USDC                   |
| Best Pair                       | BTC/USDC:USDC 8.59%              |
| Worst Pair                      | ETH/USDC:USDC 4.10%              |
| Best trade                      | BTC/USDC:USDC 6.57%              |
| Worst trade                     | ETH/USDC:USDC -10.70%            |
| Best day                        | 3 USDC                           |
| Worst day                       | -2.084 USDC                      |
| Days win/draw/lose              | 227 / 1105 / 318                 |
| Min/Max/Avg. Duration Winners   | 0d 00:00 / 12d 07:00 / 2d 06:44  |
| Min/Max/Avg. Duration Losers    | 0d 01:00 / 8d 19:00 / 1d 21:03   |
| Max Consecutive Wins / Loss     | 9 / 13                           |
| Rejected Entry signals          | 0                                |
| Entry/Exit Timeouts             | 0 / 0                            |
| Min balance                     | 94.531 USDC                      |
| Max balance                     | 119.956 USDC                     |
| Max % of account underwater     | 17.25%                           |
| Absolute drawdown               | 19.701 USDC (17.25%)             |
| Drawdown duration               | 432 days 09:00:00                |
| Profit at drawdown start        | 14.232 USDC                      |
| Profit at drawdown end          | -5.469 USDC                      |
| Drawdown start                  | 2021-10-21 09:00:00              |
| Drawdown end                    | 2022-12-27 18:00:00              |
| Market change                   | 110.57%                          |
Backtested 2021-07-14 11:00:00 -> 2026-01-27 15:00:00 | Max open trades : 2
STRATEGY SUMMARY
| Strategy    | Trades | Avg Profit % | Tot Profit USDC | Tot Profit % | Avg Duration     | Win Draw Loss Win%  | Drawdown             |
|-------------|--------|--------------|-----------------|--------------|------------------|---------------------|----------------------|
| ADXMomentum |    760 |         0.11 |          12.693 |        12.69 | 2 days, 1:04:00  | 315   0   445  41.4 | 19.701 USDC (17.25%) |
~4.5 years of backtesting data with 760 trades using ADXMomentum strategy.
Example strategy: examples/strategies/ADXMomentum.py
Generate interactive HTML charts for analysis:
# Plot profit/loss over time
./freqtrade-gmx plot-profit --config configs/adxmomentum_gmx.example.json --auto-open
# Plot individual pair with indicators
./freqtrade-gmx plot-dataframe --config configs/adxmomentum_gmx.example.json \
--strategy ADXMomentum -p BTC/USDC:USDC --auto-open

Charts are saved to user_data/plot/:
- freqtrade-profit-plot.html - Cumulative profit chart
- freqtrade-plot-BTC_USDC_USDC-1h.html - Price chart with indicators and trade markers
data/
├── candles/arbitrum/{SYMBOL}/          # OHLCV candle data
│   ├── 1m.parquet
│   ├── 1h.parquet
│   └── 1d.parquet
├── funding/arbitrum/
│   ├── raw/
│   │   ├── funding/{SYMBOL}/partition=0/data.parquet   # Raw Funding events
│   │   └── fee_per_size/{SYMBOL}/data.parquet          # Raw fee-per-size events
│   ├── rates/{SYMBOL}/
│   │   ├── 1h.parquet             # Unified hourly rates (merged)
│   │   ├── 1h_factor.parquet      # HyperSync-only rates
│   │   └── 1h_datastore.parquet   # DataStore-only rates
│   ├── direction/{SYMBOL}/1h.parquet   # Hourly direction (who pays)
│   └── checkpoints/
│       ├── funding_factor_checkpoint.json
│       ├── fee_per_size_checkpoint.json
│       └── funding_datastore_checkpoint.json
├── borrowing/arbitrum/
│   ├── raw/borrowing/{SYMBOL}/partition=0/data.parquet   # Raw Borrowing events
│   ├── rates/{SYMBOL}/1h.parquet                         # Hourly aggregated borrowing rates
│   └── checkpoints/borrowing_factor_checkpoint.json      # Borrowing resume checkpoint
└── raw/arbitrum/{SYMBOL}/              # Raw oracle events
| Command | Description |
|---|---|
| `collect --default` | Collect GMX + Chainlink data (recommended, uses --full mode) |
| `collect --full` | Full historical collection (fetch all available data) |
| `collect --full --chainlink-only` | Collect only 34 Chainlink-feed markets (no HyperSync needed) |
| `collect --full --all-markets` | Collect all 118 markets (default, requires HyperSync) |
| `collect --update` | Incremental update (smart: only fetches new data) |
| `verify` | Verify data quality |
| `export-freqtrade` | Export OHLCV + funding rate + mark price to Freqtrade feather format |
| `debug-oracle` | Debug oracle events |
The collection system has smart data range checking to avoid refetching data you already have.
Collects all available historical data:
- ✅ Fetches full GMX API window (~6 months of recent data)
- ✅ Backfills ALL Chainlink historical data (from genesis to GMX coverage start)
- ✅ Useful for: First-time collection, recovery from corrupted data
# Collect all data from scratch
gmx_historical_data collect --full --symbol ETH --output-dir ./data

Smart incremental updates that check existing data first:
- ✅ Checks existing data before fetching
- ✅ Skips fetching if data is already current (NO_GAP)
- ✅ Fetches only gaps - from your latest timestamp to now (NORMAL_GAP)
- ✅ Chainlink backfill - only if you need older data than you have
- ✅ Merges new data with existing storage (deduplicates by timestamp)
# Daily incremental update (fast!)
gmx_historical_data collect --update --symbol ETH --output-dir ./data

Performance Comparison:
| Scenario | `--full` Mode | `--update` Mode | Speedup |
|---|---|---|---|
| Daily update (ETH) | ~5-10 min (refetches all 500k+ Chainlink rounds) | ~10-30 sec (only new data) | 10-30x faster ⚡ |
| Hourly update (BTC) | ~5-10 min | ~5-10 sec (or skipped if current) | 30-60x faster ⚡ |
| Fresh collection | ~2-3 min | ~2-3 min (falls back to full) | Same |
Examples:
# Initial collection
gmx_historical_data collect --default --symbol ETH,BTC,SOL --output-dir ./data
# Daily updates (fast incremental)
gmx_historical_data collect --update --symbol ETH,BTC,SOL --output-dir ./data
# Update all 118 tokens incrementally
gmx_historical_data collect --update --output-dir ./data --concurrency 10

When data is current:
✓ 1min: Data is current, skipping GMX API fetch
✓ 5min: Data is current, skipping GMX API fetch
✓ 1h: Data is current, skipping GMX API fetch
✓ Chainlink backfill not needed - data is complete
When gap exists:
✓ 1h: Fetching from GMX API (incremental)
✓ 1h: 24 candles from GMX (2026-01-27 to 2026-01-28)
✓ 1h: Merged 8,760 total candles (added 24 new)
| Option | Description | Default |
|---|---|---|
| `--output-dir PATH` | Output directory | ./data |
| `--symbol TEXT` | Token(s) - comma-separated for `collect`, repeatable for `export-freqtrade` | All tokens |
| `--chainlink-only` | Only collect 34 markets with Chainlink feeds (no HyperSync needed) | Off (all markets) |
| `--all-markets` | Collect all 118 markets including non-Chainlink (requires HyperSync) | On (default) |
| `--concurrency INT` | Parallelism level (see below) | 4 |

Note:
- `collect` uses comma-separated symbols: `--symbol ETH,BTC,SUI`
- `export-freqtrade` uses a repeatable flag: `--symbol ETH --symbol BTC`
The --concurrency option controls both:
- Symbol parallelism: How many tokens are processed simultaneously
- RPC batch workers: Concurrent Chainlink data fetchers (auto-capped at 8)
| `--concurrency` | Symbols parallel | RPC batch workers | Use case |
|---|---|---|---|
| 1 | 1 | 1 | Slow/rate-limited RPC |
| 4 (default) | 4 | 4 | Balanced |
| 8 | 8 | 8 | Fast RPC endpoint |
| 10+ | 10+ | 8 (capped) | Maximum speed |
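The dual role of --concurrency can be modelled with two asyncio semaphores: one gating how many symbols run at once, one capping RPC-bound work. This is a minimal sketch under those assumptions, not the collector's actual implementation:

```python
import asyncio

RPC_WORKER_CAP = 8  # RPC batch workers are capped regardless of --concurrency

async def collect_all(symbols: list[str], concurrency: int) -> list[str]:
    """Process up to `concurrency` symbols at once; RPC-bound work inside each
    symbol shares a separately capped semaphore."""
    symbol_sem = asyncio.Semaphore(concurrency)
    rpc_sem = asyncio.Semaphore(min(concurrency, RPC_WORKER_CAP))

    async def collect_one(symbol: str) -> str:
        async with symbol_sem:
            async with rpc_sem:          # e.g. a batched Chainlink fetch
                await asyncio.sleep(0)   # placeholder for real network I/O
            return symbol

    # gather preserves input order even though tasks run concurrently
    return await asyncio.gather(*(collect_one(s) for s in symbols))

done = asyncio.run(collect_all(["ETH", "BTC", "SUI"], concurrency=10))
```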
Examples:
# Default speed (4 parallel)
gmx_historical_data collect --default --output-dir ./data
# Fast collection (10 symbols + 8 RPC workers)
gmx_historical_data collect --full --output-dir ./data --concurrency 10
# Specific tokens with parallelism
gmx_historical_data collect --default --symbol ETH,BTC,SUI --concurrency 8

Estimated times (with --concurrency 10):
Initial Collection (--full or --default):
- Single token (Chainlink): ~2-3 minutes
- Single token (non-Chainlink): ~30-60 seconds
- All 118 tokens: ~4-5 hours
Incremental Updates (--update):
- Single token (daily update): ~10-30 seconds
- Single token (hourly update): ~5-10 seconds (or instant if current)
- All 118 tokens (daily update): ~10-30 minutes
- All 118 tokens (hourly update): ~5-10 minutes
Speedup: Incremental updates are 10-30x faster than full collection because they only fetch new data and skip unnecessary Chainlink backfills.
For production use, set up a cron job or systemd timer for incremental updates:
# Cron job: Update every hour
0 * * * * cd /path/to/gmx_historical_data && source .venv/bin/activate && gmx_historical_data collect --update --output-dir ./data --concurrency 10 >> /var/log/gmx-update.log 2>&1
# Cron job: Update daily at 2 AM
0 2 * * * cd /path/to/gmx_historical_data && source .venv/bin/activate && gmx_historical_data collect --update --output-dir ./data --concurrency 10 >> /var/log/gmx-update.log 2>&1

1. Initial Setup (once):
   # Collect all historical data
   gmx_historical_data collect --default --output-dir ./data --concurrency 5
2. Regular Updates (hourly/daily):
   # Fast incremental updates
   gmx_historical_data collect --update --output-dir ./data --concurrency 10
3. Verification (weekly):
   # Check data quality
   gmx_historical_data verify --output-dir ./data
4. Export for Trading (as needed):
   # Export OHLCV + funding rate + mark price to Freqtrade
   gmx_historical_data export-freqtrade --data-dir ./data --output-dir ./freqtrade_data
The system automatically detects three types of data states:
- NO_GAP - Data is current
  ✓ All timeframes up to date, skipping collection
- NORMAL_GAP - Normal gap (incremental fetch)
  ✓ 1h: Fetching from GMX API (incremental)
  ✓ Merged 8,760 total candles (added 24 new)
- DATA_LOSS - GMX API window moved past your data
  ⚠ Data loss detected in 2 timeframe(s)
  ⚠ Fetching from api_earliest (accepting loss)
Tip: For 1-minute timeframes, GMX API's window is only ~5 hours. Update more frequently to avoid data loss.
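A minimal sketch of how these three states could be classified. The state names mirror the log output above, but the function and thresholds are illustrative, not the collector's exact logic:

```python
from enum import Enum

class GapState(Enum):
    NO_GAP = "no_gap"
    NORMAL_GAP = "normal_gap"
    DATA_LOSS = "data_loss"

def classify_gap(latest_stored: int, api_earliest: int,
                 now: int, candle_s: int) -> GapState:
    """Compare the newest stored candle against the API's sliding window.
    If the window has moved past our newest data, the candles in between
    are gone and can only be fetched from api_earliest onwards."""
    if now - latest_stored < candle_s:
        return GapState.NO_GAP           # nothing new to fetch yet
    if latest_stored < api_earliest:
        return GapState.DATA_LOSS        # window slid past our newest candle
    return GapState.NORMAL_GAP           # fetch latest_stored -> now

# One hour of new 1h candles, window still covers our data:
state = classify_gap(latest_stored=1_000_000, api_earliest=990_000,
                     now=1_003_600, candle_s=3600)
```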
If poetry install hangs or fails:
# Clear poetry cache
poetry cache clear pypi --all
# Try with verbose output to see what's stuck
poetry install -vvv
# If dependency resolution is slow, try:
poetry install --no-cache

If poetry can't find a compatible Python version:
# Check Python version (needs 3.11 or 3.12)
python --version
# Use pyenv to install correct version
pyenv install 3.11
pyenv local 3.11
poetry env use python3.11
poetry install

Note: Python 3.13 requires additional workarounds. Use Python 3.11 or 3.12 for easiest installation.
HyperSync requires Rust and Cap'n Proto. If poetry install fails:
1. Install Rust toolchain:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source $HOME/.cargo/env

2. Install Cap'n Proto:
# Ubuntu/Debian
sudo apt-get install build-essential capnproto libcapnp-dev
# macOS
brew install capnp
# Fedora/RHEL
sudo dnf install capnproto capnproto-devel

3. If still failing, try installing hypersync separately:
pip install --no-cache-dir --use-pep517 "hypersync==0.7.17"
poetry install

Note: hypersync 0.8.x has a build issue with a missing GitHub dependency. Use 0.7.x versions until this is resolved.
4. Common errors:
- `error: linker 'cc' not found` → install build-essential/gcc
- `capnp/capnp.h: No such file` → install libcapnp-dev
- `cargo not found` → source the cargo env: `source $HOME/.cargo/env`
- `PyO3's maximum supported version` → use Python 3.11 or 3.12, or set `PYO3_USE_ABI3_FORWARD_COMPATIBILITY=1`
Add multiple HyperSync tokens (comma-separated):
export HYPERSYNC_API_TOKEN="token1,token2,token3"

Get free tokens at https://envio.dev
export JSON_RPC_ARBITRUM=$ARBITRUM_CHAIN_JSON_RPC
pytest tests/ -v

