This project downloads cricket data, stages parquet files, and builds dbt models.
- Download scripts now use jittered retries with a 3-pass max.
- Retry policy: retries transient failures (5xx, 408, 409, 425, 429) and skips hard 4xx.
- Optional request pacing via CRICK_MIN_REQUEST_INTERVAL (seconds).
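
The retry and pacing behaviour listed above can be sketched in Python roughly as follows. This is a minimal illustration, not the project's actual download code: the names `is_retryable`, `backoff_delay`, `fetch_with_retries` and `Pacer` are hypothetical, the backoff base and cap are made-up values, and "3-pass max" is assumed to mean at most three attempts per request.

```python
import os
import random
import time

# Status codes from the retry policy: any 5xx plus these specific 4xx codes.
RETRYABLE_4XX = {408, 409, 425, 429}
MAX_PASSES = 3  # assumption: "3-pass max" means at most three attempts


def is_retryable(status: int) -> bool:
    """Transient failures are retried; every other 4xx is a hard failure."""
    return 500 <= status <= 599 or status in RETRYABLE_4XX


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 8.0) -> float:
    """Jittered exponential backoff: uniform in [0, min(cap, base * 2**attempt)].

    base and cap are illustrative values, not taken from the project.
    """
    return random.uniform(0.0, min(cap, base * 2 ** attempt))


def fetch_with_retries(send, max_passes: int = MAX_PASSES, sleep=time.sleep):
    """Call send() until it succeeds, hits a hard failure, or passes run out.

    send is any callable returning (status, body); the last result is returned.
    """
    status, body = send()
    for attempt in range(1, max_passes):
        if status < 400 or not is_retryable(status):
            break
        sleep(backoff_delay(attempt - 1))
        status, body = send()
    return status, body


class Pacer:
    """Enforce a minimum gap, in seconds, between consecutive requests."""

    def __init__(self, min_interval: float, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self) -> None:
        if self._last is not None:
            remaining = self.min_interval - (self._clock() - self._last)
            if remaining > 0:
                self._sleep(remaining)
        self._last = self._clock()


# Pacing is optional: an unset variable is treated here as "no pacing".
pacer = Pacer(float(os.environ.get("CRICK_MIN_REQUEST_INTERVAL", "0")))
```

In this sketch a caller would invoke `pacer.wait()` before each `send()`, so the interval applies to retries as well as first attempts. The interval itself comes from the CRICK_MIN_REQUEST_INTERVAL environment variable, which can be set on the command line.
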
Example:

    CRICK_MIN_REQUEST_INTERVAL=0.08 make extract

Use LEAGUE_ID to fetch only one ESPN league id (for example, the BBL id):

    make extract LEAGUE_ID=200

Use SEL to run a matches-folder-specific flow:

    make pipeline SEL=WBBL

This does:
- player download only for players found in data/rawdata/matches/WBBL (or data/rawdata/WBBL)
- match extraction for the same folder
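
The folder lookup implied by SEL could look something like the sketch below. The function name `resolve_sel_folder` is hypothetical, and trying data/rawdata/matches/<SEL> before data/rawdata/<SEL> is an assumption based only on the order in which the two paths are listed above; the real pipeline may resolve them differently.

```python
from pathlib import Path
from typing import Optional


def resolve_sel_folder(sel: str, root: Path = Path("data/rawdata")) -> Optional[Path]:
    """Return the first existing matches folder for sel, or None if neither exists.

    Assumed preference order: <root>/matches/<sel>, then <root>/<sel>.
    """
    for candidate in (root / "matches" / sel, root / sel):
        if candidate.is_dir():
            return candidate
    return None
```

For example, `resolve_sel_folder("WBBL")` would return `data/rawdata/matches/WBBL` when that directory exists, and fall back to `data/rawdata/WBBL` otherwise.
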