Red Lines Dashboard

Live: https://sdspieg.github.io/redlines-dashboard/

Interactive analytics dashboard for Russian red line statements (RRLS), nuclear threat statements (NTS), and civilizational red line statements (CRLS) from the RuBase annotation pipeline.

Stack

  • React 19 + TypeScript + Vite 7
  • Plotly.js (react-plotly.js) for all visualizations
  • GitHub Pages (static deploy to gh-pages branch)
  • Dark theme, Inter font, responsive layout

Data Pipeline (fully automated)

Data flows automatically from the annotation database to the live dashboard every day:

VPS (daily 03:00 UTC)                  GitHub Actions                  GitHub Pages
┌─────────────────────┐    dispatch    ┌──────────────────────┐       ┌────────────┐
│  run_pipeline.sh    │──────────────>│  update-data.yml     │──────>│  Live site │
│                     │               │                      │       │            │
│  1. migrate.py      │               │  1. checkout gh-pages│       │ data/*.json│
│  2. chunker.py      │               │  2. run export script│       │  (updated) │
│  3. first_pass_async│               │  3. commit & push    │       │            │
│  4. rls_second_pass │               │                      │       │            │
│  5. nts_second_pass │               │  Fallback: 04:30 UTC │       │            │
└─────────────────────┘               └──────────────────────┘       └────────────┘
        │                                       │
        v                                       v
   PostgreSQL                          scripts/export_redlines_data.py
   (redlines DB)                       (queries DB → 23 JSON files)

How it works

  1. VPS cron (03:00 UTC): run_pipeline.sh ingests new documents, chunks them, runs three LLM annotation passes (first-pass screening, RLS second-pass taxonomy, NTS second-pass taxonomy), and writes the results to PostgreSQL.
  2. VPS triggers GitHub Actions: at the end of the pipeline, a curl call sends a repository_dispatch event (pipeline-complete) to this repo.
  3. GitHub Actions (.github/workflows/update-data.yml): checks out the gh-pages branch, runs scripts/export_redlines_data.py, which queries the database and generates 23 JSON files, then commits and pushes only the data files that changed.
  4. No rebuild needed: the React app is already built on gh-pages. Only the data/*.json files are replaced (~30 seconds total).
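
As an illustration of step 2, the dispatch can be expressed in Python instead of curl. This is a minimal sketch, not the pipeline's actual code: the function name build_dispatch_request and the OWNER/REPO/TOKEN placeholders are assumptions, while the endpoint (POST /repos/{owner}/{repo}/dispatches) and the event_type field follow the GitHub REST API. The request is built but not sent, so it can be inspected safely.

```python
import json
import urllib.request

GITHUB_API = "https://api.github.com"


def build_dispatch_request(owner: str, repo: str, token: str,
                           event_type: str = "pipeline-complete") -> urllib.request.Request:
    """Build (but do not send) the POST that fires a repository_dispatch event."""
    body = json.dumps({"event_type": event_type}).encode("utf-8")
    return urllib.request.Request(
        url=f"{GITHUB_API}/repos/{owner}/{repo}/dispatches",
        data=body,
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Placeholders only; on the VPS the token would come from .gh_token.
    req = build_dispatch_request("OWNER", "REPO", "TOKEN")
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```

The event_type value must match the type the workflow listens for under repository_dispatch, otherwise the dispatch is accepted but no run starts.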

Triggers

| Trigger | When | Purpose |
|---|---|---|
| repository_dispatch | After VPS pipeline completes | Primary: immediate update |
| schedule (04:30 UTC) | Daily | Safety net if dispatch fails |
| workflow_dispatch | Manual | On-demand via GitHub UI or gh workflow run update-data.yml |

GitHub Secrets

| Secret | Description |
|---|---|
| DB_HOST | PostgreSQL host (Hetzner VPS) |
| DB_PORT | PostgreSQL port |
| DB_USER | Database user |
| DB_PASSWORD | Database password |

VPS files

| File | Location | Purpose |
|---|---|---|
| run_pipeline.sh | /stratbase/apps/webapps/red-lines-database/ | Main pipeline script (cron) |
| .gh_token | same directory | GitHub token for dispatch trigger |
| .env | same directory | OpenRouter API key for LLM annotation |
| first_pass_async.py | same directory | 1st pass: relevance screening (50 concurrent) |
| rls_second_pass.py | same directory | 2nd pass: RRLS taxonomy annotation |
| nts_second_pass.py | same directory | 2nd pass: NTS taxonomy annotation |

Local-compute analytics (statsmodels / scipy)

The heavy scripts/export_analytics_data.py step (VAR, Granger causality, impulse responses (IRF), local projections (LP), and event studies via statsmodels + scipy) does not run on the VPS; the VPS pipeline at /stratbase/apps/webapps/redlines-dashboard-pipeline/run_pipeline.sh intentionally skips it. Instead, the step runs on the WSL desktop, where the dependencies and the RTX 4090 already live. The resulting JSONs (analytics_*.json) are committed to this repo, and the next VPS pipeline run picks them up via git pull and bundles them into the Vite build that ships to gh-pages. This is the same pattern used for the other static JSONs (Oryx, UkrDailyUpdate, KIU, Kaggle missiles).

Refresh whenever the redlines DB has new annotations you want reflected on the Causal Analytics tab:

~/.local/bin/update_redlines_analytics.sh              # full: export + commit + push
~/.local/bin/update_redlines_analytics.sh --dry-run    # export only
~/.local/bin/update_redlines_analytics.sh --no-push    # commit but don't push

The wrapper:

  1. Runs git pull in the local clone at /home/stephan/src/redlines-dashboard
  2. Loads DB credentials from /mnt/g/My Drive/SYSTEM_CREDENTIALS.env (the PG_WARDATASETS_* block; same host, port 5432). The export script then targets both the redlines and war_datasets DBs
  3. Runs scripts/export_analytics_data.py (typically 3-6 min)
  4. Writes public/data/last_refreshed_analytics.json with the UTC timestamp, host, and elapsed seconds
  5. Commits the changed analytics_*.json files plus the stamp, then pushes to both origin (hcss-utils) and upstream (sdspieg)
  6. Leaves deployment to the next VPS pipeline run (cron 0 4 * * * UTC), which picks the files up. To deploy immediately, run: ssh root@138.201.62.161 'bash /stratbase/apps/webapps/redlines-dashboard-pipeline/run_pipeline.sh'
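
Step 4 of the wrapper could look roughly like the following in Python. This is a sketch under assumptions: the function name write_refresh_stamp and the JSON field names (refreshed_at_utc, host, elapsed_seconds) are hypothetical, since the actual stamp schema is not documented here. Only the three pieces of information (UTC timestamp, host, elapsed seconds) come from the description above.

```python
import json
import socket
import time
from datetime import datetime, timezone
from pathlib import Path


def write_refresh_stamp(path: Path, started_at: float) -> dict:
    """Write a small JSON stamp describing the export run that just finished.

    `started_at` is a time.monotonic() reading taken before the export began.
    """
    stamp = {
        # Field names are assumptions, not the wrapper's actual schema.
        "refreshed_at_utc": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "host": socket.gethostname(),
        "elapsed_seconds": round(time.monotonic() - started_at, 1),
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(stamp, indent=2) + "\n", encoding="utf-8")
    return stamp


if __name__ == "__main__":
    t0 = time.monotonic()
    # ... the export itself would run here ...
    print(write_refresh_stamp(Path("public/data/last_refreshed_analytics.json"), t0))
```

Using a monotonic clock for the elapsed time keeps the measurement unaffected by system clock adjustments during a multi-minute export.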

A small banner on the Causal Analytics tab reads last_refreshed_analytics.json and shows the freshness ("Causal Analytics JSONs last refreshed Thu, 28 May 2026 21:55 UTC (0.0 days ago)"); after 7 days the banner turns yellow with a "stale, please re-run" hint.
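
The freshness rule behind the banner can be sketched as follows. The banner itself lives in the React app; this Python version only illustrates the logic, and the function name banner_state and the "fresh"/"stale" labels are assumptions. The 7-day threshold is the one stated above.

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(days=7)


def banner_state(refreshed_at: datetime, now: datetime) -> tuple[str, float]:
    """Return ("fresh" or "stale", age in days rounded to one decimal)."""
    age = now - refreshed_at
    age_days = round(age.total_seconds() / 86400, 1)
    # The banner turns yellow only once the stamp is older than 7 days.
    return ("stale" if age > STALE_AFTER else "fresh", age_days)


if __name__ == "__main__":
    stamp = datetime(2026, 5, 28, 21, 55, tzinfo=timezone.utc)
    print(banner_state(stamp, stamp))                      # just refreshed
    print(banner_state(stamp, stamp + timedelta(days=8)))  # past the threshold
```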

Tradeoff: the VPS pipeline is no longer fully self-contained — the analytics JSONs only refresh when the desktop runs the wrapper. For the current cadence (annotation passes are weekly at best) this is fine.

Dashboard Tabs

  1. Overview - Stat cards, funnel charts, slope chart (RRLS vs NTS rank comparison), by-source and by-database breakdowns
  2. RRLS Taxonomy Explorer - 18-dimension dropdown, bar charts (absolute + relative), source breakdown, time series, ordinal severity timeline, expandable monthly breakdowns, two-dropdown cross-tabulation heatmap, confidence slider
  3. NTS Explorer - 15-dimension dropdown, same structure as the RRLS explorer with NTS-specific severity scores, cross-tabulation, confidence slider
  4. CRLS Explorer - Framing types pie chart, territories bar chart, monthly trend, by-source breakdown
  5. Time Series - Absolute/relative counts, war context dual-axis overlay (personnel losses, ACLED events), source trend lines
  6. Statement Browser - Paginated card view with search highlighting, dimension filter dropdowns, confidence slider, source filter

Data Files (public/data/, 23 JSON files)

Generated by scripts/export_redlines_data.py from the redlines and war_datasets PostgreSQL databases.

| File | Description |
|---|---|
| overview_stats.json | Headline counts (docs, chunks, confirmed statements) |
| rrls_statements.json | All confirmed RRLS statements with full taxonomy fields (~8MB) |
| nts_statements.json | All confirmed NTS statements with full taxonomy fields (~1.5MB) |
| rrls_taxonomy.json | Per-source dimension breakdowns (15 pre-computed dims) |
| rrls_taxonomy_totals.json | Totals per dimension value (15 dims) |
| rrls_taxonomy_time.json | Monthly time series for 4 key dimensions |
| nts_taxonomy.json | NTS dimension breakdowns (15 dims) |
| nts_severity_monthly.json | Monthly severity by tone/conditionality/consequences/specificity |
| rrls_by_source.json | RRLS counts by source |
| nts_by_source.json | NTS counts by source |
| crls_by_source.json | CRLS counts by source |
| rrls_monthly.json | RRLS monthly counts by source |
| nts_monthly.json | NTS monthly counts by source |
| crls_monthly.json | CRLS monthly counts by source |
| chunks_by_source.json | Total chunks per source (denominator for classification rates) |
| chunks_monthly.json | Total chunks per month |
| crls_framing_types.json | CRLS civilizational framing type distribution |
| crls_territories.json | CRLS sphere-of-influence territories |
| rrls_cross_tabs.json | Legacy pre-computed cross-tabs (4 combos) |
| rrls_intensity.json | Line/threat intensity distributions |
| comparative_by_db.json | RRLS/NTS counts by database |
| war_context_personnel.json | Monthly Russian personnel losses (from war_datasets DB) |
| war_context_acled.json | Monthly ACLED conflict events and fatalities |
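
To show how chunks_by_source.json serves as a denominator, here is a small sketch of a classification-rate calculation. It assumes both inputs are flat source-to-count mappings, which may not match the real file layout; the function name classification_rates and the example source names are hypothetical.

```python
def classification_rates(statements_by_source: dict[str, int],
                         chunks_by_source: dict[str, int]) -> dict[str, float]:
    """Confirmed statements per 100 chunks, for each source with a non-zero chunk count."""
    rates: dict[str, float] = {}
    for source, total_chunks in chunks_by_source.items():
        if total_chunks <= 0:
            continue  # no denominator, so no meaningful rate
        hits = statements_by_source.get(source, 0)
        rates[source] = round(100 * hits / total_chunks, 2)
    return rates


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    rrls = {"source_a": 5}
    chunks = {"source_a": 200, "source_b": 50, "source_c": 0}
    print(classification_rates(rrls, chunks))
```

Sources that have chunks but no confirmed statements get a rate of 0.0 instead of being dropped, so they still appear in a by-source chart.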

Development

# Local development
cd /tmp/redlines-dashboard
npm install
npm run dev          # Vite dev server with hot reload

# Export fresh data locally (uses hardcoded DB credentials as fallback)
python scripts/export_redlines_data.py

# Build and deploy app changes (only needed when React code changes)
npm run build
npx gh-pages -d dist

# Trigger data update manually (no rebuild needed)
gh workflow run update-data.yml

Color System (src/colors.ts)

  • RRLS_COLORS: 18 dimensions with fixed hex per value
  • NTS_COLORS: 15 dimensions with fixed hex per value
  • Ordinal dimensions: sequential green-yellow-red scales (SEQ3/SEQ4/SEQ5/SEQ7)
  • Categorical dimensions: distinct tab20 palette colors
  • getDimValueColor(colorMap, dim, value, fallbackIndex) - universal color lookup

Dynamic Dimension Computation

The RRLS Explorer computes 3 additional dimensions (line_intensity, threat_intensity, overall_confidence) client-side from the raw rrls_statements.json, because these are not in the pre-computed taxonomy files (15 pre-computed + 3 dynamic gives the 18 dimensions in the dropdown). The dynamicTotals, dynamicTaxonomy, and dynamicTaxTime useMemo hooks fill the gaps. When the confidence slider is set above 7, all dimensions are recomputed from the filtered statements.
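
The recomputation can be illustrated with a short sketch. The real logic runs in TypeScript inside useMemo hooks; this Python version only mirrors the idea. It assumes each statement is a flat record and that the slider filters on the overall_confidence field, neither of which is confirmed here, and the function name dimension_totals and the sample values are hypothetical.

```python
from collections import Counter


def dimension_totals(statements: list[dict], dim: str, min_confidence: float = 0) -> dict:
    """Count statements per value of `dim`, keeping only those at or above the threshold."""
    counts: Counter = Counter()
    for statement in statements:
        # Assumption: the slider compares against overall_confidence.
        if statement.get("overall_confidence", 0) < min_confidence:
            continue
        value = statement.get(dim)
        if value is not None:
            counts[value] += 1
    return dict(counts)


if __name__ == "__main__":
    rows = [
        {"line_intensity": "high", "overall_confidence": 9},
        {"line_intensity": "high", "overall_confidence": 6},
        {"line_intensity": "low", "overall_confidence": 8},
    ]
    print(dimension_totals(rows, "line_intensity"))
    print(dimension_totals(rows, "line_intensity", min_confidence=8))
```

Because the totals are derived from the statement list on every call, the same function covers both the three dynamic dimensions and the full recomputation after filtering.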