FenixAI v2.5 Script Evolution and Release Surface

Status: release-candidate documentation draft
Scope: scripts and launch flows used across the multi-month v2.5 research, benchmark, and live-hardening cycle
Last updated: 2026-05-10

This document explains how the script families evolved over time, which entry points are safe to document for v2.5, and which scripts should be treated as historical experiment artifacts.

Executive Summary

The v2.5 cycle produced a large number of launchers, benchmark helpers, monitors, and repair scripts. That history goes back before the May live canaries: the project moved through benchmark factories, role-specific model tests, MiniFenix/NanoFenix prototypes, R-series live experiments, v25/v28 SOL canaries, and finally the v31 MTF-safe launcher.

A large script history is normal for a research-heavy trading project, but it becomes dangerous when users of the public release cannot tell the difference between:

  • the current supported runner;
  • a live canary launcher for a specific account and symbol;
  • a historical benchmark launcher;
  • a local repair helper;
  • an analysis script used after a run.

For v2.5, the clean release story should be:

  1. run_fenix.py remains the simple default entry point.
  2. scripts/run_fenix_live_slot.py is the preferred controlled paper/testnet/live slot runner.
  3. scripts/run_fenix_live_suite.py is the preferred sequential comparison runner.
  4. run_nanofenixv3.py is the current NanoFenix companion entry point.
  5. scripts/launch_fenix_v31_live_mtf_safe.sh is the latest internal live-safe SOLUSDT canary launcher, not a generic public command.
  6. Older launch_fenix_*, launch_nanofenix_*, run_v21_*, and run_team_* scripts are reproducibility artifacts unless explicitly promoted later.

Historical Backbone

The current v2.5 release surface did not appear in one step. The scripts evolved through these main phases:

| Period | Main script family | Purpose | What survived into v2.5 |
| --- | --- | --- | --- |
| Baseline and root runners | run_fenix.py, run_minifenix.py, run_nanofenix*.py | Simple startup paths and research entry points. | run_fenix.py remains the default public entry point; run_nanofenixv3.py is the current companion path. |
| February benchmark factory | run_benchmark_suite.py, generate_experiment_plans.py, analyze_benchmark_results.py | Compare models, agent roles, timeframes, monolithic vs multi-agent mode, and ReasoningBank on/off. | The benchmark discipline survived: slot metadata, JSON summaries, directional accuracy, and model-role selection. |
| v21 / role benchmark launchers | launch_v21_*, run_v21_*, run_team_* | Promote the best role/model combinations from benchmark into larger tests. | Per-agent model teams became the standard; old launchers are historical artifacts. |
| March R-series live experiments | launch_fenix_r10.sh through launch_fenix_r16_*, run_fenix_live_slot.py | Move from paper-style research into controlled live slots with locks, risk guards, and event logs. | run_fenix_live_slot.py became the preferred engine-first runner. |
| MiniFenix and NanoFenix evolution | run_minifenix*.py, run_nanofenix.py, run_nanofenixv1.py, run_nanofenixv2.py, run_nanofenixv3.py | Explore a slow-brain/fast-trigger architecture and a lightweight companion signal. | NanoFenix v3 companion mode became the active v2.5 fast-signal layer. |
| April v25-v28 canaries | launch_fenix_v25_*, launch_fenix_v26_*, launch_fenix_v27_*, launch_fenix_v28_* | Test real Binance Futures behavior, protective orders, NanoFenix vetoes, live hydration, and SOL/ETH canaries. | Conservative live execution, NanoFenix policy checks, live position hydration, and protection verification. |
| May v31 MTF-safe canary | launch_fenix_v31_live_mtf_safe.sh | Rebase the live-safe SOL flow onto the newer v31 launcher style and current model team. | Current internal live-safe template for controlled SOLUSDT testing. |

Current Recommended Release Surface

These are the scripts that should be easiest to explain in public v2.5 documentation.

| Entry point | Release role | Recommended mode | Why it matters |
| --- | --- | --- | --- |
| run_fenix.py | Main application entry point | Paper by default | The simplest way to start the core bot. |
| scripts/run_fenix_live_slot.py | Controlled single-slot runner | Paper first, testnet second, live only with explicit flags | Runs the real TradingEngine lifecycle and produces JSONL event logs plus summary JSON. |
| scripts/run_fenix_live_suite.py | Controlled multi-slot runner | Paper/testnet comparison | Lets model teams, symbols, and timeframes be compared without rewriting launch commands. |
| run_nanofenixv3.py | NanoFenix v3 companion / standalone runner | Companion observer or paper research | Produces the companion JSON signal used by Fenix for freshness, edge, uncertainty, and allow_execute. |
| run_minifenix.py | MiniFenix research prototype | Paper/research only | Demonstrates the slow-brain / fast-trigger architecture that informed v2.5. |
| run_minifenix_model_sweep.py | MiniFenix slow-brain model comparison | Research only | Helps compare LLM choices outside the main live engine. |
| scripts/run_hybrid_live_paper.py | Hybrid MTF research runner | Paper/research | Useful for multi-timeframe paper experiments, but not the preferred live execution path. |

Internal Live Canary Launchers

These scripts were important for the latest hardening work, but they should be documented as internal templates, not as generic public launchers.

| Script | Status | What it does | Release guidance |
| --- | --- | --- | --- |
| scripts/launch_fenix_v31_live_mtf_safe.sh | Latest internal live-safe launcher | Starts exactly one NanoFenix v3 observer companion and one Fenix v31-style live slot on scripts/run_fenix_live_slot.py. Uses 15m entry, deterministic 30m MTF guard, conservative risk caps, and the technical_mtf_qabba_guard lite consensus mode. | Keep as a documented internal canary template. Do not present it as a one-size-fits-all public live command. |
| scripts/launch_fenix_v25_live_mtf_safe.sh | Superseded but valuable | First live-safe SOLUSDT MTF launcher from the May 2026 production test. It introduced the conservative NanoFenix/MTF live stack before v31 was adapted. | Keep as history and fallback reference. Prefer the v31 live-safe variant for the current canary family. |
| scripts/launch_fenix_v31.sh | Legacy v31 hybrid-paper launcher | Older ETHUSDT launcher using scripts/run_hybrid_live_paper.py, hardcoded local paths, and a paper/hybrid flow. | Do not recommend for live trading. It explains the v31 lineage but is not a live production launcher. |

Evolution Timeline

1. Baseline root runners: simple entry points first

The earliest stable shape of the project kept the main entry points at the repository root.

Representative entry points:

  • run_fenix.py
  • run_minifenix.py
  • run_minifenix_model_sweep.py
  • run_nanofenix.py
  • run_nanofenix_live.py
  • run_nanofenixv1.py
  • run_nanofenixv2.py
  • run_nanofenixv3.py

What this phase established:

  • run_fenix.py is the human-friendly way to start the main application;
  • NanoFenix and MiniFenix were intentionally kept as research companions instead of being hidden inside the main runner;
  • each generation of NanoFenix got its own root entry point, which makes the historical progression visible;
  • by v2.5, run_nanofenixv3.py became the preferred companion signal path.

Release stance:

Keep run_fenix.py and run_nanofenixv3.py prominent. Document older NanoFenix entry points as lineage, not as the recommended path.

2. February benchmark factory: turning experiments into comparable slots

The February scripts created the first serious experiment factory. Instead of manually launching one-off runs, the project moved toward JSON plans, slot metadata, automatic analysis, and directional scoring against future candles.

Representative scripts:

  • scripts/run_benchmark_suite.py
  • scripts/generate_experiment_plans.py
  • scripts/analyze_benchmark_results.py
  • scripts/analyze_mixed_hourly_results.py
  • scripts/run_hybrid_live_paper.py
  • scripts/run_hybrid_hourly_models.py
  • scripts/run_hybrid_10h_mix.py

What these scripts taught:

  • benchmark logs must include run tags, slot names, timeframes, model teams, and role metadata;
  • directional accuracy must be measured against future market candles, not inferred from agent text;
  • monolithic and multi-agent modes need isolated metadata so their results do not contaminate each other;
  • ReasoningBank on/off comparisons need explicit flags;
  • time-interleaved plans reduce temporal bias when comparing models.
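
The scoring rule in the list above (measure direction against future candles, not agent text) can be sketched roughly as follows. This is a minimal illustration and not the project's analyzer: the function name, the `horizon` parameter, and the choice to skip HOLD decisions and flat moves are all assumptions made for this example.

```python
from typing import Optional, Sequence


def directional_accuracy(
    decisions: Sequence[str],
    closes: Sequence[float],
    horizon: int = 1,
) -> Optional[float]:
    """Score BUY/SELL decisions against the close `horizon` candles later.

    decisions[i] is the decision taken at the close of candle i.
    HOLD decisions and flat moves are skipped (an assumption of this sketch).
    Returns None when no decision could be scored.
    """
    hits = 0
    scored = 0
    for i, decision in enumerate(decisions):
        j = i + horizon
        # Skip non-directional decisions and decisions without a future candle.
        if decision not in ("BUY", "SELL") or j >= len(closes):
            continue
        move = closes[j] - closes[i]
        if move == 0:
            continue
        scored += 1
        # A BUY is a hit when price rose; a SELL is a hit when price fell.
        if (decision == "BUY") == (move > 0):
            hits += 1
    return hits / scored if scored else None
```

Returning `None` instead of `0.0` for an unscored run keeps "no directional decisions" distinguishable from "all decisions wrong" in a summary.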

Release stance:

These are research support scripts. run_benchmark_suite.py and analyze_benchmark_results.py are still valuable for reproducing model-selection work, but they are not normal user launch commands.

3. v21 and role benchmarks: model choice became per-agent

The v21 family was the bridge between raw benchmark automation and practical model-role selection. It helped separate the question "which model is good?" from the more important question "which model is good for this agent role?"

Representative scripts:

  • scripts/launch_v21_improvements.py
  • scripts/launch_v21_parallel.sh
  • scripts/analyze_v21_results.py
  • scripts/run_v21_dualkey_production_test.sh
  • scripts/run_v21_nemotron_risk_test.sh
  • scripts/run_v21_risk_dedicated_key.sh
  • scripts/run_team_dualkey_test.sh
  • scripts/run_team_production_test.sh

What these scripts taught:

  • Technical, QABBA, Decision, Sentiment, Visual, and Risk should not automatically share one model;
  • JSON reliability and latency matter as much as reasoning quality in live-like loops;
  • separate API keys and role-specific tests are useful when provider limits distort results;
  • benchmark-winning teams still need live-slot validation before promotion.

Release stance:

Keep these launchers as reproducibility artifacts. Public v2.5 docs should explain the model-team idea, not send users to old v21 commands.

4. March R-series: engine-first live slots and operational safety

The R-series scripts moved the project from research loops into live-slot operations. This is where account/symbol locking, exchange reconciliation, runtime risk accounting, directional-score guards, and simple consensus modes started becoming practical.

Representative scripts:

  • scripts/run_fenix_live_slot.py
  • scripts/run_fenix_live_suite.py
  • scripts/run_split_keys_risk_local_smoke.sh
  • scripts/run_live_web3_monitor.py
  • scripts/launch_fenix_r10.sh
  • scripts/launch_fenix_r12.sh
  • scripts/launch_fenix_r13.sh
  • scripts/launch_fenix_r14_conservative.sh
  • scripts/launch_fenix_r15_BENCHMARK_CORRECTED.sh
  • scripts/launch_fenix_r15_live.sh
  • scripts/launch_fenix_r16_simple_consensus_live.sh
  • scripts/launch_fenix_r16_simple_mtf_live.sh
  • scripts/launch_fenix_testnet_r17_solusdc.sh

What these scripts taught:

  • live and paper runners need the same event visibility;
  • local PnL accounting is not enough without exchange fills and commissions;
  • running multiple live slots on the same account and symbol can contaminate position state;
  • strict Technical+QABBA consensus is safe but can become too conservative;
  • a higher-timeframe mathematical guard is useful when agent signals conflict;
  • benchmark success does not automatically survive live fees, slippage, latency, and exchange order behavior.
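
One way to prevent two live slots from sharing an account and symbol is an exclusive lock file. The sketch below is an illustration under stated assumptions, not the mechanism used by run_fenix_live_slot.py: the function names, the lock directory, and the file naming scheme are hypothetical.

```python
import os
from pathlib import Path


class SlotLockError(RuntimeError):
    """Raised when another slot already holds the account/symbol lock."""


def acquire_slot_lock(lock_dir: Path, account: str, symbol: str) -> Path:
    """Create an exclusive lock file for one account/symbol pair."""
    lock_dir.mkdir(parents=True, exist_ok=True)
    lock_path = lock_dir / f"{account}_{symbol}.lock"
    try:
        # O_EXCL makes creation fail if the lock file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError as exc:
        raise SlotLockError(f"{account}/{symbol} is already locked") from exc
    with os.fdopen(fd, "w") as handle:
        # Record the owner PID so a stale lock can be diagnosed by hand.
        handle.write(str(os.getpid()))
    return lock_path


def release_slot_lock(lock_path: Path) -> None:
    """Remove the lock file; a missing file is not an error."""
    lock_path.unlink(missing_ok=True)
```

A lock file of this kind does not clean itself up after a crash, so a real runner would also need a stale-lock policy.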

Release stance:

The R launchers are historical. The durable artifact is scripts/run_fenix_live_slot.py, which became the preferred v2.5 controlled runner.

5. MiniFenix and NanoFenix: slow brain, fast signal, companion policy

MiniFenix and NanoFenix evolved alongside the main engine. MiniFenix explored the two-speed concept: a slower reasoning layer plus faster trigger logic. NanoFenix moved through several generations until v3 became the practical companion signal.

Representative entry points and scripts:

  • run_minifenix.py
  • run_minifenix_model_sweep.py
  • run_nanofenix.py
  • run_nanofenix_live.py
  • run_nanofenixv1.py
  • run_nanofenixv2.py
  • run_nanofenixv3.py
  • scripts/nanofenix_pretrain.py
  • scripts/start_nanofenix_long_run.sh
  • scripts/launch_nanofenix_r10.sh
  • scripts/launch_nanofenix_r12.sh
  • scripts/launch_nanofenix_r13.sh
  • scripts/launch_nanofenix_r20.sh
  • scripts/launch_nanofenix_v35_parallel.sh
  • scripts/run_nanofenix_companion_signal.sh

What these scripts taught:

  • NanoFenix v0/v1/v2 were useful experiments, but raw microstructure and 1-second bars were too noisy without stronger calibration;
  • NanoFenix v3 added multi-scale features, dual horizons, adaptive fusion, runtime state, companion JSON, and explicit allow_execute;
  • the companion is most useful as a policy/filter layer, not as an unsupervised live executor;
  • stale companion files, symbol mismatches, uncertainty, and direction mismatch must be visible in the main engine;
  • old companion wrappers should not hide which NanoFenix generation they run.
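
The visibility requirement above can be illustrated with a small check that returns every reason a companion signal should not be trusted, instead of a single pass/fail flag. Apart from allow_execute, the field names, thresholds, and blocker labels here are hypothetical and do not describe the real companion JSON schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CompanionSignal:
    symbol: str
    direction: str        # "BUY", "SELL", or "HOLD"
    uncertainty: float    # 0.0 (confident) to 1.0 (no confidence); assumed scale
    allow_execute: bool
    generated_at: float   # Unix timestamp in seconds


def companion_blockers(
    signal: CompanionSignal,
    symbol: str,
    intended_direction: str,
    now: float,
    max_age_s: float = 90.0,
    max_uncertainty: float = 0.6,
) -> List[str]:
    """Return every reason the companion signal blocks an entry."""
    blockers: List[str] = []
    if now - signal.generated_at > max_age_s:
        blockers.append("stale_signal")
    if signal.symbol != symbol:
        blockers.append("symbol_mismatch")
    if signal.uncertainty > max_uncertainty:
        blockers.append("high_uncertainty")
    if signal.direction != intended_direction:
        blockers.append("direction_mismatch")
    if not signal.allow_execute:
        blockers.append("allow_execute_false")
    return blockers
```

Returning the full list rather than stopping at the first failure means an event log can show all blockers for a cycle at once.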

Release stance:

For v2.5, document run_nanofenixv3.py --companion --adaptive-fusion as the preferred companion path. Keep older NanoFenix launchers as history.

6. April v25-v28 canaries: real exchange behavior became the center

The April launchers tested real Binance Futures behavior more directly. This period exposed issues that only show up when the exchange, protective orders, position hydration, order visibility, and NanoFenix policy interact.

Representative scripts:

  • scripts/launch_fenix_v25_paper_mainnet.sh
  • scripts/launch_fenix_v25_paper_glm51_gemma4.sh
  • scripts/launch_fenix_v26_live_eth_safe.sh
  • scripts/launch_fenix_v27_live_eth_aggressive.sh
  • scripts/launch_fenix_v28_live_eth_overnight.sh
  • scripts/launch_fenix_v28_sol_canary_screen.sh
  • scripts/run_hybrid_live_paper.py
  • scripts/fenix_trade_monitor.py
  • scripts/fenix_multi_monitor.py
  • scripts/fenix_multi_monitor_v2.py
  • scripts/_monitor_position.py

What these scripts taught:

  • launchers must load .env instead of carrying hardcoded credentials or local assumptions;
  • live runners must pass their built llm_config into the engine or model-team flags are meaningless;
  • protective SL/TP orders must be verified on the exchange before a trade is considered safely open;
  • a failed execution should not be counted as a losing trade by runtime risk;
  • open exchange positions must be hydrated into Fenix after restart;
  • NanoFenix hard vetoes must distinguish critical blockers from soft low-edge warnings;
  • screen sessions were more reliable than nohup for the SOL canary workflow in this local environment.

Release stance:

The v26-v28 launchers explain how the current safety model was discovered. They should be labeled as canary templates or historical runs, not as generic release commands.

7. v25 live MTF safe: first serious SOLUSDT production test

The v25 live-safe launcher was created when the project moved from paper and small canaries into a real SOLUSDT production test with a small funded Binance Futures account.

Primary scripts:

  • scripts/launch_fenix_v25_live_mtf_safe.sh
  • scripts/run_fenix_live_slot.py
  • run_nanofenixv3.py

Important behavior:

  • Fenix ran on 15m SOLUSDT.
  • A deterministic 30m MTF bias was added as a guard.
  • NanoFenix v3 ran as a companion observer, not as an executor.
  • Risk was intentionally small: 3x leverage, no add-to-position, low max notional, low risk per trade.
  • The engine produced JSONL live slot events for every cycle.

What changed during v25:

  • engine-side MTF context became real, not just environment variables;
  • NanoFenix companion freshness checks were tightened;
  • NanoFenix observer-only mode prevented companion paper state from polluting live interpretation;
  • the kline watchdog was tuned to avoid noisy polling between real 15m closes;
  • technical_mtf_qabba_guard was introduced to avoid a permanently blocked strict consensus;
  • lite node timeouts were added so slow LLM calls could not stretch a cycle indefinitely;
  • the suite/slot runner later gained explicit MTF guard options so paper plans no longer depend on hidden inherited shell variables.
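
The lite node timeout idea can be sketched with the standard library. This is only an illustration: the function name and the shape of the fallback result are assumptions, and the slow call is abandoned rather than cancelled, so it still runs to completion in its worker thread.

```python
import concurrent.futures
from typing import Callable, Dict


def call_with_timeout(node: Callable[[], Dict], timeout_s: float) -> Dict:
    """Run one agent node, falling back to HOLD if it exceeds timeout_s."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(node)
    try:
        result = dict(future.result(timeout=timeout_s))
        result["timeout_fallback"] = False
        return result
    except concurrent.futures.TimeoutError:
        # Hypothetical fallback shape: a neutral decision flagged as a timeout.
        return {"signal": "HOLD", "confidence": 0.0, "timeout_fallback": True}
    finally:
        # Do not block the cycle on the slow call; the worker thread
        # finishes on its own in the background.
        executor.shutdown(wait=False)
```

Marking the fallback explicitly lets a downstream consensus guard refuse an entry whose Technical input is only a timeout placeholder.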

Release stance:

v25 is the hardening milestone. It should be documented as the bridge between older experiments and the current v31 live-safe launcher.

8. Model correction: from parse stability to latency stability

During v25, the live tests exposed a model selection problem.

Observed sequence:

  • technical=nemotron-3-nano:30b-cloud produced repeated JSON parsing failures in Technical.
  • gemma4:31b-cloud improved Technical parse behavior but could be slow in the live stack.
  • qabba=ministral-3:14b-cloud was faster and more stable than the slower QABBA alternatives in the tested SOLUSDT setup.
  • In the v31 migration test, technical=ministral-3:14b-cloud,qabba=ministral-3:14b-cloud gave the best live-safe balance: valid responses and cycle times ranging from a few seconds to the low tens of seconds.

Current live-safe v31 default:

technical=ministral-3:14b-cloud
qabba=ministral-3:14b-cloud
decision=nemotron-3-nano:30b-cloud
risk_manager=devstral-small-2:24b-cloud
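
The comma-separated role=model form used in the observed sequence above can be turned into a per-role mapping with a few lines of Python. This is a sketch only: the function name is hypothetical, and the real runner's flag parsing may differ.

```python
from typing import Dict


def parse_model_team(spec: str) -> Dict[str, str]:
    """Parse 'role=model,role=model' into a {role: model} mapping."""
    team: Dict[str, str] = {}
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        # Split on the first '=' only; model names may contain ':' but the
        # examples in this document never contain '=' or ','.
        role, sep, model = item.partition("=")
        role, model = role.strip(), model.strip()
        if not sep or not role or not model:
            raise ValueError(f"expected role=model, got {item!r}")
        if role in team:
            raise ValueError(f"duplicate role {role!r}")
        team[role] = model
    return team
```

Rejecting duplicate or malformed entries up front avoids silently running a slot with a different team than the one requested.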

Release stance:

Document model teams as examples, not guarantees. Model availability, latency, and JSON reliability can change.

9. v31 migration: newer family, live-safe runner

The original scripts/launch_fenix_v31.sh was not suitable for real live orders because it used the hybrid paper runner and a March ETHUSDT setup.

The v31 live-safe migration created:

  • scripts/launch_fenix_v31_live_mtf_safe.sh

This script preserved the newer v31 direction while moving execution onto the safer live slot runner.

Key improvements over the old v31 launcher:

  • uses scripts/run_fenix_live_slot.py instead of scripts/run_hybrid_live_paper.py;
  • avoids the old hardcoded ETH-only flow;
  • starts NanoFenix v3 observer-only companion with v31-specific runtime and signal paths;
  • stops prior SOLUSDT live/Nano processes before launch;
  • keeps BLAS/thread safety exports from the older v31 family;
  • uses the technical_mtf_qabba_guard lite consensus mode;
  • keeps conservative risk caps for small live accounts.

Release stance:

This is the latest internal live canary launcher. It should be described as a template for controlled experiments, not as a casual public entry point.

10. May 2026 live reconciliation fix

The v31 live-safe SOLUSDT run found an important live/local mismatch.

Problem:

  • Fenix hydrated a real Binance SHORT position after restart.
  • A rule-based time_exit closed the position in the local TradeManager.
  • The exchange position remained open because that close path had not sent a reduce-only order to Binance.

Fix:

  • rule-based live exits now execute reduce-only market closes;
  • the engine confirms Binance is flat before recording the local close;
  • live close records reconcile real exchange fill price, realized PnL, and commission;
  • if flat confirmation fails, Fenix rehydrates instead of pretending it is flat.
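
The order of operations in this fix (close on the exchange, confirm flat, only then record the local close) can be sketched as below. The `Exchange` protocol, its method names, and the `local_book` dictionary are hypothetical stand-ins and do not reflect the interfaces in src/trading/engine.py.

```python
from typing import Dict, Protocol


class Exchange(Protocol):
    """Hypothetical minimal exchange interface for this sketch."""

    def position_size(self, symbol: str) -> float: ...

    def close_reduce_only(self, symbol: str, size: float) -> Dict: ...


def close_live_position(exchange: Exchange, symbol: str, local_book: Dict) -> str:
    """Close on the exchange first; record locally only once flat."""
    size = exchange.position_size(symbol)
    fill: Dict = {}
    if size != 0:
        # Reduce-only market close for the full open size.
        fill = exchange.close_reduce_only(symbol, abs(size))
    remaining = exchange.position_size(symbol)
    if remaining != 0:
        # Not flat: adopt the exchange position instead of recording a close.
        local_book[symbol] = {"state": "hydrated", "size": remaining}
        return "rehydrated"
    # Flat confirmed: store the exchange fill data with the local close.
    local_book[symbol] = {"state": "closed", "fill": fill}
    return "closed"
```

The key property is that the local record can never say "closed" while the exchange still reports a non-zero position.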

Files involved:

  • src/trading/engine.py
  • tests/test_engine_position_management.py
  • scripts/launch_fenix_v31_live_mtf_safe.sh

Release stance:

This is one of the strongest arguments for v2.5 being an execution-safety release. It should be included in the release notes as a safety fix, not hidden as an internal detail.

11. May 2026 live slot accounting fix

The next v31 live-safe SOLUSDT canary completed without opening any new orders, but its summary initially reported:

status=completed_with_accounting_gap
warnings=open_close_mismatch,more_closes_than_opens

The exchange state showed no open SOLUSDT position and no open orders. The accounting warning was therefore not live exposure; it was a release-readiness bug in the slot summary.

Root cause:

  • the engine correctly emitted position:hydrated when it found an inherited Binance position at startup;
  • the rule-based live exit then emitted position:closed after the reduce-only exchange close;
  • the summary accounting compared position_closed only against new position_opened events inside the slot;
  • because a hydrated exchange position is not a new Fenix entry, the accounting saw one close and zero opens.

Fix:

  • LiveEventCollector now counts position:hydrated;
  • _build_event_accounting_report() treats position_opened + position_hydrated as the closeable position base;
  • summary JSON includes position_hydrated and accounted_position_opens;
  • focused tests cover both collector counting and the hydrated-close accounting case.
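
The accounting rule can be illustrated with a small counter-based report. This is a simplified sketch, not the body of _build_event_accounting_report(): the function name, the "position:opened" event name, and the "completed" status string are assumptions, while the warning names and the other status string are taken from the summary quoted above.

```python
from collections import Counter
from typing import Dict, Iterable, List


def accounting_report(event_names: Iterable[str]) -> Dict[str, object]:
    """Compare closes against new opens plus hydrated positions."""
    counts = Counter(event_names)
    opened = counts["position:opened"]      # assumed event name
    hydrated = counts["position:hydrated"]
    closed = counts["position:closed"]
    # A hydrated exchange position is closeable even though this slot
    # never opened it, so it belongs in the base.
    accounted_opens = opened + hydrated
    warnings: List[str] = []
    if closed > accounted_opens:
        warnings += ["open_close_mismatch", "more_closes_than_opens"]
    return {
        "position_opened": opened,
        "position_hydrated": hydrated,
        "position_closed": closed,
        "accounted_position_opens": accounted_opens,
        "warnings": warnings,
        "status": "completed_with_accounting_gap" if warnings else "completed",
    }
```

With this base, a slot that hydrates one inherited position and then closes it reports no warnings, while a close with nothing to account for still raises the mismatch.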

Files involved:

  • scripts/run_fenix_live_slot.py
  • tests/test_run_fenix_live_slot.py

Release stance:

This is a summary/observability fix, not a trading-strategy improvement. It matters because v2.5 release reports should distinguish real exchange exposure from false accounting noise.

12. Release paper suite: risk isolation and no-trade interpretation

The final short release-candidate paper suite used scripts/run_fenix_live_suite.py with a dedicated plan:

  • plans/v25_release_massive_paper_20260510.json

The suite ran SOLUSDT paper slots across 1m, 3m, 5m, 15m, and 30m using the same v31 MTF/NanoFenix guard stack intended for controlled live canaries.

What this suite verified:

  • scripts/run_fenix_live_suite.py can orchestrate a multi-slot release check without leaving NanoFenix or Fenix screen sessions behind;
  • each timed paper slot gets an isolated FENIX_RISK_MANAGER_STORAGE_PATH, so previous live PnL no longer contaminates paper release summaries;
  • the MTF/NanoFenix guard stack remains conservative: the only BUY decision in the suite was hard-vetoed by NanoFenix for direction mismatch, high uncertainty, and low actionable edge;
  • all other decisions were HOLD because strict Technical+QABBA consensus did not approve an entry;
  • optional GARCH convergence warnings were identified as log noise and moved behind the supported arch show_warning=False fit option.
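
The risk-isolation point can be sketched as a helper that builds a separate environment for each slot. FENIX_RISK_MANAGER_STORAGE_PATH comes from the suite described above; the helper name, the directory layout, and the assumption that the variable points to a file are hypothetical.

```python
from pathlib import Path
from typing import Dict


def slot_environment(
    slot_name: str,
    base_env: Dict[str, str],
    root: Path,
) -> Dict[str, str]:
    """Return a copy of base_env with a slot-private risk storage path."""
    # Assumed layout: one directory per slot under a per-suite root.
    storage = root / slot_name / "risk_manager_state.json"
    storage.parent.mkdir(parents=True, exist_ok=True)
    env = dict(base_env)  # copy, so slots never share or mutate one mapping
    env["FENIX_RISK_MANAGER_STORAGE_PATH"] = str(storage)
    return env
```

A suite runner could pass the returned mapping as the `env` argument of `subprocess.run` so that each paper slot starts from an empty risk history.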

Release stance:

This suite is an execution and observability validation, not proof of trading profitability. A later long MTF/NanoFenix paper suite (20260517_224826_v25_long_paper_mtf_SOLUSDT_20260517_184816) expanded coverage to 1h and produced many SELL decisions, but it exposed a paper balance fallback defect: Fenix sizing saw a zero balance and skipped simulated entries as non-positive notional.

The follow-up long suite after the paper balance fallback fix (20260522_194204_v25_long_paper_mtf_SOLUSDT_20260522_154156) completed all six SOLUSDT paper slots and did not repeat the zero-balance/non-positive-notional defect. It still produced 85 HOLD decisions and no simulated trades. A review found a runner/plan mismatch: although the suite was named MTF, it did not explicitly pass FENIX_STRICT_MTF_BIAS_TIMEFRAME, so the guard logged _mtf_bias: {} and blocked entries as mtf=HOLD(0.00). The May 24 fix makes those MTF fields first-class plan/slot options and blocks the optional QABBA+MTF entry path when Technical is only a timeout fallback.

The release still needs a fresh long paper/testnet soak with the explicit MTF fields active if the goal is to observe fills under changing market regimes.

Analysis and Monitoring Scripts

These scripts are valuable after a run, but they should not be treated as launchers.

Recommended analysis folder:

  • scripts/analysis/analyze_recent_runs.py
  • scripts/analysis/compare_runs.py
  • scripts/analysis/fee_impact_report.py
  • scripts/analysis/check_real_fees.py
  • scripts/analysis/nanofenix_assessment.py
  • scripts/analysis/analyze_run.py
  • scripts/analysis/compare_results.py

Frequently used analysis scripts at the top level of scripts/:

  • scripts/analyze_agent_decisions.py
  • scripts/analyze_all_runs.py
  • scripts/analyze_all_trades_detailed.py
  • scripts/analyze_benchmark_results.py
  • scripts/analyze_directional_accuracy.py
  • scripts/analyze_llm_comparison.py
  • scripts/analyze_veto_breakdown.py
  • scripts/massive_benchmark_analysis.py

Monitoring helpers:

  • scripts/fenix_trade_monitor.py
  • scripts/fenix_multi_monitor.py
  • scripts/fenix_multi_monitor_v2.py
  • scripts/_monitor_position.py
  • scripts/monitor_nocturno.sh

Release stance:

Keep these as local developer tools. Public docs can point to scripts/analysis/ as the preferred place for post-run analysis, but should not imply every one-off analysis script is supported.

Fix and Repair Scripts

The scripts/fixes/ folder contains emergency and local repair helpers.

Examples:

  • scripts/fixes/close_orphan.py
  • scripts/fixes/close_orphan_positions.py
  • scripts/fixes/diagnose_binance.py
  • scripts/fixes/patch_nanofenix_barrier.py
  • scripts/fixes/verify_imports.py

Release stance:

These should not be promoted as public entry points. Before publishing v2.5, either archive them, document them as internal repair tools, or exclude them from the public package.

Benchmark and Research Scripts

Benchmark scripts are useful for reproducibility, but they often encode old model teams and old assumptions.

Examples:

  • scripts/run_benchmark_suite.py
  • scripts/run_decision_benchmark_sequential.py
  • scripts/run_risk_benchmark_parallel.py
  • scripts/run_massive_model_benchmark_r18.py
  • scripts/run_hybrid_hourly_models.py
  • scripts/run_team_dualkey_test.sh
  • scripts/run_team_production_test.sh
  • scripts/launch_dualkey_benchmarks.sh
  • scripts/launch_benchmark_r18_overnight.sh

Release stance:

Document these as research support. They are not the recommended path for normal users.

NanoFenix Script Surface

Current preferred entry point:

  • run_nanofenixv3.py

Supporting scripts:

  • scripts/nanofenix_pretrain.py
  • scripts/start_nanofenix_long_run.sh
  • scripts/launch_nanofenix_v35_parallel.sh

Historical wrappers:

  • scripts/launch_nanofenix_r10.sh
  • scripts/launch_nanofenix_r12.sh
  • scripts/launch_nanofenix_r13.sh
  • scripts/launch_nanofenix_r20.sh
  • scripts/run_nanofenix_companion_signal.sh

Release stance:

For v2.5, prefer run_nanofenixv3.py --companion --adaptive-fusion. Older NanoFenix wrappers are historical unless they are intentionally updated.

What Should Be Removed From Public Guidance

Do not recommend these as fresh public commands:

  • hardcoded local-path launchers;
  • account-specific live canary launchers;
  • one-off repair scripts;
  • old paper/hybrid launchers that look live but do not execute real orders;
  • scripts with old model teams that are known to parse poorly or run too slowly;
  • any command that assumes a specific API key index, balance, symbol, or private run tag.

This does not mean every old script must be deleted. Historical scripts are useful for reproducibility. The release docs just need to label them honestly.

Suggested Documentation Structure for v2.5

Use these documents together:

| Document | Purpose |
| --- | --- |
| docs/releases/v2.5.md | Short official release overview. |
| docs/releases/v2.5-development-history.md | Long-form story from v2.0 to v2.5. |
| docs/releases/v2.5-new-systems.md | Component guide for NanoFenix, MiniFenix, Fenix Experimental, runners, and model teams. |
| docs/releases/v2.5-script-evolution.md | Script history, current release surface, and launcher status. |
| scripts/README.md | Practical scripts index. |
| RELEASE_CHECKLIST.md | Final local checklist before publishing. |

Final Release Recommendation

For the v2.5 official release, present the project like this:

FenixAI v2.5 is a safer and more observable research release. The main supported execution path is the core Fenix engine plus the live slot runner, with NanoFenix v3 available as a companion signal. Recent v25 and v31 live canaries were used to harden MTF consensus, NanoFenix policy, lite node timeouts, live position hydration, and exchange-fill reconciliation. Historical launchers remain in the repository for reproducibility, but users should start from the documented v2.5 entry points.

Avoid presenting the current script collection as if every file is equally current or production-ready.