feat: safety hardening + exception hierarchy + replay harness with CI gate #9
Merged
ArielB1980 merged 1 commit into main on Feb 14, 2026
… gate

Capital safety (P0):
- DRY_RUN enforced at KrakenClient transport boundary
- Global order rate limiter in ExecutionGateway (60/min, 10/10s)
- max_loss_per_trade_usd risk check in risk manager
- Trading activity heartbeat/deadman switch via runtime/heartbeat.txt

Exception hierarchy (Tier 1):
- Replace bare except/pass in kill switch, live trading, safety integration
- OperationalError → retry/backoff, InvariantError → halt, unknown → crash
- Narrow circuit breaker classification (whitelist networky errors only)
- Route stop self-heal + ShockGuard through ExecutionGateway

Replay backtest harness:
- Event-driven harness replaying real LiveTrading._tick() against simulated Kraken exchange (ReplayKrakenClient) with deterministic SimClock
- Faithful exchange modeling: stop entered_book lifecycle, maker/taker via mid-crossing, reduceOnly caps at flat, position reversal as two fills
- Order rejection realism (min size, reduceOnly conflict, insufficient margin)
- Layer 1 visibility quirk toggle (entered_book hidden from open orders)
- Per-symbol funding rate curves with vol-spike variability
- Deterministic seeded jitter on fills, delays, slippage (--seed N)
- Per-API-call latency model (50-200ms seeded)
- FaultInjector for scripted outages, rate limits, data errors
- 6 episodes: normal, high-vol, drought, outage, restart/split-brain, bug
- Safety-first pass/fail criteria per episode
- CI gate (.github/workflows/replay-gate.yml) runs on PRs touching execution/risk/safety/client/live paths, matrix across 3 jitter seeds
- make replay, make replay-episode, make replay-sweep targets

Docs & cleanup:
- Archive obsolete docs and scripts
- Update FORAI.md with lessons learned
- 500/500 unit tests passing (49 replay harness tests)

Co-authored-by: Cursor <cursoragent@cursor.com>
Summary
Comprehensive safety hardening of the live trading stack, structured exception handling across all critical paths, and a production-faithful event-driven replay backtest harness with CI integration.
Capital Safety (P0)
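As a rough illustration of two of the checks in this section, the following sketch shows a sliding-window order limiter using the 60/min and 10/10s limits from the commit message, together with the stop distance x size comparison behind max_loss_per_trade_usd. The names OrderRateLimiter and exceeds_max_loss, and their signatures, are assumptions made for illustration; they are not the code in ExecutionGateway or the risk manager.

```python
from collections import deque


class OrderRateLimiter:
    """Sliding-window limiter: an order is allowed only if every window has room."""

    def __init__(self, limits=((60, 60.0), (10, 10.0))):
        # Each limit is (max_orders, window_seconds): 60 per minute, 10 per 10 s.
        self.limits = tuple(limits)
        self.longest = max(window for _, window in self.limits)
        self.sent = deque()  # timestamps of accepted orders, oldest first

    def allow(self, now: float) -> bool:
        # Forget timestamps that have aged out of the longest window.
        while self.sent and now - self.sent[0] >= self.longest:
            self.sent.popleft()
        for max_orders, window in self.limits:
            recent = sum(1 for t in self.sent if now - t < window)
            if recent >= max_orders:
                return False  # rejected orders are not recorded
        self.sent.append(now)
        return True


def exceeds_max_loss(entry_price, stop_price, size, max_loss_per_trade_usd):
    """True when stop distance x size is above the configured dollar cap."""
    return abs(entry_price - stop_price) * size > max_loss_per_trade_usd
```

Passing the timestamp in explicitly, rather than reading the wall clock inside the limiter, is a design choice of this sketch: it keeps the limiter testable under a simulated clock.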
- DRY_RUN enforcement: KrakenClient.place_futures_order() refuses real orders when dry_run=True, with no silent simulation
- Global order rate limiter in ExecutionGateway (60/min, 10/10s): catches runaway loops and recursion bugs
- max_loss_per_trade_usd: risk manager rejects trades where stop distance x size exceeds the configurable max dollar loss
- Heartbeat/deadman switch: runtime/heartbeat.txt is updated each tick, so an external watchdog detects hung loops, DNS hangs, and event loop starvation
Exception Hierarchy (Tier 1)
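A minimal sketch of how the tiers in this section could be dispatched around one unit of work. The three exception names come from this PR; run_step, its parameters, and the halt callback are hypothetical, and the real handlers in the kill switch and live trading loop may be organised differently.

```python
import logging
import time

log = logging.getLogger("safety_sketch")


class OperationalError(Exception):
    """Transient failure such as a timeout or rate limit: retry with backoff."""


class InvariantError(Exception):
    """A state assumption was violated: halt instead of retrying."""


class DataError(Exception):
    """Bad or missing input data: log it and skip this unit of work."""


def run_step(step, halt, max_retries=3, base_delay=0.5, sleep=time.sleep):
    """Run step() once, applying the tiered policy to whatever it raises."""
    for attempt in range(max_retries + 1):
        try:
            return "ok", step()
        except OperationalError as exc:
            if attempt == max_retries:
                raise  # bounded: give up once the retry budget is spent
            delay = base_delay * 2 ** attempt  # exponential backoff
            log.warning("operational error (%s); retrying in %.1fs", exc, delay)
            sleep(delay)
        except DataError as exc:
            log.warning("data error (%s); skipping", exc)
            return "skipped", None
        except InvariantError as exc:
            log.critical("invariant violated (%s); halting", exc)
            halt(str(exc))  # fail fast: no retry
            return "halted", None
        # Any other exception type is deliberately not caught, so it crashes.
```

The point of the shape is that nothing is swallowed silently: each tier either returns an explicit status, re-raises, or is left uncaught.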
- Replace bare except Exception: pass in kill switch, live trading, and safety integration with structured handling:
  - OperationalError -> retry/backoff (bounded)
  - InvariantError -> halt (fail-fast)
  - DataError -> log + skip
- Route stop self-heal + ShockGuard through ExecutionGateway (single choke point, consistent WAL/breaker/logging)
Replay Backtest Harness
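The determinism claims in this section rest on two ideas: a clock that moves only when the harness advances it, and jitter drawn from a private seeded generator. A minimal sketch of both follows. SimClock is named in this PR, but the methods shown here (now, advance) are guesses, and LatencyModel and simulate_calls are hypothetical names. The 50-200ms range is the per-API-call latency quoted in the commit message.

```python
import random


class SimClock:
    """Simulated clock: time changes only through advance()."""

    def __init__(self, start=0.0):
        self._now = start

    def now(self):
        return self._now

    def advance(self, seconds):
        if seconds < 0:
            raise ValueError("simulated time cannot move backwards")
        self._now += seconds


class LatencyModel:
    """Per-call latency drawn from a seeded RNG, uniform in [low, high] seconds."""

    def __init__(self, seed, low=0.050, high=0.200):
        # A private Random instance keeps the stream independent of any
        # other code that uses the global random module.
        self._rng = random.Random(seed)
        self.low = low
        self.high = high

    def sample(self):
        return self._rng.uniform(self.low, self.high)


def simulate_calls(seed, n_calls):
    """Return the simulated completion time of n_calls sequential API calls."""
    clock = SimClock()
    latency = LatencyModel(seed)
    stamps = []
    for _ in range(n_calls):
        clock.advance(latency.sample())
        stamps.append(clock.now())
    return stamps
```

Under these assumptions, two runs with the same seed produce identical timestamps, which is what lets a failing seed be replayed exactly.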
- ReplayKrakenClient: drop-in simulated Kraken exchange with:
  - Stop entered_book lifecycle with vol/depth-dependent delay + seeded jitter
  - reduceOnly caps at flat; non-reduce can flip with two logical fills
  - Layer 1 visibility quirk toggle (hide_entered_book_from_open_orders)
  - Deterministic seeded jitter on fills, delays, slippage (--seed N)
- FaultInjector: scripted API timeouts, rate limits, data errors, AttributeError at specific timestamps
- Make targets: make replay, make replay-sweep (seeds 1-5)
CI Gate
- .github/workflows/replay-gate.yml triggers on PRs touching execution/, risk/, safety/, kraken_client.py, live/, circuit_breaker.py, replay_harness/
Docs and Cleanup
- Archive obsolete docs and scripts into docs/archive/ and scripts/archive/
- Update FORAI.md with lessons learned
- pre_flight_check, sync_positions, recover_sl_order_ids, check_tp_coverage, monitor_trade_execution
Test plan
- make smoke with .env.local on local machine (requires credentials)
- make replay passes all 6 episodes
- make deploy -> verify systemd restart + heartbeat file appears
Made with Cursor