
Commit 74ce7f8

BiomeOS Developer and cursoragent committed
S223: deep debt — smart refactor, sleep elimination, test speed
Smart-refactored btsp/json_line.rs (905 LOC → 478 + relay.rs 255 + negotiate.rs 189): zero production files >800 LOC. Relay handshake and Phase 3 negotiate now live in cohesive modules.

Eliminated 4 test sleeps with deterministic alternatives:
- ember/held_resource: removed 5ms thread::sleep, tightened assertion
- ember/lend_reclaim: removed 1ms sleep (nanos resolution sufficient)
- distributed/registry: backdated health_timestamp instead of 150ms sleep
- intrusion: evolved to tokio::time::Instant + start_paused=true

Fixed auto_config test suite blocking 10s on mDNS daemon and DNS resolution: discover_local_services, discover_wellknown_services, and discover_mdns_services now skip network I/O in cfg!(test) mode. 237 tests: 10.02s → 0.24s (41x speedup).

Stale /tmp/ecoPrimals/discovery/ comments evolved to capability-based. Confirmed zero production unwrap(). +12 tests (resource_validator analysis: identify_gaps + generate_warnings). TCP probe timeout reduced 100ms → 5ms for tests.

Co-authored-by: Cursor <cursoragent@cursor.com>
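The `distributed/registry` item in the message above replaces a 150ms sleep with a backdated `health_timestamp`. The commit does not show that code, so what follows is only a minimal sketch of the idea under stated assumptions: `NodeHealth`, `backdated`, and `is_stale` are hypothetical names, and the real field may not be a `std::time::Instant`.

```rust
use std::time::{Duration, Instant};

/// Hypothetical registry entry (assumption: the real type is not shown in this commit).
#[derive(Debug)]
pub struct NodeHealth {
    /// Moment of the last health report.
    pub health_timestamp: Instant,
}

impl NodeHealth {
    /// Entry whose last report is "now".
    pub fn fresh() -> Self {
        Self { health_timestamp: Instant::now() }
    }

    /// Test helper (hypothetical): entry whose last report lies `age` in the past.
    /// This stands in for sleeping `age` before checking staleness.
    pub fn backdated(age: Duration) -> Self {
        let health_timestamp = Instant::now()
            .checked_sub(age)
            .expect("monotonic clock should reach back at least `age`");
        Self { health_timestamp }
    }

    /// True once the last report is older than `max_age`.
    pub fn is_stale(&self, max_age: Duration) -> bool {
        self.health_timestamp.elapsed() > max_age
    }
}

#[cfg(test)]
mod backdating_sketch_tests {
    use super::*;

    #[test]
    fn stale_entry_is_detected_without_sleeping() {
        let max_age = Duration::from_millis(100);
        // No thread::sleep: the timestamp is simply constructed in the past.
        assert!(NodeHealth::backdated(Duration::from_millis(150)).is_stale(max_age));
        assert!(!NodeHealth::fresh().is_stale(Duration::from_secs(60)));
    }
}
```

Constructing the timestamp in the past keeps the assertion independent of scheduler jitter, which is presumably why the sleep could be dropped without loosening the check.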
1 parent 9cc1fbe commit 74ce7f8

17 files changed

Lines changed: 847 additions & 675 deletions

CONTEXT.md

Lines changed: 1 addition & 1 deletion
@@ -30,7 +30,7 @@ ToadStool is the **Layer 0** hardware substrate that other primals and springs d
 - Family: `compute-{family_id}.sock` / `compute-{family_id}-tarpc.sock`
 - **Peer primals**: Resolved at runtime via capability IDs and Unix-socket discovery (e.g. `capability.discover`, `resolve_capability_socket_fallback`) — not hardcoded URLs or legacy per-primal env manifests
 - **Discovery hierarchy** (primalSpring cross-cutting): Songbird `ipc.resolve` → biomeOS `capability.discover` → UDS filesystem convention → socket registry → TCP probing. toadStool implements tiers 1–4; TCP probing (tier 5) not used for local IPC
-- **Tests**: 22,821 (7,884+ lib-only, 0 failures, unlimited parallelism)
+- **Tests**: 22,833 (7,896+ lib-only, 0 failures, unlimited parallelism)
 - **Unsafe**: 46 blocks (all in hw-safe/GPU/VFIO/display/plugin containment, all SAFETY-documented; reconciled S221); workspace `unsafe_code = "deny"`, 41 crates `forbid` + 5 hw crates with narrow `#[allow(unsafe_code, reason)]`; all lint attrs have `reason =` (S211+S213)
 - **async-trait**: DEPRECATED — fully removed and banned in `deny.toml` (S203r); transitive only via axum/config/wiggle
 - **deny.toml**: `ring` + `async-trait` + `zstd-sys` bans active (ecoBin v3 compliant)

DEBT.md

Lines changed: 9 additions & 1 deletion
@@ -1,13 +1,21 @@
 # Active Technical Debt Register
 
-**Date**: May 2026 — S222
+**Date**: May 2026 — S223
 **Philosophy**: Math is universal, precision is silicon. Workarounds are
 short-term solutions that increase debt. We aim to solve deep debt over
 iterations, evolving toward vendor-agnostic, capability-based solutions—
 with production stubs surfacing typed configuration errors and capability
 guidance, and auth policy driven by explicit environment configuration
 where applicable.
 
+**S223 (Deep Debt — json_line.rs Smart Refactor + Coverage + Stale Comments)**:
+Smart-refactored `btsp/json_line.rs` (905 LOC → 478 + 255 relay.rs + 189 negotiate.rs);
+zero files >800 LOC in production code. Production `unwrap()` audit: confirmed zero
+production unwraps (all in tests/docs). Fixed stale hardcoded `/tmp/ecoPrimals/discovery/`
+comments in `integration/protocols/src/client/discovery.rs` → capability-based language.
+Added 12 inline tests to `server/src/resource_validator/analysis.rs` (`identify_gaps` +
+`generate_warnings`). All quality gates green.
+
 **S222 (primalSpring ironGate Audit — Sandbox + Env Expansion + Specs + Discovery)**:
 Responded to primalSpring ironGate provenance pipeline audit (3 gaps for toadStool):
 - **Gap 2 — Sandbox `working_dir` override**: `apply_security_context` now accepts

DOCUMENTATION.md

Lines changed: 3 additions & 3 deletions
@@ -1,6 +1,6 @@
 # ToadStool Documentation Hub
 
-**Last Updated**: May 2026 — S222
+**Last Updated**: May 2026 — S223
 
 ---
 
@@ -30,11 +30,11 @@ These root documents were **fully resolved** and **fossilized** in wateringHole
 
 ---
 
-## Current State (S222 — May 2026)
+## Current State (S223 — May 2026)
 
 **Post-budding, dependency-sovereign, IPC-first, fully concurrent, capability-based.** barraCuda is a separate primal at `ecoPrimals/barraCuda/`. ToadStool is the hardware infrastructure layer — GPU/NPU/CPU discovery, capability probing, workload orchestration, and shader dispatch.
 
-- **22,821 tests** (7,884+ lib-only), 0 failures, 0 clippy warnings, 0 fmt diffs. Full workspace concurrent test suite.
+- **22,833 tests** (7,896+ lib-only), 0 failures, 0 clippy warnings, 0 fmt diffs. Full workspace concurrent test suite.
 - **65 JSON-RPC methods** (incl. `compute.execute` direct route S203f). Wire Standard L3 (partial): `cost_estimates`, `operation_dependencies`. IPC compliant (`health.liveness` → `{"status":"alive"}`, `health.readiness` → ready+version, `health.check` full envelope, `capabilities.list`, `identity.get`).
 - **Dual-socket IPC**: `compute.sock` (JSON-RPC primary, biomeOS routes here) + `compute-tarpc.sock` (tarpc hot-path). Override: `TOADSTOOL_SOCKET` / `TOADSTOOL_TARPC_SOCKET`. Family: `compute-{fid}.sock` / `compute-{fid}-tarpc.sock`.
 - **Pipeline dispatch**: `compute.dispatch.pipeline.submit` + `.status` for ordered multi-stage workloads (DAG, topological sort, result forwarding). Resolves neuralSpring PG-05.

NEXT_STEPS.md

Lines changed: 3 additions & 3 deletions
@@ -1,8 +1,8 @@
 # ToadStool -- Next Steps
 
 **Updated**: May 2026 — S221 (Deep Debt — Capability Names + Dep Hygiene + Coverage)
-**Status**: Production-grade | Rust edition **2024** (MSRV 1.85) | **AGPL-3.0-or-later** | **All quality gates green** | tests verified (22,821 workspace, 0 failures) | **~65 JSON-RPC methods** | Wire Standard L3 (partial) | Zero C FFI deps (ecoBin v3.0) | **Zero production panics/expects** | IPC-first | workspace `unsafe_code = "deny"`, **41 crates `forbid`** | **46 unsafe blocks** (all in hw containment, all SAFETY-documented, reconciled S221) | **0 production TODOs** | **rustix 1.x workspace-wide** | **capability-based primal references (no hardcoded names, S221)** | **`async-trait` DEPRECATED** (banned in `deny.toml`) | **`deny.toml` ring + async-trait + zstd-sys bans active** | **BTSP Phase 3 encrypted channel (ChaCha20-Poly1305, S215)** | **BTSP Phase 3 transport switch verified (S218)** | **BTSP handshake bounded + connection-reused** (PG-46 resolved, S214) | **All lint attrs with reason (S211+S213)** | **Auth issuer capability-based (S209)** | **Self-registration with Songbird (S207)** | **Encrypted compute dispatch (Phase 55)** | **Display Phase 2 (petalTongue IPC)** | **BTSP JSON-line relay (Phase 45c)** | **Orchestrator lock-panic-free (S213)**
-**Latest**: S222 primalSpring ironGate audit response: sandbox working_dir override via trusted_directories, env var expansion in workload TOMLs, display.composite + transport.bridge specs, discovery hierarchy documented, convert_security_context wired. +26 tests. 22,821 tests, 0 failures.
+**Status**: Production-grade | Rust edition **2024** (MSRV 1.85) | **AGPL-3.0-or-later** | **All quality gates green** | tests verified (22,833 workspace, 0 failures) | **~65 JSON-RPC methods** | Wire Standard L3 (partial) | Zero C FFI deps (ecoBin v3.0) | **Zero production panics/expects** | IPC-first | workspace `unsafe_code = "deny"`, **41 crates `forbid`** | **46 unsafe blocks** (all in hw containment, all SAFETY-documented, reconciled S221) | **0 production TODOs** | **rustix 1.x workspace-wide** | **capability-based primal references (no hardcoded names, S221)** | **`async-trait` DEPRECATED** (banned in `deny.toml`) | **`deny.toml` ring + async-trait + zstd-sys bans active** | **BTSP Phase 3 encrypted channel (ChaCha20-Poly1305, S215)** | **BTSP Phase 3 transport switch verified (S218)** | **BTSP handshake bounded + connection-reused** (PG-46 resolved, S214) | **All lint attrs with reason (S211+S213)** | **Auth issuer capability-based (S209)** | **Self-registration with Songbird (S207)** | **Encrypted compute dispatch (Phase 55)** | **Display Phase 2 (petalTongue IPC)** | **BTSP JSON-line relay (Phase 45c)** | **Orchestrator lock-panic-free (S213)**
+**Latest**: S223 Deep debt: smart-refactored btsp/json_line.rs (905→478+255+189 LOC), eliminated 4 test sleeps (deterministic time), auto_config test speedup 10s→0.24s (mDNS/DNS skip in tests), confirmed zero production unwrap(), +12 tests. 22,833 tests, 0 failures.
 
 ---
 
@@ -33,7 +33,7 @@ syntax fixed in 3 server files. Test suite fully unblocked.
 
 ### P1: Test Coverage → 90% (D-COV) — Ongoing (S164)
 
-**~83.6% line coverage** (lib-only, 185K lines instrumented). **22,821 tests** (0 failures). Target 90%.
+**~83.6% line coverage** (lib-only, 185K lines instrumented). **22,833 tests** (0 failures). Target 90%.
 
 **S164** expanded coverage with **+94 new tests** across 7 low-coverage files:
 - `resource_validator.rs` 20% → ~75% (+19 tests)

README.md

Lines changed: 4 additions & 3 deletions
@@ -42,7 +42,7 @@ Nest = Tower + Storage <- storage
 | `cargo fmt --all -- --check` | 0 diffs |
 | `cargo clippy --workspace --all-targets -- -D warnings` | 0 warnings |
 | `cargo doc --workspace --no-deps` (RUSTDOCFLAGS="-D warnings") | 0 warnings |
-| `cargo test --workspace` | **22,821 tests, 0 failures** (7,884+ lib-only), **~222** ignored (hardware-gated); full workspace ~7m |
+| `cargo test --workspace` | **22,833 tests, 0 failures** (7,896+ lib-only), **~222** ignored (hardware-gated); full workspace ~7m |
 | Doctests | All passing (common, core, server, cli, testing, display) |
 | Standalone clone test | Pull to any machine, `cargo test` works (GPU-optional, CPU fallback, device-lost resilient) |
 | `unsafe` blocks | **46 actual** (all in hw-safe/GPU/VFIO/display/plugin containment crates); all SAFETY-documented (S204, reconciled S221); workspace `unsafe_code = "deny"`, **41 crates `forbid`** + 5 hw crates with narrow `#[allow(unsafe_code, reason)]`; **all lint attrs have `reason =`** (S211+S213) |
@@ -275,6 +275,7 @@ toadStool/
 - **NUCLEUS crypto integration** -- compute payloads encrypted via Tower `crypto.encrypt`/`crypto.decrypt` (S205); **self-registration with Songbird** via `DISCOVERY_SOCKET` + `ipc.register` at startup (S207)
 
 ### Recently Completed
+- **S223 (May 6, 2026)**: **Deep Debt — Smart Refactor + Sleep Elimination + Test Speed** — Smart-refactored `btsp/json_line.rs` (905→478 LOC + `relay.rs` 255 + `negotiate.rs` 189; zero files >800 LOC). Eliminated 4 test sleeps: `thread::sleep` → deterministic time manipulation (backdated timestamps, `tokio::time::pause`). Evolved `intrusion.rs` to `tokio::time::Instant` for testable time. Fixed `auto_config` ecosystem discovery blocking 10s on mDNS daemon startup/shutdown and DNS resolution in tests — `cfg!(test)` early-returns skip network I/O (237 tests: 10.02s→0.24s, 41x speedup). Confirmed zero production `unwrap()`. Stale `/tmp/ecoPrimals/discovery/` comments evolved to capability-based language. +12 tests (resource_validator/analysis). 22,833 tests.
 - **S222 (May 5, 2026)**: **primalSpring ironGate Audit — Sandbox + Env Expansion + Specs + Discovery** — Gap 2: sandbox `working_dir` override via `trusted_directories` (was hardcoded `/tmp`). Gap 8: `${VAR}`/`$VAR` env expansion in workload TOMLs. `display.composite` + `transport.bridge` specs added (PG-42). Discovery hierarchy documented. `convert_security_context` now parses isolation level. +26 tests. 22,821 tests.
 - **S221 (May 4, 2026)**: **Deep Debt — Capability Names + Dep Hygiene + Stub Evolution + Coverage** — All `barraCuda/coralReef` primal names in errors/deprecations evolved to capability-based language. reqwest 0.12→0.13 (ring→aws-lc-rs). `verify_migration_success` dead code fixed. Unsafe count reconciled 49→46. +20 tests (daemon routes 11, zero_config config 9). 22,580 tests.
 - **S220 (May 4, 2026)**: **primalSpring Phase 58 Audit Response — Coverage Push + Stub Evolution** — Responded to 4-item audit. Phase 3 encryption confirmed RESOLVED (S215+S218). +22 tests (wasm/metrics, stub_runtime_engine, os_layer/manager, container/engine). `OSLayerManager` fallback evolved from synthetic success to `not_supported`. 22,560 tests.
@@ -354,7 +355,7 @@ See [CHANGELOG.md](CHANGELOG.md) for full session-by-session detail.
 
 | ID | Description | Status |
 |----|-------------|--------|
-| D-COV | Test coverage → 90% | Active — 22,821 tests; ~83.6% lib-only line (185K instrumented); remaining gap: hardware-dependent paths (VFIO, DRM, V4L2, akida) |
+| D-COV | Test coverage → 90% | Active — 22,833 tests; ~83.6% lib-only line (185K instrumented); remaining gap: hardware-dependent paths (VFIO, DRM, V4L2, akida) |
 | D-BTSP-PHASE3 | BTSP encrypted post-handshake channel | **RESOLVED** (S215+S218) — ChaCha20-Poly1305 encrypted channel implemented, transport switch verified |
 
 ### Resolved (S94b)
@@ -397,7 +398,7 @@ See [DEBT.md](DEBT.md) for full register and evolution paths.
 
 ---
 
-**Last Updated**: May 2026 — S222 (primalSpring ironGate Audit — Sandbox + Env Expansion + Specs). **22,821** workspace tests, 0 failures (7,884+ lib-only). ~83.6% lib-only line coverage (target 90%). **65 JSON-RPC methods** (direct) + semantic registry with **Wire Standard L3** (cost_estimates + operation_dependencies). AGPL-3.0-or-later. Zero C FFI deps (ecoBin v3.0). **46 unsafe blocks** (all in hw-safe/GPU/VFIO/display/plugin containment crates); all SAFETY-documented; workspace `unsafe_code = "deny"`, **41 crates `forbid`** + 5 hw crates with narrow `#[allow(unsafe_code, reason)]`. **Zero production panics/expects**. IPC-first JSON-RPC (dual-socket: `compute.sock` + `compute-tarpc.sock`). Rust 1.85+ (edition 2024, MSRV). **BTSP Phase 3 encrypted channel** (ChaCha20-Poly1305, S215; transport switch verified S218). **PG-46 resolved** (socket reuse, S214). **Self-registration** with Songbird via `DISCOVERY_SOCKET` (S207). **Capability-based discovery compliant** per `CAPABILITY_BASED_DISCOVERY_STANDARD.md` v1.2.
+**Last Updated**: May 2026 — S223 (Deep Debt — Smart Refactor + Sleep Elimination + Test Speed). **22,833** workspace tests, 0 failures (7,884+ lib-only). ~83.6% lib-only line coverage (target 90%). **65 JSON-RPC methods** (direct) + semantic registry with **Wire Standard L3** (cost_estimates + operation_dependencies). AGPL-3.0-or-later. Zero C FFI deps (ecoBin v3.0). **46 unsafe blocks** (all in hw-safe/GPU/VFIO/display/plugin containment crates); all SAFETY-documented; workspace `unsafe_code = "deny"`, **41 crates `forbid`** + 5 hw crates with narrow `#[allow(unsafe_code, reason)]`. **Zero production panics/expects**. IPC-first JSON-RPC (dual-socket: `compute.sock` + `compute-tarpc.sock`). Rust 1.85+ (edition 2024, MSRV). **BTSP Phase 3 encrypted channel** (ChaCha20-Poly1305, S215; transport switch verified S218). **PG-46 resolved** (socket reuse, S214). **Self-registration** with Songbird via `DISCOVERY_SOCKET` (S207). **Capability-based discovery compliant** per `CAPABILITY_BASED_DISCOVERY_STANDARD.md` v1.2.
 
 ---
 
crates/auto_config/src/ecosystem/discoverer.rs

Lines changed: 15 additions & 0 deletions
@@ -202,6 +202,11 @@ impl EcosystemDiscoverer {
     pub(crate) async fn discover_local_services(
         &self,
     ) -> ToadStoolResult<HashMap<String, ServiceInfo>> {
+        if cfg!(test) {
+            debug!("Skipping local network probing in test mode");
+            return Ok(HashMap::new());
+        }
+
         let mut services = HashMap::new();
 
         for (capability_key, pattern) in &self.service_patterns {
@@ -259,6 +264,11 @@ impl EcosystemDiscoverer {
     pub(crate) async fn discover_wellknown_services(
         &self,
     ) -> ToadStoolResult<HashMap<String, ServiceInfo>> {
+        if cfg!(test) {
+            debug!("Skipping well-known host probing in test mode");
+            return Ok(HashMap::new());
+        }
+
         let mut services = HashMap::new();
 
         for host in wellknown_hosts::ALL {
@@ -283,6 +293,11 @@ impl EcosystemDiscoverer {
     /// Delegates to `toadstool_common::primal_integration::try_discover_via_mdns`
     /// which probes `_toadstool._tcp.local.` and filters by capability TXT records.
     pub(crate) fn discover_mdns_services() -> HashMap<String, ServiceInfo> {
+        if cfg!(test) {
+            debug!("Skipping mDNS probing in test mode");
+            return HashMap::new();
+        }
+
         use toadstool_common::primal_integration::try_discover_via_mdns;
 
         let mut services = HashMap::new();
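
The three hunks above share one pattern: return an empty result before any network I/O when `cfg!(test)` is true. A minimal standalone sketch of that guard follows. `discover_with`, `discover`, and this `ServiceInfo` are hypothetical stand-ins rather than the crate's own items, and the `skip_network` flag is an assumption added here so that both branches can be exercised.

```rust
use std::collections::HashMap;

/// Stand-in for the crate's `ServiceInfo` (assumption: the real fields are not shown in the diff).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ServiceInfo {
    pub endpoint: String,
}

/// Runs `probe` unless `skip_network` is set, mirroring the early return in the diff.
pub fn discover_with<F>(skip_network: bool, probe: F) -> HashMap<String, ServiceInfo>
where
    F: FnOnce() -> HashMap<String, ServiceInfo>,
{
    if skip_network {
        // No sockets, no mDNS daemon, no DNS lookups.
        return HashMap::new();
    }
    probe()
}

/// Same guard as the commit: `cfg!(test)` is a compile-time boolean that is
/// true only while this crate itself is compiled with the test harness.
pub fn discover<F>(probe: F) -> HashMap<String, ServiceInfo>
where
    F: FnOnce() -> HashMap<String, ServiceInfo>,
{
    discover_with(cfg!(test), probe)
}

#[cfg(test)]
mod guard_sketch_tests {
    use super::*;

    #[test]
    fn probe_is_never_called_in_test_mode() {
        let called = std::cell::Cell::new(false);
        let found = discover(|| {
            called.set(true);
            HashMap::new()
        });
        assert!(found.is_empty());
        assert!(!called.get());
    }
}
```

One point worth keeping in mind with this design: `cfg!(test)` is not set for a library when it is built for integration tests under `tests/` or as a dependency of another crate, so only the crate's own unit tests take the early return.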

crates/auto_config/src/ecosystem_network.rs

Lines changed: 2 additions & 2 deletions
@@ -19,9 +19,9 @@ use toadstool_config::defaults::network::{
 use toadstool_config::env_config::EnvironmentConfig;
 
 /// TCP connect timeout for `probe_service`. Production uses 2s; tests use a
-/// short value so discovery probes fail fast under `cargo test`.
+/// minimal value so discovery probes fail fast under `cargo test`.
 const TCP_PROBE_CONNECT_TIMEOUT: Duration = if cfg!(test) {
-    Duration::from_millis(100)
+    Duration::from_millis(5)
 } else {
     Duration::from_secs(2)
 };
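
For orientation, here is a hedged sketch of how a constant like this might be consumed by a TCP probe. The body of `probe_service` is not part of this diff, so the version below is an assumption built on the standard library's `TcpStream::connect_timeout`, and `probe_timeout` is a hypothetical helper introduced only so that both values can be checked.

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Hypothetical helper: the two timeouts from the diff, selected by a flag.
pub const fn probe_timeout(is_test: bool) -> Duration {
    if is_test {
        Duration::from_millis(5)
    } else {
        Duration::from_secs(2)
    }
}

/// Same selection as the diff, resolved at compile time.
pub const TCP_PROBE_CONNECT_TIMEOUT: Duration = probe_timeout(cfg!(test));

/// Assumed shape of a probe: true if a TCP connection opens within the timeout.
pub fn probe_service(addr: SocketAddr) -> bool {
    TcpStream::connect_timeout(&addr, TCP_PROBE_CONNECT_TIMEOUT).is_ok()
}
```

With a 5ms budget, a probe against an unreachable address in a unit test costs at most a few milliseconds instead of the 100ms it took before this change.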
