Commit ab2d8af

westgate authored and committed
S198: Deep debt evolution — gap matrix, capability discovery, module refactoring
- TS-01 RESOLVED: coralReef discovery evolved to unified capability.discover
- BTSP Phase 2: handshake enforced on all UDS accept paths (tarpc + daemon)
- Health triad: liveness/readiness/check shaped per Wire Standard L1/L2
- OpenCL deprecated: ocl removed, stubs retained for API compat
- 6 large files smart-refactored into module directories (all <500 lines)
- Hardcoded endpoints evolved to SocketPathEnv capability-based discovery
- BearDog token refresh: placeholder evolved to real async RPC
- Embedded stubs: thiserror platform-specific error types
- Unsafe hardened: repr(C) VFIO, fd validation, debug assertions
- #[allow] -> #[expect] where lint fires; ~80 justified #[allow] remain
- Musl-static binary: 11MB x86_64 PIE stripped
- Docs updated to S198, handoff created, debris cleaned
- 21,864 tests, 0 failures, 0 clippy warnings, 0 fmt diffs

Made-with: Cursor
1 parent 0e3549d commit ab2d8af

270 files changed

Lines changed: 6517 additions & 6607 deletions

.github/workflows/ci.yml

Lines changed: 1 addition & 1 deletion
@@ -141,7 +141,7 @@ jobs:
           sudo apt-get install -y libssl-dev pkg-config bc
       - name: Run coverage (tier 1)
         run: |
-          cargo llvm-cov --workspace --all-features --ignore-filename-regex "tests/" -- --skip performance
+          cargo llvm-cov --workspace --all-features --ignore-filename-regex "tests/" -- --skip performance_bench --skip slow
       - name: Generate coverage report
         run: |
           cargo llvm-cov report --json --output-path coverage.json

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -63,6 +63,9 @@ test_results.txt
 models/huggingface/
 models/akida/

+# Stray build artifacts
+rust_out
+
 # Fossilized directories (moved to ecoPrimals/fossil/)
 results/
 ui/

CONTEXT.md

Lines changed: 1 addition & 0 deletions
@@ -24,6 +24,7 @@ ToadStool is the **Layer 0** hardware substrate that other primals and springs d
 - **Binary**: `toadstool` (UniBin standard — single binary, subcommands)
 - **ecoBin grade**: v3.0 (zero application-level C dependencies)
 - **Socket**: `$XDG_RUNTIME_DIR/biomeos/toadstool.sock` (+ `compute.sock` capability symlink)
+- **Peer primals**: Resolved at runtime via capability IDs and Unix-socket discovery (e.g. `capability.discover`, `resolve_capability_socket_fallback`) — not hardcoded URLs or legacy per-primal env manifests.

 ## Not Included

Cargo.toml

Lines changed: 8 additions & 2 deletions
@@ -60,7 +60,7 @@ members = [
     "examples",
 ]

-exclude = []
+exclude = ["fuzz"]

 [workspace.package]
 version = "0.1.0"
@@ -154,9 +154,15 @@ bollard = "0.18" # Docker API
 aes-gcm = "0.10.3"
 ed25519-dalek = { version = "2.0", features = ["rand_core"] }
 hex = "0.4"
-# EVOLVED: hmac removed — unused; no crate imports it
 sha2 = "0.10"

+# BTSP handshake (BTSP_PROTOCOL_STANDARD.md): pure Rust, no C FFI
+x25519-dalek = { version = "2.0", features = ["static_secrets"] }
+hkdf = "0.12"
+hmac = "0.12"
+chacha20poly1305 = "0.10"
+rand = "0.8"
+
 # Networking
 # EVOLVED: ipnet removed (Mar 12, 2026) — unused; no crate references it
 # mdns removed: standardized on mdns-sd; edge stub doesn't need it

DEBT.md

Lines changed: 226 additions & 8 deletions
@@ -1,31 +1,245 @@
 # Active Technical Debt Register

-**Date**: April 8, 2026 — S194
+**Date**: April 9, 2026 — S198
 **Philosophy**: Math is universal, precision is silicon. Workarounds are
 short-term solutions that increase debt. We aim to solve deep debt over
 iterations, evolving toward vendor-agnostic, capability-based solutions.

 ## Active Debt

-### D-TARPC-PHASE3
+### D-TARPC-PHASE3-BINARY
 **Crate**: `integration/protocols` | **Feature**: `tarpc-transport`
-`TRpcTransport::send_message` returns `TRpcTransportNotAvailable` — tarpc binary
-transport is not yet wired. Evolve once `toadstool_common::tarpc_service` API
-stabilizes and coordination mesh negotiation supports protocol switching.
+`TRpcTransport::send_message` now wired (S197): resolves target primal's Unix
+socket via capability discovery and sends via JSON-RPC 2.0 (the universal
+protocol per wateringHole). **Phase 3b**: negotiate tarpc binary framing for
+eligible Rust-to-Rust peers when coordination mesh supports protocol switching.
 Files: `transport.rs`.

 ### D-EMBEDDED-PROGRAMMER
 **Crate**: `runtime/specialty` | **Feature**: `embedded-placeholder-impls`
 ISP/ICSP/parallel programmer trait impls return `EmbeddedProgrammerPlaceholder` errors.
-Evolve when hardware-specific transport layers (USB, parallel, serial) land.
+**Partially resolved S198**: `thiserror`-based platform-specific error types replace generic placeholders; behavior remains placeholder until hardware-specific transport layers (USB, parallel, serial) land.
 Files: `embedded/programmer_impls.rs`, `embedded/programmers.rs`.

 ### D-EMBEDDED-EMULATOR
 **Crate**: `runtime/specialty` | **Feature**: `embedded-placeholder-impls`
 MOS 6502 / Z80 emulator trait impls return `EmbeddedEmulatorPlaceholder` errors.
-Evolve when cycle-accurate CPU cores are implemented.
+**Partially resolved S198**: Typed errors as above; cycle-accurate CPU cores and real emulation still deferred.
 Files: `embedded/emulator_impls.rs`, `embedded/emulators.rs`.

+### D-COVERAGE-GAP
+**Scope**: Workspace | **Metric**: `cargo llvm-cov`
+Line coverage at 83.6% (target: 90%). Gap concentrated in integration crates,
+runtime backends (GPU/container/WASM), and distributed coordination paths.
+`cudarc` blocker resolved S197 (removed). `--all-features` should now work on
+machines without CUDA toolkit.
+Files: `scripts/run-coverage.sh`, `.github/workflows/ci.yml`.
+
+### D-FUZZ-TARGETS-UNSAFE
+**Crate**: `runtime/gpu` | Scope: `unified_memory/buffer/access.rs`
+Remaining unsafe surface: two `from_raw_parts(_mut)` call sites. Safety docs
+now accurately distinguish runtime-checked vs backend-contract-assumed invariants.
+Dead u8 alignment check removed S197. Evolution: fuzz the access paths, consider
+`NonNull::slice_from_raw_parts` for fat-pointer representation.
+Files: `buffer/access.rs`.
+
+### D-FUZZ-TARGETS — PARTIAL S197
+**Scope**: Workspace | **Dir**: `fuzz/`
+Initial `cargo-fuzz` / `libfuzzer` infrastructure added (S197). Three targets:
+`fuzz_jsonrpc_parse` (JSON-RPC 2.0 deser), `fuzz_config_toml` (config deser +
+validation), `fuzz_btsp_framing` (BTSP length-prefixed frame decode).
+Remaining: integrate into CI, add seed corpus, run extended campaigns, add
+proptest bridge for property-based input generation.
+Files: `fuzz/Cargo.toml`, `fuzz/fuzz_targets/*.rs`.
+
+
+## S198 Resolved Debt (TS-01 visualization, BTSP Phase 2 UDS, health triad)
+
+### D-GAP-TS01-VISUALIZATION — RESOLVED S198
+**TS-01** closed for `crates/server/src/visualization_client.rs`: coralReef / shader-compiler discovery is unified on `capability.discover` (same direction as D-GAP-TS01-CAPABILITY-DISCOVERY S172-3 for `coral_reef_client.rs`). Removed `CORALREEF_SOCKET` / `CORALREEF_URL`, `coralreef-core.json` manifest, and coralreef directory scan.
+
+### D-BTSP-PHASE2 — RESOLVED S198
+BTSP handshake is enforced on **all** Unix-domain-socket accept paths: `tarpc_server.rs`, `daemon/jsonrpc_server.rs` (pure JSON-RPC main server already required BTSP).
+
+### D-GAP-HEALTH-TRIAD — RESOLVED S198
+Canonical shapes: `health.liveness` → `{"status":"alive"}`; `health.readiness` → `{"status":"ready","version":...}`; `health.check` → full health envelope (details per handler).
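The two fixed shapes in the health triad entry can be illustrated with a minimal sketch. The function name `health_body` is hypothetical, and treating `version` as a plain JSON string is an assumption, since the entry elides its value. `health.check` is left out on purpose because its envelope is handler-specific.

```rust
/// Build the response body for the two fixed-shape health methods.
/// Returns `None` for any other method, including `health.check`,
/// whose envelope the debt entry leaves to each handler.
fn health_body(method: &str, version: &str) -> Option<String> {
    match method {
        "health.liveness" => Some(r#"{"status":"alive"}"#.to_string()),
        // Assumption: `version` is a plain string that needs no JSON escaping.
        "health.readiness" => Some(format!(
            r#"{{"status":"ready","version":"{}"}}"#,
            version
        )),
        _ => None,
    }
}
```

A real server would presumably build these values with a JSON library instead of string formatting; the sketch only pins down the shapes.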
+
+
+## S197 Resolved Debt (Transport Wiring, Fuzz Infra, Clippy, Dep Audit)
+
+### D-TARPC-PHASE3 — RESOLVED S197
+Wired `TRpcTransport::send_message` in `integration/protocols/transport.rs`.
+Transport resolves the target primal's Unix socket via capability-based
+discovery (`get_socket_path_for_capability`) and forwards via JSON-RPC 2.0
+(`UnixJsonRpcClient`), the universal protocol per wateringHole.
+Remaining: negotiate tarpc binary framing for Rust-to-Rust peers (Phase 3b).
+
+### D-FUZZ-TARGETS-INIT — RESOLVED S197
+Created `fuzz/` directory with `cargo-fuzz` / `libfuzzer` infrastructure:
+- `fuzz_jsonrpc_parse`: JSON-RPC 2.0 request deserialization
+- `fuzz_config_toml`: ToadStool TOML config deser + `validate()`
+- `fuzz_btsp_framing`: BTSP length-prefixed frame decode via async `Cursor`
+Workspace `Cargo.toml` excludes `fuzz/` from workspace members.
+
+### D-CLIPPY-SERVER-BLANKET — RESOLVED S197
+Server crate `#![allow(clippy::...)]` reduced from 34 suppressed lints to 5
+(doc_markdown, doc_comment_double_space_linebreaks, similar_names,
+struct_field_names, module_name_repetitions). All 51 warnings fixed:
+`#[must_use]` on builders, `let...else`, `unused_async`, `unused_self`,
+`items_after_statements`, `manual_let_else`, `unreadable_literal`,
+`unnecessary_debug_formatting`, `unnecessary_wraps`, `ref_option`.
+Similar cleanup applied to `auto_config` and `protocols` crates.
+
+### D-DEP-AUDIT — RESOLVED S197
+Audited all workspace dependencies for non–pure-Rust surface. No `ring`,
+`openssl-sys`, `libc` (direct), or `sqlite` in production deps. All crypto
+uses pure Rust crates (ed25519-dalek, x25519-dalek, chacha20poly1305, sha2).
+Remaining native surface is hardware-facing (wgpu, drm, serialport, rustix)
+and properly feature-gated. `ocl`/`cl-sys` legacy OpenCL stack noted for
+future monitoring.
+
+## S197 Earlier Resolved (Unsafe Tightening, VFIO Dedup, Legacy Names, Deps)
+
+### D-UNSAFE-UNIFIEDMEMORY — RESOLVED S197
+Tightened `from_raw_parts(_mut)` safety in `unified_memory/buffer/access.rs`:
+- Removed dead `align_of::<u8>()` check (u8 alignment is always 1)
+- Rewrote safety documentation with tabular invariant-enforcement mapping
+that accurately distinguishes runtime-checked vs backend-contract-assumed
+- Documented what `validate_cpu_ptr` proves (allocation handle alive, NULL-page
+guard, non-zero size) vs what it assumes (backend maps `size` bytes, pointer
+remains mapped until `free_unified`)
+Files: `runtime/gpu/src/unified_memory/buffer/access.rs`.
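The split between runtime-checked and contract-assumed invariants described in this entry can be sketched as follows. This is not the code in `access.rs`: `passes_runtime_checks`, `bytes_view`, and the 4096-byte `NULL_PAGE_LIMIT` are hypothetical names and values chosen for illustration, and the allocation-handle liveness check is omitted.

```rust
/// Assumed size of the unmapped low-address guard region (hypothetical value).
const NULL_PAGE_LIMIT: usize = 4096;

/// The runtime-checkable part of the contract: non-zero size, a pointer
/// outside the NULL page, and a length that fits in `isize`.
fn passes_runtime_checks(ptr: *const u8, size: usize) -> bool {
    size != 0 && (ptr as usize) >= NULL_PAGE_LIMIT && size <= isize::MAX as usize
}

/// # Safety
/// The caller must guarantee what cannot be checked here: `ptr` is mapped
/// for `size` readable bytes and stays mapped and unmodified for `'a`.
unsafe fn bytes_view<'a>(ptr: *const u8, size: usize) -> Option<&'a [u8]> {
    if !passes_runtime_checks(ptr, size) {
        return None;
    }
    // SAFETY: the runtime checks passed; the mapping guarantees are the
    // caller's contract. `u8` has alignment 1, so no alignment check is needed.
    Some(unsafe { std::slice::from_raw_parts(ptr, size) })
}
```

Keeping the checked half in a safe function makes it easy to test and fuzz separately from the `unsafe` call site.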
+
+### D-VFIO-DEDUP — RESOLVED S197
+Merged duplicate VFIO ioctl scaffolding:
+- Exported `VFIO_TYPE` and `VFIO_BASE` from `hw-safe::vfio_dma` as public constants
+- `nvpmu/src/vfio.rs` now imports from `hw-safe` instead of redeclaring
+- Removed deprecated `dma_map_fd`/`dma_unmap_fd` (zero callers outside definition)
+Files: `hw-safe/src/vfio_dma.rs`, `nvpmu/src/vfio.rs`.
+
+### D-BTSP-EXPECT-EVOLVE — RESOLVED S197
+Evolved BTSP handshake `expect("HMAC accepts any key size")` on both client and
+server to fallible `map_err(|e| HandshakeError::KeyDerivation(...))`. Replaced
+`unwrap_or_default()` in `send_handshake_error` with compile-time fallback bytes.
+Files: `common/src/btsp/client.rs`, `common/src/btsp/server.rs`.
+
+### D-LEGACY-NAME-CENTRALIZE — RESOLVED S197
+Evolved inline string literals `"beardog"`, `"songbird"`, `"nestgate"`, `"squirrel"`
+in production code to use centralized `interned_strings::primals::LEGACY_*_LABEL`
+constants. Key files: `cli/src/templates/capability_helpers.rs` (6 map insertions),
+`integration/primals/src/primal_types.rs` (4 match arms).
+
+### D-CUDARC-DEPRECATE — RESOLVED S197
+Removed `cudarc` C-FFI dependency from `runtime/gpu`. CUDA dispatch is now
+handled by **barraCuda** (PTX, cuDNN, single-GPU) and **coralReef** (multi-GPU)
+via capability-based IPC — ToadStool discovers CUDA capability at runtime
+through the ecosystem mesh rather than embedding the NVIDIA toolchain.
+- Removed `cudarc = "0.19"` dependency and 5 source files (~33 KiB)
+- Replaced `cuda_impl/` with a deprecated stub pointing to barraCuda/coralReef
+- `cuda` feature flag retained as empty no-op for backward compat
+- `ai-ml` and `all-backends` features no longer pull CUDA
+- Removed `cudarc` from `deny.toml` skip-tree
+- Removed `FrameworkHandle::Cuda` variant from `types.rs`
+- Resolves **D-CUDARC-FEATURE-GATE**: `--all-features` builds no longer
+require nvcc/CUDA toolkit
+Files: `runtime/gpu/Cargo.toml`, `runtime/gpu/src/backends/cuda_impl/*`,
+`runtime/gpu/src/types.rs`, `deny.toml`.
+
+### D-WORKSPACE-DEPS-RECONCILE — RESOLVED S197
+Reconciled workspace dependency declarations for `regex`, `config`, and `hex`:
+- `regex`: `auto_config`, `runtime/specialty`, `security/policies` → `{ workspace = true }`
+- `config`: `distributed`, `management/analytics` → `{ workspace = true }`
+- `hex`: `cli`, `runtime/wasm`, `security/policies`, `neuromorphic/akida-models` → `{ workspace = true }`
+Eliminated version drift risk (inline versions `"1.0"` vs workspace `"1.10"`).
+
+## S196 Resolved Debt (Socket Naming, BTSP Handshake, Framing, Family ID)
+
+### D-SOCKET-DOMAIN-NAMING — RESOLVED S196
+Evolved socket naming from primal-based (`toadstool.sock`) to domain-based
+(`compute.sock` / `compute-{fid}.sock`) per `PRIMAL_SELF_KNOWLEDGE_STANDARD.md`
+v1.1. Legacy symlink `toadstool.sock → compute.sock` maintained during migration.
+Removed separate `.jsonrpc.sock` socket — unified to single domain-named socket.
+Updated `identity.get` to report `socket_name: "compute.sock"`.
+Files: `server/src/unibin/format.rs`, `server/src/unibin/mod.rs`,
+`common/src/constants/primal_identity.rs`, `common/src/primal_sockets/paths.rs`,
+`common/src/platform_paths/paths.rs`, `client/src/client/core.rs`,
+`core/toadstool/src/ipc/platform/{mod,unix}.rs`, showcase examples.
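The domain-based naming rule above (`compute.sock` / `compute-{fid}.sock`) can be sketched as a pair of helpers. The function names are hypothetical, the `biomeos` subdirectory is taken from the socket path quoted in `CONTEXT.md`, and treating an empty family ID as absent is an assumption.

```rust
use std::path::{Path, PathBuf};

/// Domain-based socket file name: `{domain}.sock`, or `{domain}-{fid}.sock`
/// when a family ID is present.
fn domain_socket_name(domain: &str, family_id: Option<&str>) -> String {
    match family_id {
        // Assumption: an empty family ID is treated the same as none.
        Some(fid) if !fid.is_empty() => format!("{}-{}.sock", domain, fid),
        _ => format!("{}.sock", domain),
    }
}

/// Full socket path under `<runtime_dir>/biomeos/`.
fn domain_socket_path(runtime_dir: &Path, domain: &str, family_id: Option<&str>) -> PathBuf {
    runtime_dir
        .join("biomeos")
        .join(domain_socket_name(domain, family_id))
}
```

For example, `domain_socket_name("compute", Some("lab1"))` yields `compute-lab1.sock`, where `lab1` is a made-up family ID.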
+
+### D-BTSP-HANDSHAKE — RESOLVED S196
+Implemented full BTSP handshake per `BTSP_PROTOCOL_STANDARD.md` v1.0.0:
+- Client: `BtspClient::handshake()` (ephemeral X25519, HKDF-SHA256,
+HMAC-SHA256 challenge-response)
+- Server: `BtspServer::accept_handshake()` (verification + session keys)
+- Pure Rust crypto stack: `x25519-dalek`, `hkdf`, `hmac`, `sha2`,
+`chacha20poly1305` — no C FFI (ecoBin compliant)
+- Feature-gated behind `btsp` (default on)
+- Full round-trip test: handshake succeeds with matching seed, rejects
+with wrong seed, directional key agreement verified
+Files: `common/src/btsp/{mod,types,client,server,framing}.rs`,
+`common/Cargo.toml`, workspace `Cargo.toml`.
+
+### D-BTSP-FRAMING — RESOLVED S196
+Implemented length-prefixed BTSP frame codec (4-byte BE u32, max 16 MiB) per
+`BTSP_PROTOCOL_STANDARD.md`. Server connection handler detects BTSP mode
+(`is_btsp_required()`) and switches between NDJSON (dev) and length-prefixed
+framing (production). `BtspFrameReader`/`BtspFrameWriter` types for typed access.
+Files: `common/src/btsp/framing.rs`, `server/src/pure_jsonrpc/connection/unix.rs`.
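The framing rule in this entry (4-byte big-endian `u32` length, then the payload, capped at 16 MiB) can be sketched with blocking `std::io` traits. This is not the `BtspFrameReader`/`BtspFrameWriter` implementation, which the entry describes as async; `write_frame`, `read_frame`, and `MAX_FRAME_LEN` are hypothetical names.

```rust
use std::io::{self, Read, Write};

/// Maximum payload size stated in the debt entry: 16 MiB.
const MAX_FRAME_LEN: usize = 16 * 1024 * 1024;

/// Write one frame: 4-byte big-endian length, then the payload.
fn write_frame<W: Write>(w: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > MAX_FRAME_LEN {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "frame exceeds 16 MiB"));
    }
    // The cast cannot truncate: the length is bounded by MAX_FRAME_LEN.
    let len = payload.len() as u32;
    w.write_all(&len.to_be_bytes())?;
    w.write_all(payload)
}

/// Read one frame, rejecting an oversized length before allocating.
fn read_frame<R: Read>(r: &mut R) -> io::Result<Vec<u8>> {
    let mut header = [0u8; 4];
    r.read_exact(&mut header)?;
    let len = u32::from_be_bytes(header) as usize;
    if len > MAX_FRAME_LEN {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame exceeds 16 MiB"));
    }
    let mut payload = vec![0u8; len];
    r.read_exact(&mut payload)?;
    Ok(payload)
}
```

Checking the declared length before allocating the buffer is what keeps a hostile 4-byte header from forcing a multi-gigabyte allocation, which is also the kind of input a `fuzz_btsp_framing`-style target would exercise.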
+
+### D-FAMILY-ID-PRECEDENCE — RESOLVED S196
+Fixed `SocketPathEnv::from_env()` to read `TOADSTOOL_FAMILY_ID` first per
+`PRIMAL_SELF_KNOWLEDGE_STANDARD.md` v1.1 (`{PRIMAL}_FAMILY_ID → FAMILY_ID`).
+Previous order was `BIOMEOS_FAMILY_ID → TOADSTOOL_FAMILY`; now
+`TOADSTOOL_FAMILY_ID → TOADSTOOL_FAMILY → BIOMEOS_FAMILY_ID`.
+Files: `common/src/primal_sockets/env.rs`.
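The new lookup order can be sketched as a first-match search over the three variable names listed above. This is not `SocketPathEnv::from_env()` itself: `resolve_family_id` is a hypothetical helper, the environment is passed in as a closure so the order can be tested without touching process state, and skipping empty values is an assumption.

```rust
/// Resolve the family ID using the post-S196 precedence:
/// `TOADSTOOL_FAMILY_ID`, then `TOADSTOOL_FAMILY`, then `BIOMEOS_FAMILY_ID`.
fn resolve_family_id<F>(lookup: F) -> Option<String>
where
    F: Fn(&str) -> Option<String>,
{
    const ORDER: [&str; 3] = ["TOADSTOOL_FAMILY_ID", "TOADSTOOL_FAMILY", "BIOMEOS_FAMILY_ID"];
    ORDER
        .iter()
        .copied()
        // Assumption: an empty value counts as unset and falls through.
        .find_map(|key| lookup(key).filter(|value| !value.is_empty()))
}

/// Convenience wrapper that reads the real process environment.
fn resolve_family_id_from_env() -> Option<String> {
    resolve_family_id(|key| std::env::var(key).ok())
}
```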
+
+## S195 Resolved Debt (Standards Compliance, NDJSON, Logging, Benchmarks)
+
+### D-SCYBORG-LICENSE — RESOLVED S195
+Added `LICENSE-ORC` (Open Research Commons) and `LICENSE-CC-BY-SA` (Creative Commons
+Attribution-ShareAlike 4.0) to complete the scyBorg triple license per
+`wateringHole/LICENSING_AND_COPYLEFT.md`. AGPL-3.0 was already present as `LICENSE`.
+
+### D-TARPC-SERVER-GATE — RESOLVED S195
+Feature-gated `tarpc` on server crate per wateringHole `PRIMAL_IPC_PROTOCOL.md`
+(tarpc OPTIONAL, JSON-RPC REQUIRED). `tarpc`, `tokio-util`, `tokio-serde` now
+optional behind `tarpc` feature (default=on for backward compat). Modules
+`tarpc_server`, `rpc_types`, `coordinator_executor` gated with `#[cfg(feature = "tarpc")]`.
+Files: `server/Cargo.toml`, `server/src/lib.rs`.
+
+### D-NDJSON-SESSION — RESOLVED S195
+Evolved `pure_jsonrpc` server Unix+TCP handlers from single-shot to persistent
+NDJSON sessions per `PRIMAL_IPC_PROTOCOL.md`. Connections now loop: read line →
+process → write response + newline → read next line until EOF. HTTP path remains
+single request-response. Backward compatible with existing single-request clients.
+Files: `connection/unix.rs`, `connection/tcp.rs`.
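The session loop described here (read line, process, write response plus newline, repeat until EOF) can be sketched with blocking `std::io` traits. The real handlers are presumably async; `serve_ndjson` and its `handle` callback are hypothetical, and skipping blank lines is an assumption.

```rust
use std::io::{self, BufRead, Write};

/// Serve one persistent NDJSON session: one request per line, one response
/// per line, until the peer closes the stream. Returns the request count.
fn serve_ndjson<R, W, F>(reader: R, mut writer: W, mut handle: F) -> io::Result<usize>
where
    R: BufRead,
    W: Write,
    F: FnMut(&str) -> String,
{
    let mut served = 0;
    for line in reader.lines() {
        let line = line?;
        // Assumption: blank lines are ignored instead of answered.
        if line.trim().is_empty() {
            continue;
        }
        let response = handle(&line);
        writer.write_all(response.as_bytes())?;
        writer.write_all(b"\n")?;
        writer.flush()?;
        served += 1;
    }
    Ok(served)
}
```

A client that sends a single request and then closes the connection sees exactly one response line, which is why the persistent loop stays compatible with single-request clients.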
+
+### D-LOGGING-INCONSISTENCY — RESOLVED S195
+Evolved `security/sandbox/{macos,windows}.rs` from `log::` to `tracing::` macros
+(structured fields). Aligns all crates on `tracing` as the single logging facade.
+Files: `sandbox/src/macos.rs`, `sandbox/src/windows.rs`.
+
+### D-WATCHDOG-UNWRAP — RESOLVED S195
+Replaced `.lock().unwrap()` in `nvpmu/watchdog.rs` production thread with graceful
+mutex-poisoning handling. Watchdog exits loop on poison; `stop()` skips notify on
+poison. No more panic risk from std::sync::Mutex poisoning.
+Files: `nvpmu/src/watchdog.rs`.
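The poison handling described above can be sketched as two helpers that match on the `lock()` result instead of unwrapping it. The `Shared` struct and both function names are hypothetical and only illustrate the pattern; they are not the watchdog's actual types.

```rust
use std::sync::{Condvar, Mutex};

/// Hypothetical shared state between a watchdog thread and its owner.
struct Shared {
    stop: Mutex<bool>,
    wake: Condvar,
}

/// Loop condition for the watchdog thread: keep running only while the
/// lock is healthy and no stop was requested. A poisoned lock ends the loop.
fn should_continue(shared: &Shared) -> bool {
    match shared.stop.lock() {
        Ok(stop) => !*stop,
        Err(_poisoned) => false,
    }
}

/// Request a stop. On a poisoned lock the notify is skipped and `false`
/// is returned, mirroring the "skip notify on poison" behavior.
fn request_stop(shared: &Shared) -> bool {
    match shared.stop.lock() {
        Ok(mut stop) => {
            *stop = true;
            shared.wake.notify_all();
            true
        }
        Err(_poisoned) => false,
    }
}
```

Neither path can panic, so a crash in another thread that held the lock degrades the watchdog to a clean exit.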
+
+### D-CI-SKIP-MISMATCH — RESOLVED S195
+Fixed CI coverage step `--skip performance` (overly broad) to match local script:
+`--skip performance_bench --skip slow`. Prevents skipping `testing::performance`
+module coverage (~360 lines).
+Files: `.github/workflows/ci.yml`.
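Why `--skip performance` was too broad: without `--exact`, the Rust test harness skips any test whose full path contains the pattern as a substring. The sketch below mirrors that rule; `is_skipped` is a hypothetical helper, not part of the harness.

```rust
/// Mirror of the harness's default skip rule: a test is skipped when its
/// full path contains any skip pattern as a substring.
fn is_skipped(test_path: &str, skip_patterns: &[&str]) -> bool {
    skip_patterns.iter().any(|pattern| test_path.contains(*pattern))
}
```

With the old pattern, a path such as `testing::performance::tests::records_latency` (a made-up test name) matches `performance` and is skipped. With `performance_bench` and `slow` it no longer matches, so the `testing::performance` module is counted in coverage again.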
+
+### D-TOOLCHAIN-FILE — RESOLVED S195
+Added `rust-toolchain.toml` pinning stable channel with `rustfmt`, `clippy`,
+`llvm-tools-preview` components and musl cross-compile targets.
+
+### D-BENCHMARKS — RESOLVED S195
+Created Criterion benchmark infrastructure. `server/benches/jsonrpc_throughput.rs`
+benchmarks `process_request` for `capabilities.list`, `health.liveness`, `identity.get`.
+`process_request` promoted to pub for bench access.
+Files: `server/benches/jsonrpc_throughput.rs`, `server/Cargo.toml`.
+
 ## S192-194 Resolved Debt (BTSP Guard, Headless GPU, Capability Field Evolution)

 ### D-CAPABILITY-FIELDS — RESOLVED S194
@@ -338,7 +552,11 @@ single-line delegations.
 Gap TS-01 closed. `coral_reef_client.rs` discovery evolved from identity-based
 (coralreef env vars, manifests, socket name scan) to capability-based: primary
 tier is now `$XDG_RUNTIME_DIR/biomeos/shader.sock` (per CAPABILITY_BASED_DISCOVERY_STANDARD v1.1).
-Identity-based tiers demoted to legacy fallback. `songbird_integration/discovery/client.rs`
+`crates/server/src/visualization_client.rs` aligned with the same standard: Tier 1 is
+`capability.discover("shader")`, then Tier 0 / 2 / 3 (`TOADSTOOL_SHADER_COMPILER_ADDR`,
+`shader.sock`, `ecoPrimals/shader_compile.sock`, capability-named `shader*.sock` scan only).
+Legacy `CORALREEF_*` env, `coralreef-core.json`, and `coralreef*.sock` identity fallbacks removed.
+`songbird_integration/discovery/client.rs`
 `clone()` evolved to prefer `coordination.sock` over `songbird.sock`. `resolve_primal()`
 and `connect_to_primal()` deprecated in favour of `find_by_capability()`.
