Commit 4a89a22

westgate authored and committed
S210: PG-46 — bounded BTSP handshake timeouts
Root cause: JSON-line BTSP relay made two sequential BearDog RPCs (btsp.session.create + btsp.session.verify) with no timeout. Slow BearDog responses caused the handshake to consume most of a client's time budget, leaving the first JSON-RPC read to return empty/timeout.

Fix:
- relay_json_line_handshake() now wrapped in total budget (5s default, override via BTSP_HANDSHAKE_TIMEOUT_SECS)
- Each BearDog RPC call uses call_with_timeout (3s default, override via BTSP_RPC_TIMEOUT_SECS)
- New BtspJsonLineError::Timeout variant for clear error reporting
- New UnixJsonRpcClient::call_with_timeout method
- Constants in timeouts.rs, env-var keys in socket_env.rs

7,842 lib-only tests, 0 failures, clippy clean (-D warnings), fmt clean.

Made-with: Cursor
1 parent 41830a0 commit 4a89a22

6 files changed

Lines changed: 112 additions & 6 deletions

DEBT.md

Lines changed: 9 additions & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -1,13 +1,21 @@
11
# Active Technical Debt Register
22

3-
**Date**: April 2026 — S209
3+
**Date**: April 2026 — S210
44
**Philosophy**: Math is universal, precision is silicon. Workarounds are
55
short-term solutions that increase debt. We aim to solve deep debt over
66
iterations, evolving toward vendor-agnostic, capability-based solutions—
77
with production stubs surfacing typed configuration errors and capability
88
guidance, and auth policy driven by explicit environment configuration
99
where applicable.
1010

11+
**S210 (PG-46: BTSP Handshake Timeout)**:
12+
Added bounded timeouts to JSON-line BTSP handshake relay. Total handshake
13+
budget: 5s default (`BTSP_HANDSHAKE_TIMEOUT_SECS`). Per-BearDog-RPC budget:
14+
3s default (`BTSP_RPC_TIMEOUT_SECS`). `UnixJsonRpcClient::call_with_timeout`
15+
added. `BtspJsonLineError::Timeout` variant for clear error reporting.
16+
Resolves PG-46 (short-timeout reads returning empty responses due to
17+
unbounded handshake latency). **7,842 lib-only** tests, 0 failures.
18+
1119
**S209 (Deep Debt — Lint Reason + Dep Unification + Auth Capability)**:
1220
Completed comprehensive lint evolution: all remaining crate-level `#![allow]`
1321
attrs evolved to include `reason =` (7 embedded/neuromorphic/native/testing

NEXT_STEPS.md

Lines changed: 3 additions & 3 deletions
@@ -1,8 +1,8 @@
11
# ToadStool -- Next Steps
22

3-
**Updated**: April 2026 — S209 (Deep Debt — Lint Reason + Dep Unification + Auth Capability)
4-
**Status**: Production-grade | Rust edition **2024** (MSRV 1.85) | **AGPL-3.0-or-later** | **All quality gates green** | **7,842 lib-only** tests verified (20,000+ workspace, 0 failures) | **~65 JSON-RPC methods** | Wire Standard L3 (partial) | Zero C FFI deps (ecoBin v3.0) | **Zero production panics/expects** | IPC-first | workspace `unsafe_code = "deny"`, **41 crates `forbid`** | **49 unsafe blocks** (all in hw containment, all SAFETY-documented) | **0 production TODOs** | **rustix 1.x workspace-wide** | **capability-based primal references (no hardcoded names)** | **`async-trait` DEPRECATED** (banned in `deny.toml`) | **`deny.toml` ring + async-trait + zstd-sys bans active** | **env centralized via config structs** | **real Linux sandbox (rustix)** | **real resource metrics (cgroup v2/proc)** | **plugin loading (libloading)** | **binary tarpc framing (MessagePack)** | **BTSP JSON-line relay (Phase 45c)** | **Display Phase 2 (petalTongue IPC)** | **Encrypted compute dispatch (Phase 55)** | **All lint attrs with reason (S209)** | **test-mocks off by default (S206)** | **Self-registration with Songbird (S207)** | **Auth issuer capability-based (S209)**
5-
**Latest**: S209 (Deep Debt — Lint Reason + Dep Unification + Auth Capability): All remaining crate-level `#![allow]` attrs given `reason =` (7 crates). ~30 production `#[expect(deprecated)]`/`#[allow(deprecated)]` upgraded with `reason =`. 23 Cargo.toml files unified to workspace deps (`sha2`, `serde_json`, `tracing`, `thiserror`, `tracing-subscriber`, `tokio-test`). Auth backend: hardcoded `well_known::BEARDOG` issuer fallback → `capabilities::CRYPTO`. Stale python feature flags removed. **7,842 lib-only** tests, 0 failures, clippy clean, fmt clean.
3+
**Updated**: April 2026 — S210 (PG-46: BTSP Handshake Timeout)
4+
**Status**: Production-grade | Rust edition **2024** (MSRV 1.85) | **AGPL-3.0-or-later** | **All quality gates green** | **7,842 lib-only** tests verified (20,000+ workspace, 0 failures) | **~65 JSON-RPC methods** | Wire Standard L3 (partial) | Zero C FFI deps (ecoBin v3.0) | **Zero production panics/expects** | IPC-first | workspace `unsafe_code = "deny"`, **41 crates `forbid`** | **49 unsafe blocks** (all in hw containment, all SAFETY-documented) | **0 production TODOs** | **rustix 1.x workspace-wide** | **capability-based primal references (no hardcoded names)** | **`async-trait` DEPRECATED** (banned in `deny.toml`) | **`deny.toml` ring + async-trait + zstd-sys bans active** | **BTSP handshake bounded** (5s default, PG-46, S210) | **All lint attrs with reason (S209)** | **Auth issuer capability-based (S209)** | **Self-registration with Songbird (S207)** | **Encrypted compute dispatch (Phase 55)** | **Display Phase 2 (petalTongue IPC)** | **BTSP JSON-line relay (Phase 45c)**
5+
**Latest**: S210 (PG-46: BTSP Handshake Timeout). Added bounded timeouts to JSON-line BTSP handshake relay: 5s total budget (`BTSP_HANDSHAKE_TIMEOUT_SECS`), 3s per BearDog RPC (`BTSP_RPC_TIMEOUT_SECS`). `UnixJsonRpcClient::call_with_timeout` added. `BtspJsonLineError::Timeout` variant. Resolves short-timeout reads returning empty responses. **7,842 lib-only** tests, 0 failures, clippy clean, fmt clean.
66

77
---
88

crates/core/common/src/btsp/json_line.rs

Lines changed: 56 additions & 2 deletions
@@ -2,6 +2,7 @@
22
//! JSON newline BTSP handshake relay via BearDog JSON-RPC.
33
44
use std::path::PathBuf;
5+
use std::time::Duration;
56

67
use base64::Engine;
78
use serde::Deserialize;
@@ -10,6 +11,8 @@ use thiserror::Error;
1011
use tokio::io::{AsyncRead, AsyncReadExt, AsyncWrite, AsyncWriteExt};
1112

1213
use crate::ToadStoolError;
14+
use crate::constants::timeouts;
15+
use crate::interned_strings::socket_env;
1316
use crate::unix_jsonrpc::UnixJsonRpcClient;
1417

1518
use super::types::BtspCipher;
@@ -41,6 +44,10 @@ pub enum BtspJsonLineError {
4144
/// Protocol violation (version, missing field, etc.).
4245
#[error("BTSP JSON-line protocol: {0}")]
4346
Protocol(String),
47+
48+
/// Handshake or RPC call exceeded its timeout budget.
49+
#[error("BTSP JSON-line timeout: {0}")]
50+
Timeout(String),
4451
}
4552

4653
impl From<ToadStoolError> for BtspJsonLineError {
@@ -49,6 +56,20 @@ impl From<ToadStoolError> for BtspJsonLineError {
4956
}
5057
}
5158

59+
fn handshake_timeout() -> Duration {
60+
std::env::var(socket_env::BTSP_HANDSHAKE_TIMEOUT_SECS)
61+
.ok()
62+
.and_then(|v| v.parse::<u64>().ok())
63+
.map_or(timeouts::BTSP_HANDSHAKE_TIMEOUT, Duration::from_secs)
64+
}
65+
66+
fn rpc_timeout() -> Duration {
67+
std::env::var(socket_env::BTSP_RPC_TIMEOUT_SECS)
68+
.ok()
69+
.and_then(|v| v.parse::<u64>().ok())
70+
.map_or(timeouts::BTSP_RPC_TIMEOUT, Duration::from_secs)
71+
}
72+
5273
/// Check if a JSON line looks like a BTSP ClientHello.
5374
///
5475
/// The line must parse as JSON and carry `"protocol": "btsp"` (spacing-insensitive via serde).
@@ -199,12 +220,38 @@ async fn require_str_line<S: AsyncWrite + Unpin>(
199220
/// 5. Call BearDog `btsp.session.verify` with session_token, response, client_ephemeral_pub, preferred_cipher
200221
/// 6. Send HandshakeComplete JSON line
201222
///
223+
/// The entire handshake is bounded by `BTSP_HANDSHAKE_TIMEOUT` (default 5s,
224+
/// override via `BTSP_HANDSHAKE_TIMEOUT_SECS`). Each BearDog RPC call is
225+
/// individually bounded by `BTSP_RPC_TIMEOUT` (default 3s, override via
226+
/// `BTSP_RPC_TIMEOUT_SECS`).
227+
///
202228
/// On error at any step, sends an error JSON line and returns `Err`.
203229
pub async fn relay_json_line_handshake<S: AsyncRead + AsyncWrite + Unpin>(
204230
stream: &mut S,
205231
first_line: &str,
206232
family_seed: &str,
207233
security_socket: &str,
234+
) -> Result<BtspSessionInfo, BtspJsonLineError> {
235+
let budget = handshake_timeout();
236+
let Ok(result) = tokio::time::timeout(
237+
budget,
238+
relay_json_line_handshake_inner(stream, first_line, family_seed, security_socket),
239+
)
240+
.await
241+
else {
242+
let msg = format!("BTSP handshake exceeded {budget:?} budget");
243+
tracing::warn!(target: "btsp", "{msg}");
244+
let _ = send_error_line(stream, &msg).await;
245+
return Err(BtspJsonLineError::Timeout(msg));
246+
};
247+
result
248+
}
249+
250+
async fn relay_json_line_handshake_inner<S: AsyncRead + AsyncWrite + Unpin>(
251+
stream: &mut S,
252+
first_line: &str,
253+
family_seed: &str,
254+
security_socket: &str,
208255
) -> Result<BtspSessionInfo, BtspJsonLineError> {
209256
tracing::info!(target: "btsp", "JSON-line BTSP: parsing ClientHello");
210257

@@ -235,9 +282,13 @@ pub async fn relay_json_line_handshake<S: AsyncRead + AsyncWrite + Unpin>(
235282
tracing::info!(target: "btsp", "JSON-line BTSP: calling btsp.session.create");
236283

237284
let rpc = UnixJsonRpcClient::new(security_socket);
285+
let rpc_budget = rpc_timeout();
238286
let family_seed_b64 = base64::engine::general_purpose::STANDARD.encode(family_seed.as_bytes());
239287
let create_params = serde_json::json!({ "family_seed": family_seed_b64 });
240-
let create_result: Value = match rpc.call("btsp.session.create", create_params).await {
288+
let create_result: Value = match rpc
289+
.call_with_timeout("btsp.session.create", create_params, rpc_budget)
290+
.await
291+
{
241292
Ok(v) => v,
242293
Err(e) => {
243294
let msg = e.to_string();
@@ -297,7 +348,10 @@ pub async fn relay_json_line_handshake<S: AsyncRead + AsyncWrite + Unpin>(
297348
"client_ephemeral_pub": hello.client_ephemeral_pub,
298349
"preferred_cipher": cr.preferred_cipher,
299350
});
300-
let verify_result: Value = match rpc.call("btsp.session.verify", verify_params).await {
351+
let verify_result: Value = match rpc
352+
.call_with_timeout("btsp.session.verify", verify_params, rpc_budget)
353+
.await
354+
{
301355
Ok(v) => v,
302356
Err(e) => {
303357
let msg = e.to_string();
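
Reviewer note: because the per-RPC timeout is nested inside the total handshake budget, the second RPC effectively gets the smaller of its own 3s limit and whatever remains of the 5s total. The commit achieves this implicitly by nesting `tokio::time::timeout` calls. The sketch below only makes that arithmetic explicit with the standard library; `next_step_limit` is a hypothetical helper, not part of this change.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper (not in the commit): the time limit that effectively
/// applies to the next step, given a total budget that started at `started`
/// and a per-step cap. Returns `None` once the total budget is exhausted.
fn next_step_limit(
    started: Instant,
    now: Instant,
    total: Duration,
    per_step: Duration,
) -> Option<Duration> {
    // Time already consumed by earlier steps (zero if `now` precedes `started`).
    let elapsed = now.saturating_duration_since(started);
    // `checked_sub` yields `None` when more than `total` has elapsed.
    let remaining = total.checked_sub(elapsed)?;
    if remaining.is_zero() {
        return None;
    }
    // The tighter of the two limits wins.
    Some(remaining.min(per_step))
}

fn main() {
    // Defaults taken from the commit: 5s total, 3s per BearDog RPC.
    let total = Duration::from_secs(5);
    let per_rpc = Duration::from_secs(3);
    let started = Instant::now();

    // Pretend the first RPC and the stream I/O around it took 4 seconds.
    let later = started + Duration::from_secs(4);
    println!("first RPC limit:  {:?}", next_step_limit(started, started, total, per_rpc));
    println!("second RPC limit: {:?}", next_step_limit(started, later, total, per_rpc));
}
```

With the defaults, two RPCs that each run to their 3s limit would need 6s, so the 5s total budget is the binding constraint in the worst case.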

crates/core/common/src/constants/timeouts.rs

Lines changed: 17 additions & 0 deletions
@@ -127,6 +127,23 @@ pub const VERIFICATION_PHASE_TIMEOUT: Duration = Duration::from_secs(5);
127127
/// Total zero-config bootstrap target (60 seconds)
128128
pub const ZERO_CONFIG_TARGET: Duration = Duration::from_secs(60);
129129

130+
// ============================================================================
131+
// BTSP Handshake Timeouts (PG-46)
132+
// ============================================================================
133+
134+
/// Total BTSP handshake budget (5 seconds).
135+
///
136+
/// Covers the full 4-step JSON-line relay including both BearDog RPCs.
137+
/// Clients using <10s read timeouts previously hit empty responses because
138+
/// the handshake had no upper bound. Override via `BTSP_HANDSHAKE_TIMEOUT_SECS`.
139+
pub const BTSP_HANDSHAKE_TIMEOUT: Duration = Duration::from_secs(5);
140+
141+
/// Per-RPC timeout for BearDog calls during BTSP handshake (3 seconds).
142+
///
143+
/// Applied individually to `btsp.session.create` and `btsp.session.verify`.
144+
/// Override via `BTSP_RPC_TIMEOUT_SECS`.
145+
pub const BTSP_RPC_TIMEOUT: Duration = Duration::from_secs(3);
146+
130147
// ============================================================================
131148
// Authentication Timeouts
132149
// ============================================================================

crates/core/common/src/interned_strings/socket_env.rs

Lines changed: 4 additions & 0 deletions
@@ -127,6 +127,10 @@ pub const TOADSTOOL_TCP_BIND_ADDRESS: &str = "TOADSTOOL_TCP_BIND_ADDRESS";
127127
/// Idle timeout (seconds) for Pure JSON-RPC TCP connections.
128128
pub const TOADSTOOL_TCP_IDLE_TIMEOUT_SECS: &str = "TOADSTOOL_TCP_IDLE_TIMEOUT_SECS";
129129
pub const TOADSTOOL_STANDALONE: &str = "TOADSTOOL_STANDALONE";
130+
/// Override total BTSP handshake timeout (seconds). Default: 5.
131+
pub const BTSP_HANDSHAKE_TIMEOUT_SECS: &str = "BTSP_HANDSHAKE_TIMEOUT_SECS";
132+
/// Override per-RPC timeout for BearDog calls during BTSP (seconds). Default: 3.
133+
pub const BTSP_RPC_TIMEOUT_SECS: &str = "BTSP_RPC_TIMEOUT_SECS";
130134

131135
pub const TOADSTOOL_PORT: &str = "TOADSTOOL_PORT";
132136
pub const TOADSTOOL_REQUEST_TIMEOUT: &str = "TOADSTOOL_REQUEST_TIMEOUT";
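
Reviewer note: the two new keys are read in `json_line.rs` with a parse-or-default pattern. A minimal sketch of that pattern follows, assuming a pure helper that takes the raw value as an argument so the fallback behaviour is easy to check; `timeout_from_raw` is a hypothetical name and does not appear in the commit.

```rust
use std::time::Duration;

/// Hypothetical pure helper mirroring the shape of `handshake_timeout()`:
/// parse whole seconds from an optional raw string, else use `default`.
fn timeout_from_raw(raw: Option<&str>, default: Duration) -> Duration {
    raw.and_then(|v| v.parse::<u64>().ok())
        .map_or(default, Duration::from_secs)
}

fn main() {
    let default = Duration::from_secs(5);

    // Read the real variable name from the commit; unset means "use default".
    let raw = std::env::var("BTSP_HANDSHAKE_TIMEOUT_SECS").ok();
    let budget = timeout_from_raw(raw.as_deref(), default);
    println!("handshake budget: {budget:?}");

    // Values that do not parse as u64 fall back silently to the default.
    for candidate in ["7", "2.5", " 7", "-1", "0"] {
        println!("{candidate:?} -> {:?}", timeout_from_raw(Some(candidate), default));
    }
}
```

Two consequences of this pattern seem worth noting: fractional or whitespace-padded values are ignored without a warning, and `0` parses successfully, which yields a zero-length budget rather than "no timeout".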

crates/core/common/src/unix_jsonrpc_client.rs

Lines changed: 23 additions & 0 deletions
@@ -174,6 +174,29 @@ impl UnixJsonRpcClient {
174174
.ok_or_else(|| ToadStoolError::network("JSON-RPC response missing result"))
175175
}
176176

177+
/// Call JSON-RPC method with a wall-clock timeout.
178+
///
179+
/// Wraps [`call`](Self::call) in `tokio::time::timeout`.
180+
///
181+
/// # Errors
182+
///
183+
/// Returns error if the timeout elapses or the inner call fails.
184+
pub async fn call_with_timeout(
185+
&self,
186+
method: &str,
187+
params: Value,
188+
timeout: std::time::Duration,
189+
) -> ToadStoolResult<Value> {
190+
tokio::time::timeout(timeout, self.call(method, params))
191+
.await
192+
.map_err(|_| {
193+
ToadStoolError::network(format!(
194+
"RPC call {method} to {} timed out after {timeout:?}",
195+
self.socket_path.display()
196+
))
197+
})?
198+
}
199+
177200
/// Call JSON-RPC method and deserialize response
178201
///
179202
/// ## Type Safety
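
Reviewer note: `call_with_timeout` turns an elapsed timer into an ordinary error and passes inner failures through unchanged. The sketch below shows the same contract using only the standard library (a worker thread plus `recv_timeout`), so it runs without tokio. It is an analogue under stated assumptions, not the commit's implementation: the function signature, the `String` error type, and the labels are all hypothetical.

```rust
use std::sync::mpsc::{self, RecvTimeoutError};
use std::thread;
use std::time::Duration;

/// Hypothetical std-only analogue of a timeout-wrapped call: run `work` on a
/// worker thread and give up waiting after `timeout`.
fn call_with_timeout<T, F>(label: &str, timeout: Duration, work: F) -> Result<T, String>
where
    T: Send + 'static,
    F: FnOnce() -> Result<T, String> + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone if the caller timed out; ignore that.
        let _ = tx.send(work());
    });

    match rx.recv_timeout(timeout) {
        // The call finished in time: pass its own Ok/Err through unchanged.
        Ok(inner) => inner,
        Err(RecvTimeoutError::Timeout) => {
            Err(format!("call {label} timed out after {timeout:?}"))
        }
        // The worker dropped the sender without sending (for example, it panicked).
        Err(RecvTimeoutError::Disconnected) => {
            Err(format!("call {label} ended without a result"))
        }
    }
}

fn main() {
    let fast = call_with_timeout("fast", Duration::from_millis(500), || Ok::<u32, String>(42));
    println!("fast: {fast:?}");

    let slow = call_with_timeout("slow", Duration::from_millis(20), || {
        thread::sleep(Duration::from_millis(200));
        Ok::<u32, String>(1)
    });
    println!("slow: {slow:?}");
}
```

One difference from the async version: when `tokio::time::timeout` elapses, the wrapped future is dropped, whereas the worker thread in this sketch keeps running until its closure returns.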
