
v1.9.24

@github-actions github-actions released this 13 May 11:18


Fix Full mode timeout cascade — batch header read honors request_timeout_secs (#1088, PR #1108 by @dazzling-no-more). Under Full mode, a single slow Apps Script edge cascade-killed every in-flight tunnel session sharing its batch. Users on v1.9.21+ saw frequent 10s "batch timeout" errors and lost download progress on Telegram / browser sessions.

Root cause: read_http_response in domain_fronter.rs had a hardcoded 10s header-read timeout that ran inside tunnel_batch_request_to — independent of and shorter than the outer tokio::time::timeout(batch_timeout, …) in fire_batch. Apps Script cold starts routinely land in the 8-12s range (PR #1040's A/B recorded 4/30 H1 batches timing out at exactly 10s after the H2→H1 switch), so the inner cliff fired as a false-positive batch timeout well before request_timeout_secs (default 30s) could.
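
The shape of the bug is easy to model: with nested timeouts, the effective limit on time-to-first-byte is the minimum of the inner and outer values, so the operator-facing knob was silently capped at 10s. A std-only sketch, with illustrative names rather than the crate's actual items:

```rust
use std::time::Duration;

// Minimal model of the nested-timeout bug: a hardcoded inner header-read
// timeout running inside a configurable outer batch timeout. Whichever is
// smaller wins, so raising the outer knob past 10s had no effect.
const INNER_HEADER_READ: Duration = Duration::from_secs(10); // the hardcoded cliff

fn effective_first_byte_limit(request_timeout_secs: u64) -> Duration {
    INNER_HEADER_READ.min(Duration::from_secs(request_timeout_secs))
}

fn main() {
    let cold_start = Duration::from_secs(12); // typical Apps Script cold start
    let limit = effective_first_byte_limit(30); // default request_timeout_secs
    // The cold start trips the 10s inner cliff even though the operator allowed 30s.
    assert!(cold_start > limit);
    assert_eq!(limit, Duration::from_secs(10));
}
```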

Fix (in domain_fronter.rs + tunnel_client.rs):

  1. tunnel_batch_request_to passes batch_timeout to the header read via new read_http_response_with_header_timeout helper. Config::request_timeout_secs is now the only knob controlling how long we wait for an Apps Script edge to start responding. Other callers (relay path, exit-node) keep the historical 10s value.
  2. Header read uses a single absolute deadline (timeout_at) instead of per-read timeout(). Total elapsed across all header reads is bounded regardless of read cadence — a slow drip-feed peer can no longer silently extend.
  3. TunnelMux::reply_timeout co-varies with batch_timeout (computed at construction as fronter.batch_timeout() + 5s slack instead of fixed 35s const). Operators raising request_timeout_secs no longer have sessions abandon reply_rx just before fire_batch's HTTP round-trip would complete.
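
The deadline semantics in point 2 can be contrasted with the old per-read behavior using a virtual clock: a per-read budget resets on every chunk, while an absolute deadline bounds the sum. A std-only sketch; function names are illustrative, not the crate's API:

```rust
use std::time::Duration;

// `chunk_gaps` are the delays between successive reads while parsing headers.

// Per-read: each read gets a fresh `per_read` budget, so a drip-feed peer that
// keeps every gap just under the budget can stretch the total wait arbitrarily.
fn per_read_accepts(chunk_gaps: &[Duration], per_read: Duration) -> bool {
    chunk_gaps.iter().all(|&gap| gap <= per_read)
}

// Absolute deadline (the `timeout_at` shape): total elapsed across all reads
// is bounded no matter how the bytes are paced.
fn deadline_accepts(chunk_gaps: &[Duration], total: Duration) -> bool {
    chunk_gaps.iter().sum::<Duration>() <= total
}

fn main() {
    // 20 chunks, each arriving after 9s: every read beats a 10s per-read
    // timeout, yet the peer has held the connection for 180s in total.
    let drip = vec![Duration::from_secs(9); 20];
    assert!(per_read_accepts(&drip, Duration::from_secs(10)));
    assert!(!deadline_accepts(&drip, Duration::from_secs(30))); // deadline cuts it off
}
```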

209 → 231 lib tests (+22 regression tests covering the deadline/co-variance behavior).
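
The co-variance rule from point 3 is simple arithmetic; a sketch assuming the 5s slack figure stated above, with an illustrative function name:

```rust
use std::time::Duration;

// reply_timeout is derived from batch_timeout at construction instead of being
// pinned to a 35s constant, so it tracks operator changes to request_timeout_secs.
const REPLY_SLACK: Duration = Duration::from_secs(5);

fn reply_timeout_for(batch_timeout: Duration) -> Duration {
    batch_timeout + REPLY_SLACK
}

fn main() {
    // Operator raises request_timeout_secs to 60: the reply wait grows with it,
    // so sessions no longer abandon reply_rx just before the batch completes.
    assert_eq!(reply_timeout_for(Duration::from_secs(60)), Duration::from_secs(65));
    // At the 30s default this matches the historical 35s constant.
    assert_eq!(reply_timeout_for(Duration::from_secs(30)), Duration::from_secs(35));
}
```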

Docker: cargo-chef so tunnel-node builds without BuildKit (#620, PR #1117 by @dazzling-no-more). tunnel-node/Dockerfile used BuildKit-only RUN --mount=type=cache directives, breaking on Cloud Run's gcloud run deploy --source . path (the underlying gcr.io/cloud-builders/docker builder doesn't enable BuildKit, and --set-build-env-vars DOCKER_BUILDKIT=1 doesn't flip it on either).

Reworked to use cargo-chef: a dedicated planner stage emits recipe.json for dependency metadata, a cargo chef cook stage builds just the deps in their own Docker layer, the final build stage adds src/ on top. Docker's regular layer cache handles dependency reuse — warm rebuilds where only src/ changes still skip the slow crate compile.
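
A minimal sketch of the cargo-chef staging described above — stage names, the binary path, and the runtime base image are assumptions, not copied from the actual tunnel-node/Dockerfile:

```dockerfile
# No BuildKit features (RUN --mount etc.) — plain layer caching does the work.
FROM rust:1.90-slim AS chef
RUN cargo install cargo-chef
WORKDIR /app

FROM chef AS planner
COPY . .
# Emit dependency metadata only; no source code is baked into this layer.
RUN cargo chef prepare --recipe-path recipe.json

FROM chef AS builder
COPY --from=planner /app/recipe.json recipe.json
# Compile dependencies alone; this layer is reused until the recipe changes.
RUN cargo chef cook --release --recipe-path recipe.json
COPY . .
# Warm rebuilds where only src/ changed recompile just the application crate.
RUN cargo build --release

FROM debian:bookworm-slim
# Binary name assumed for illustration.
COPY --from=builder /app/target/release/tunnel-node /usr/local/bin/tunnel-node
ENTRYPOINT ["/usr/local/bin/tunnel-node"]
```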

Base bump rust:1.85-slim → rust:1.90-slim (cargo-chef's transitive deps require rustc 1.86+; tunnel-node's Cargo.toml has no rust-version pin so the bump is internal-only).

Action for Cloud Run users blocked on #620: pull v1.9.24 of the tunnel-node Docker image (ghcr.io/therealaleph/mhrv-tunnel-node:v1.9.24 or :latest) — your gcloud run deploy --source . should now succeed without BuildKit.

Followup: issue #1131 (BuffOvrFlw) reports "h1 open timed out after 8s" — that's the H1_OPEN_TIMEOUT_SECS = 8 from PR #1029 firing on open() (TCP+TLS handshake), separate from the header-read timeout this release fixes. Worth a follow-up PR to parameterize H1_OPEN_TIMEOUT_SECS via request_timeout_secs as well.

What's Changed

Full Changelog: v1.9.23...v1.9.24