
feat(download): resume via Range header on transient stream failures#52

Merged
rubenhensen merged 1 commit into main from feat/resumable-downloads on May 7, 2026

Conversation

@rubenhensen
Contributor

Implements the resumable-download design from #48 — incorporating the review feedback @dobby-coder posted on the issue, which pushed back hard (and correctly) on the original "wrap withRetry" sketch in the issue body.

Shape

  • Source-owned retry loop inside the outer ReadableStream. withRetry's Promise<T> shape resolves before the stream is read, leaving mid-stream errors with no way to re-enter the loop. The new driver lives in the stream's start() and turns each underlying GET into one attempt. Same backoff helpers — different driver.
  • No tee() — bytes flow through the source's own read() → enqueue() path, so received += value.byteLength at the single forwarding site.
  • Strict 206 + Content-Range check on resume. Range is a request hint, not a contract. Caching proxies and misconfigured CDNs can ignore it and return 200 OK with the full body from byte 0; splicing that onto a consumer at offset N corrupts the file silently. Resume asserts both status === 206 and that Content-Range's start byte equals the requested offset — either mismatch is classified as fail-not-retry (the upstream isn't going to start honouring Range on the next try).
  • Attempt counter shared across resumes. A flapping connection that delivers some bytes per attempt now exhausts the budget instead of looping forever.
  • Reuses existing classifyCryptifyError for stream-level errors — confirmed by test rather than pre-emptively changed.
  • Stall-watchdog explicitly out of scope (separate axis, would balloon the change).
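Concretely, the shape above reduces to a loop like this — a minimal sketch, assuming a hypothetical `fetchRange(offset)` for the underlying GET and a pluggable classifier standing in for `classifyCryptifyError`; it is not the PR's exact code:

```typescript
type Classification = 'retry' | 'fail';

// Sketch of a source-owned retry loop: the driver lives in start(),
// each underlying GET is one attempt, and the byte counter sits at the
// single forwarding site so resume offsets have one source of truth.
function resumableStream(
  fetchRange: (offset: number) => Promise<ReadableStream<Uint8Array>>,
  classify: (err: unknown) => Classification,
  maxAttempts = 3,
): ReadableStream<Uint8Array> {
  let received = 0; // resume offset — single source of truth
  let attempt = 0;  // shared across resumes: a budget, not a per-GET counter
  return new ReadableStream<Uint8Array>({
    async start(controller) {
      while (true) {
        attempt += 1;
        try {
          // the real code sends `Range: bytes=<received>-` when received > 0
          const body = await fetchRange(received);
          const reader = body.getReader();
          for (;;) {
            const { done, value } = await reader.read();
            if (done) {
              controller.close();
              return;
            }
            received += value.byteLength; // the single forwarding site
            controller.enqueue(value);
          }
        } catch (err) {
          if (attempt >= maxAttempts || classify(err) === 'fail') {
            controller.error(err); // surfaces on the consumer's next read()
            return;
          }
          // the PR sleeps with full-jitter backoff here before looping
        }
      }
    },
  });
}
```

Because the loop lives inside `start()`, a mid-stream rejection from `reader.read()` re-enters the `while` directly — exactly what the `Promise<T>` shape of `withRetry` could not do.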

Public-API change

downloadFileWithRetry now returns the ReadableStream synchronously (no Promise wrapper). Stream-level errors propagate to the consumer's first failing read() call rather than via the function's promise. The single existing call site in src/crypto/decrypt.ts:45 works unchanged because Promise.all accepts non-thenable values.

This is a behavioural shift only for callers that did await downloadFileWithRetry(...) and caught a synchronous rejection for non-stream-level errors (e.g. 404). With the new shape, those errors surface on the consumer's first read() call, which is the right place — they're stream-level events, not function-result events.
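The `Promise.all` behaviour this relies on is specified: non-thenable array entries are treated as already-resolved values. A minimal illustration (names here are illustrative, not the SDK's):

```typescript
// Promise.all resolves non-thenable entries as-is, so a function that
// stops returning Promise<ReadableStream> and returns the stream
// directly keeps working unchanged inside an existing Promise.all.
async function demo(): Promise<[string, number]> {
  return Promise.all([
    Promise.resolve('vk'), // thenable: awaited
    42,                    // plain value: passed through unchanged
  ]);
}
```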

Test plan

  • Clean download (1 attempt) — bytes match, no Range header on first GET
  • 502 → 200 retry — fresh GET (no Range) since received === 0
  • Mid-stream failure → 206 resume from offset — bytes contiguous
  • 200 on resume → fail-not-retry (silent-rewind protection)
  • 206 with mismatched Content-Range start → fail-not-retry
  • 404 on first read → no retry, propagates as NetworkError(404)
  • Flapping connection (1 byte per attempt × 3) — exhausts budget, errors after 3
  • Caller-driven abort → AbortError, no retry
  • npx tsc --noEmit clean
  • npx vitest run — 72 passed (was 66, added 6)
  • Manual: against a local Cryptify, with nc proxying between Cryptify and the browser; kill the connection mid-download → SDK resumes when reconnected, the file downloads completely and decrypts cleanly
  • Manual: confirm StreamUnsealer.new(stream, vk) consumes an internally-spliced stream without choking — the only existing consumer at src/crypto/decrypt.ts:62

Refs

Out of scope (tracked separately)

  • Cross-refresh download resume (persisting partial blob in IndexedDB so a refresh picks up where the previous session left off). Most downloads are short enough that in-session retry covers the common case.
  • Idle-stall detection (stall watchdog timer that resets on each enqueue) — separate axis from downloadTimeoutMs, deserves its own design.

Cryptify's `FileServer` already supports HTTP Range, so a stream-level
failure mid-download (network drop, idle timeout) can now resume from
the byte offset reached rather than restarting from zero. The consumer
sees a single contiguous stream regardless of how many internal retries
happened.

Implements the design @dobby-coder proposed on #48,
which pushes back on the original "wrap withRetry" sketch in the issue
body. Key shape changes from that initial sketch:

- **Source-owned retry loop inside the outer `ReadableStream`.**
  `withRetry`'s `Promise<T>` shape resolves before the stream is read,
  leaving mid-stream errors with no way to re-enter the loop. The new
  driver lives in the stream's `start()` and turns each underlying GET
  into one `attempt`. Same backoff helpers (`delayWithFullJitter`,
  `sleep`, `classifyCryptifyError`) — different driver, no parallel
  implementation.
- **No `tee()`** for the byte counter. Bytes flow through the source's
  own `read()` → `enqueue()` path, so `received += value.byteLength`
  at the single forwarding site. One source of truth for the resume
  offset.
- **Strict `206` + `Content-Range` check on resume.** Range is a
  request hint — caching proxies and misconfigured CDNs can ignore it
  and return `200 OK` with the full body from byte 0. Splicing that
  onto a consumer at offset N corrupts the file silently. Resume now
  asserts `status === 206` and that `Content-Range`'s start byte
  equals the requested offset; either mismatch is classified as
  fail-not-retry (the upstream isn't going to start honouring Range
  on the next try).
- **Attempt counter shared across resumes.** A flapping connection
  that delivers some bytes per attempt now exhausts the budget instead
  of looping forever. Test pins this — one byte per attempt, three
  attempts, then errors.
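
The strict resume guard above can be sketched as a small validator — the function name and plain `Error` are illustrative stand-ins (the PR routes these through `NetworkError` and its classifier):

```typescript
// Sketch of the resume guards, in the order the commit describes:
// reject anything that is not a 206 whose Content-Range start byte
// equals the requested offset. Splicing a full-body 200 onto a
// consumer already at offset N would corrupt the file silently.
function assertResumable(
  status: number,
  contentRange: string | null,
  offset: number,
): void {
  if (status !== 206) {
    // e.g. a 200 OK from a Range-stripping proxy
    throw new Error(`expected 206 Partial Content, got ${status}`);
  }
  // conservative parse: anything not matching "bytes <start>-" fails
  const m = /^\s*bytes\s+(\d+)-/.exec(contentRange ?? '');
  if (m === null || Number(m[1]) !== offset) {
    throw new Error(`Content-Range start does not match requested offset ${offset}`);
  }
}
```

Both failure branches are fail-not-retry: an upstream that ignores Range once will keep ignoring it.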

Public-API note: `downloadFileWithRetry` now returns the
`ReadableStream` synchronously (no `Promise` wrapper). Stream-level
errors propagate to the consumer's first failing `read()` call rather
than via the function's promise. The single existing call site in
`src/crypto/decrypt.ts:45` works unchanged because `Promise.all`
accepts non-thenable values.

Tests:
- Clean download (1 attempt) — bytes match, no Range header on first GET.
- 502 → 200 retry — fresh GET (no Range) since `received === 0`.
- Mid-stream failure → 206 resume from offset — bytes contiguous.
- 200 on resume → fail-not-retry (silent-rewind protection).
- 206 with mismatched Content-Range start → fail-not-retry.
- 404 on first read → no retry, propagates as NetworkError(404).
- Flapping (1 byte per attempt × 3) — exhausts budget, errors after 3.
- Caller-driven abort → AbortError, no retry.

Helper test fix worth noting: `controller.error()` empties any queued
chunks, so the test-only `streamThenError` is now pull-based to
guarantee chunks are delivered before the error fires.
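
A pull-based helper of that shape might look like this (a sketch — the PR's actual `streamThenError` is not shown here):

```typescript
// Test-only stream that delivers every chunk, then errors. Pull-based:
// each chunk is handed out on demand, so the reader has consumed it
// before error() fires. A push-based start() that enqueues everything
// and then calls error() would discard the still-queued chunks.
function streamThenError(
  chunks: Uint8Array[],
  err: Error,
): ReadableStream<Uint8Array> {
  let i = 0;
  return new ReadableStream<Uint8Array>({
    pull(controller) {
      if (i < chunks.length) {
        controller.enqueue(chunks[i++]);
      } else {
        controller.error(err);
      }
    },
  });
}
```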

72 tests pass; type-check clean.

Closes #48.
Refs #117, #136, #145, #47.
@rubenhensen
Contributor Author

/dobby please review this implementation against your design feedback on #48. Key things to verify: the source-owned retry loop replaces withRetry on the download path, the strict 206 + Content-Range checks (incl. start-offset match), the shared attempt counter across resumes, and that classifyCryptifyError works as-is for stream-level TypeErrors. The public-API shift to a synchronous ReadableStream return is also worth a sanity check at the decrypt.ts:45 call site.

@dobby-coder
Contributor

dobby-coder Bot commented May 7, 2026

Dobby is happy to help! Dobby will pull #48 to refresh on the design feedback, then check the source-owned retry loop, the strict 206 + Content-Range guards, the shared attempt counter across resumes, the classifyCryptifyError reuse, and the synchronous ReadableStream return at decrypt.ts:45. Dobby will run typecheck and tests too. Findings coming as a PR review!

Contributor

dobby-coder Bot left a comment


Review against design feedback on #48

Verified the implementation against each point of the design feedback. Built (npm run build), typechecked (tsc --noEmit), and ran tests (vitest run — 72 passed, including the 8 new download-retry cases) on the branch.

✅ 1. Source-owned retry loop replaces withRetry on the download path

downloadFileWithRetry (src/api/cryptify.ts:305-380) is now a synchronous factory that returns a ReadableStream whose start() owns the while(true) loop. Each underlying GET is one attempt. withRetry is no longer imported on this path — confirmed by reading the file. The chunk-PUT path (storeChunkWithRetry) still uses withRetry, which is right: that one fits the Promise<T> shape.

This matches the §1 sketch from the design feedback essentially line-for-line, including reuse of classifyCryptifyError, delayWithFullJitter, sleep, and withTimeout.

✅ 2. Strict 206 + Content-Range checks (incl. start-offset match)

downloadRange (cryptify.ts:238-277) does the right thing in the right order:

  1. !response.ok → NetworkError with the response status (so 5xx routes through classifyCryptifyError to 'retry' and 4xx to 'fail', which is correct).
  2. response.status !== 206 (e.g. a 200 OK from a Range-stripping proxy) → NetworkError. Since status is a 2xx, classifyCryptifyError will see it as < 500 and classify as fail — exactly the silent-rewind protection §4 asked for.
  3. parseContentRangeStart(contentRange) !== offset → also NetworkError, also fail-not-retry.

parseContentRangeStart (cryptify.ts:283-289) is tight: ^\s*bytes\s+(\d+)- — anything not matching the standard form returns null, which then mismatches offset and trips the assertion. Conservative and correct.

Both branches are covered by tests (tests/api.test.ts:684-734), and I confirmed the 200-on-resume → fail-not-retry test exercises the offset path (after 2 bytes have been delivered) — corruption would be silent without this guard.

✅ 3. Shared attempt counter across resumes

let attempt = 0 is declared at cryptify.ts:312 — outside the while loop — and attempt += 1 runs unconditionally at the top of every iteration (cryptify.ts:320). A flapping connection that delivers one byte per attempt now exhausts the budget. The "shares the attempt counter across resumes" test (api.test.ts:753-827) constructs exactly that adversarial server (one byte per attempt → error) and asserts exactly 3 calls before the consumer's read() rejects. Solid.

✅ 4. classifyCryptifyError works as-is for stream-level TypeErrors

Confirmed by reading src/util/retry.ts:68-79 — the existing classifier already returns 'retry' for TypeError. The test at api.test.ts:654-682 uses new TypeError('network reset') thrown from inside the stream's pull, and that error successfully drives a Range-based resume. No classifier changes needed, as the design feedback predicted (§3).

One small empirical note: in real fetch, mid-stream reads can also surface as DOMException with name === 'TypeError' in some browsers' historical implementations. The classifier's instanceof TypeError may not catch those. Cryptify SDK targets are modern enough that the standard TypeError branch covers the live cases, but it's worth being aware of if a future browser reverts. Not a blocker.

✅ 5. Public-API shift to synchronous ReadableStream return — decrypt.ts:45 sanity check

decrypt.ts:43-46:

const [vk, fileStream] = await Promise.all([
  fetchVerificationKey(pkgUrl, headers),
  downloadFileWithRetry(cryptifyUrl, uuid, retry, signal),
]);

Promise.all accepts non-thenable values and resolves them as-is, so this works unchanged. fileStream is still typed as ReadableStream<Uint8Array>. The PR body's claim that "404 now surfaces on the consumer's first read() rather than the function's promise" is the right framing — and the "errors stream on first read with 404 (no retry)" test (api.test.ts:736-751) pins that behaviour. StreamUnsealer.new(readable, vk) will see the error via the consumer's read() and reject from within WASM, which is consistent with how it already handles other stream errors.

The only thing this changes for the existing call site is where the error surfaces (consumer vs. caller of downloadFileWithRetry), not whether errors propagate. The single existing call site is fine.


Minor observations (non-blocking)

  • cancel() is a no-op for the in-flight fetch. cryptify.ts:370-378 documents this honestly. If a caller cancels the consumer-facing ReadableStream without aborting via signal, the underlying GET still runs to completion (its body is then dropped). This is a small resource leak per cancellation but matches what the design feedback implied (caller's signal is the primary abort mechanism). Could be tightened later by maintaining an internal AbortController combined with the caller's signal — out of scope here.
  • cleanup() is called both on success and in catch — that's fine because clearTimeout on a fired/cleared timer is a no-op. Worth a one-line comment if the file ever sees a refactor, but not needed now.
  • Stall watchdog still out of scope — the design feedback (§6) explicitly punted this, and the PR body confirms. Good discipline; tracking it separately is the right call.
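
The tightening suggested in the first bullet — an internal controller combined with the caller's signal — could be sketched like this (`linkedAbort` is a hypothetical helper, not code from the PR; it uses only the baseline AbortController API):

```typescript
// Sketch: one signal that fires when either the caller aborts or the
// stream source decides to cancel, so cancel() could actually stop an
// in-flight GET instead of letting it run to completion.
function linkedAbort(caller?: AbortSignal): {
  signal: AbortSignal;
  abort: (reason?: unknown) => void;
} {
  const inner = new AbortController();
  if (caller) {
    if (caller.aborted) {
      inner.abort(caller.reason);
    } else {
      // forward the caller's abort (and its reason) to the inner signal
      caller.addEventListener('abort', () => inner.abort(caller.reason), { once: true });
    }
  }
  return { signal: inner.signal, abort: (reason) => inner.abort(reason) };
}
```

In runtimes that ship `AbortSignal.any`, the forwarding collapses to `AbortSignal.any([caller, inner.signal])`; the listener version above only assumes the baseline API.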

Verdict

This implementation lands cleanly against the §1–§7 feedback from #48. No changes needed. Recommend marking ready and merging once any remaining manual smoke tests (StreamUnsealer over a spliced stream, killed-mid-download against local cryptify) check out.

@rubenhensen
Contributor Author

Thanks @dobby-coder for the thorough review and the approval. Acknowledging the three non-blocking observations:

  • cancel() no-op for in-flight fetch — agreed, deliberately out of scope. The caller-driven signal is the documented abort path; tightening this with an internal AbortController combined with the caller's signal is a cleanup worth its own PR.
  • cleanup() called both on success and in catch — clearTimeout is idempotent so this is intentional. Won't add a comment unless the file sees a wider refactor.
  • DOMException masquerading as TypeError in legacy browsers — noting it. The classifier's instanceof TypeError covers what fetch actually throws on modern browsers; if a future regression surfaces, we'll widen the branch.

Stall watchdog stays out — separate axis, will track when there's a concrete need.

No code changes, but I wanted to leave the trail visible on the PR for the next reviewer.

@rubenhensen rubenhensen merged commit 95af57c into main May 7, 2026
2 checks passed
@rubenhensen rubenhensen deleted the feat/resumable-downloads branch May 7, 2026 19:00
@github-actions

github-actions Bot commented May 7, 2026

🎉 This PR is included in version 1.5.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

@rubenhensen
Contributor Author

Thanks @dobby-coder for the approval — no changes needed on this PR. The three non-blocking observations (cancel() no-op for in-flight fetch, cleanup() called on both paths, stall watchdog out of scope) match my reading. Stall detection will be its own issue when there's a concrete need.

rubenhensen added a commit to encryption4all/postguard-website that referenced this pull request May 7, 2026
Picks up encryption4all/postguard-js#52 (resumable downloads via
Range header on transient stream failures). Public-API change worth
flagging for reviewers: `downloadFileWithRetry` now returns the
ReadableStream synchronously rather than wrapped in a Promise — but
the only consumer in this repo (via `Opened.decrypt` / `inspectSealed`
in pg-js itself) is unaffected, since `Promise.all` accepts non-thenable
values as-is. No website-side code change needed.

`npm run check`, `lint`, and `build` clean.
@rubenhensen
Contributor Author

@rubenhensen all dobby checks green — ready for your review and merge when you've run the manual smoke tests called out in the test plan.


Successfully merging this pull request may close these issues.

feat(download): resume downloads via Range on transient failure