Stream batch audio uploads #5571

Merged
ComputelessComputer merged 1 commit into main from fix/stream-batch-audio-uploads
Jun 12, 2026

@ComputelessComputer ComputelessComputer commented Jun 12, 2026

Collaborator

Summary:

  • stream sync batch request bodies into temp files instead of extracting whole-body Bytes
  • reuse the temp file path across provider retries and hyprnote routing attempts
  • stream Deepgram raw uploads and OpenAI multipart file parts from disk
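
The second bullet (one temp file path reused across attempts) can be illustrated with a minimal, std-only sketch. The function name `transcribe_with_fallback`, the `send` callback, and the string-based provider names are hypothetical and are not taken from this PR's code; the point is only that each attempt re-opens the same path, so no audio buffer has to be cloned per retry.

```rust
use std::fs::File;
use std::path::Path;

/// Tries each provider in order against the same on-disk audio file.
/// `send` stands in for a real provider call (hypothetical signature).
fn transcribe_with_fallback<F>(
    audio: &Path,
    providers: &[&str],
    mut send: F,
) -> Result<String, String>
where
    F: FnMut(&str, File) -> Result<String, String>,
{
    let mut last_err = String::from("no providers configured");
    for &provider in providers {
        // Re-open the file for every attempt so each provider reads from byte 0.
        let file = File::open(audio).map_err(|e| e.to_string())?;
        match send(provider, file) {
            Ok(transcript) => return Ok(transcript),
            // Remember the failure and fall through to the next provider.
            Err(e) => last_err = format!("{provider}: {e}"),
        }
    }
    Err(last_err)
}
```

Only the path is shared between attempts; memory use stays bounded by whatever chunk size the upload code reads with, regardless of how many retries occur.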

Verification:

  • cargo check -p transcribe-proxy -p owhisper-client
  • cargo test -p transcribe-proxy routes::batch
  • cargo test -p owhisper-client adapter::deepgram::batch
  • cargo test -p owhisper-client adapter::openai::batch
  • git diff --check

Note

Medium Risk
Changes memory and I/O behavior on the main batch transcription path; incorrect streaming, temp file handling, or size limits could affect large uploads or retries, though logic is largely a refactor with tests added.

Overview
Batch sync transcription no longer buffers the full request body in memory. The transcribe-proxy batch handler accepts an Axum Body, streams chunks into a temp file (with MAX_BATCH_AUDIO_BODY_BYTES enforcement and clearer error responses), and passes that single file path through hyprnote routing, retries, and provider calls. Callback requests still buffer to Bytes for JSON parsing.
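
As a rough illustration of the chunk-to-temp-file step with size enforcement, here is a std-only sketch. It models the request body as a plain iterator of chunks rather than an Axum `Body`, and `spool_chunks`, `SpoolError`, and the `limit` parameter are hypothetical names for this example; the actual handler and the value of `MAX_BATCH_AUDIO_BODY_BYTES` are not shown in this description.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

#[derive(Debug)]
enum SpoolError {
    /// The body exceeded the configured limit (would map to an HTTP 413-style response).
    TooLarge { limit: u64 },
    Io(io::Error),
}

impl From<io::Error> for SpoolError {
    fn from(e: io::Error) -> Self {
        SpoolError::Io(e)
    }
}

/// Writes chunks to `dest` one at a time, holding at most one chunk in memory.
/// Returns the total number of bytes written.
fn spool_chunks<I>(chunks: I, dest: &Path, limit: u64) -> Result<u64, SpoolError>
where
    I: IntoIterator<Item = io::Result<Vec<u8>>>,
{
    let mut file = File::create(dest)?;
    let mut written: u64 = 0;
    for chunk in chunks {
        let chunk = chunk?;
        written += chunk.len() as u64;
        if written > limit {
            // Reject before writing the offending chunk and remove the partial file.
            drop(file);
            let _ = fs::remove_file(dest);
            return Err(SpoolError::TooLarge { limit });
        }
        file.write_all(&chunk)?;
    }
    file.flush()?;
    Ok(written)
}
```

Checking the running total before each write means an oversized upload is rejected as soon as it crosses the limit, instead of after the whole body has been received. This sketch only cleans up the partial file on the size-limit path; an I/O error mid-stream leaves it for the caller to remove.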

owhisper-client adds shared streaming_file_body / streaming_file_part helpers (reqwest stream feature). Deepgram batch uploads stream from disk with Content-Length; OpenAI multipart file parts stream from disk instead of loading the whole file into memory. Async callback sync fallback writes downloaded audio to a temp file before transcribing, matching the path-based API.
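
The upload side (read from disk in chunks, with the length known up front for `Content-Length`) might look roughly like the following std-only sketch. `FileChunks` and `open_for_upload` are hypothetical names and this is a synchronous model; how `streaming_file_body` and `streaming_file_part` are actually implemented is not shown here. In async code, one common approach is to wrap a file stream with `reqwest::Body::wrap_stream`, which is available behind reqwest's `stream` feature.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Iterator that yields the file's contents in chunks of at most `chunk_size` bytes.
struct FileChunks {
    file: File,
    chunk_size: usize,
}

impl Iterator for FileChunks {
    type Item = io::Result<Vec<u8>>;

    fn next(&mut self) -> Option<Self::Item> {
        let mut buf = vec![0u8; self.chunk_size];
        loop {
            match self.file.read(&mut buf) {
                Ok(0) => return None, // end of file
                Ok(n) => {
                    buf.truncate(n);
                    return Some(Ok(buf));
                }
                // Retry reads interrupted by a signal; surface other errors.
                Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
                Err(e) => return Some(Err(e)),
            }
        }
    }
}

/// Opens `path` and returns its size (usable as Content-Length) plus a chunk iterator.
fn open_for_upload(path: &Path, chunk_size: usize) -> io::Result<(u64, FileChunks)> {
    let file = File::open(path)?;
    let len = file.metadata()?.len();
    // Guard against a zero-sized buffer, which would end the stream immediately.
    let chunk_size = chunk_size.max(1);
    Ok((len, FileChunks { file, chunk_size }))
}
```

Taking the length from file metadata lets the request declare its size without reading the audio into memory first.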

Reviewed by Cursor Bugbot for commit 418a5fd. Bugbot is set up for automated code reviews on this repo.

@netlify

netlify Bot commented Jun 12, 2026

Deploy Preview for old-char canceled.

Name Link
🔨 Latest commit 418a5fd
🔍 Latest deploy log https://app.netlify.com/projects/old-char/deploys/6a2bde56f520d40008ac31c5

Write batch request bodies to temp files incrementally and stream Deepgram/OpenAI provider uploads from disk instead of rematerializing audio buffers.
@ComputelessComputer ComputelessComputer force-pushed the fix/stream-batch-audio-uploads branch from 9329149 to 418a5fd Compare June 12, 2026 10:24
@ComputelessComputer ComputelessComputer merged commit 0876e82 into main Jun 12, 2026
17 checks passed
@ComputelessComputer ComputelessComputer deleted the fix/stream-batch-audio-uploads branch June 12, 2026 10:44