fix: detect WebM audio format in duration check for web display recording #71

Open
tjaffri wants to merge 1 commit into PiSugar:master from tjaffri:fix/webm-audio-duration-detection

tjaffri commented May 9, 2026

Summary

  • Detects WebM files by checking the magic number (0x1a45dfa3)
  • Uses ffprobe as a fallback for WebM and unknown audio formats
  • Adds graceful fallback for files with content but unparseable duration
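
For readers unfamiliar with magic-number sniffing, the first bullet can be sketched roughly as follows. This is an illustration in Python, not the code in this PR; the function name `looks_like_webm` is hypothetical. Note that 0x1a45dfa3 is the EBML header ID, so Matroska (.mkv) files start with the same four bytes.

```python
# Illustrative sketch only: the name looks_like_webm is hypothetical and
# this is not the implementation from the pull request.

# EBML header ID. WebM and Matroska containers both begin with these bytes.
WEBM_MAGIC = bytes.fromhex("1a45dfa3")


def looks_like_webm(path: str) -> bool:
    """Return True if the file starts with the EBML magic number.

    The file extension is ignored on purpose, because the recordings
    described here are WebM data saved under an .mp3 name.
    """
    with open(path, "rb") as f:
        # A file shorter than four bytes simply fails the comparison.
        return f.read(4) == WEBM_MAGIC
```

Checking the leading bytes rather than the extension is what lets a mislabeled recording be routed to a parser that can actually read it.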

Problem

When using the web display with WEB_AUDIO_ENABLED=true, browsers record audio in WebM format. However, these files are saved with an .mp3 extension, so the duration check fails because mp3Duration cannot parse WebM files.

This results in all voice recordings being rejected as "too short":

Record audio too short, skipping recognition.

Solution

The fix detects WebM format by checking the file's magic number and uses ffprobe (from ffmpeg) to get the actual duration. This is consistent with the existing documentation in .env.template which already mentions ffmpeg as a requirement for web audio:

Note: for ASR servers that require WAV/MP3 (e.g. vosk, local whisper), ffmpeg must be available on the host to convert the browser's webm/opus recording automatically.
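
A minimal sketch of reading a duration through ffprobe, assuming ffprobe is on the PATH. It is written in Python for illustration and is not the PR's code; the helper names `parse_duration` and `ffprobe_duration_seconds` and the 10-second timeout are assumptions. ffprobe may print `N/A` when the container carries no duration, so the parser treats anything that is not a positive finite number as unknown.

```python
import math
import shutil
import subprocess
from typing import Optional


def parse_duration(stdout: str) -> Optional[float]:
    """Turn ffprobe's text output into seconds, or None if unusable."""
    try:
        value = float(stdout.strip())
    except ValueError:
        # Covers empty output and non-numeric values such as "N/A".
        return None
    return value if math.isfinite(value) and value > 0 else None


def ffprobe_duration_seconds(path: str, timeout: float = 10.0) -> Optional[float]:
    """Ask ffprobe for the container duration; None if it cannot be read."""
    if shutil.which("ffprobe") is None:
        return None  # ffmpeg tools are not installed on this host
    cmd = [
        "ffprobe", "-v", "error",
        "-show_entries", "format=duration",
        "-of", "default=noprint_wrappers=1:nokey=1",
        path,
    ]
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout, check=True
        )
    except (subprocess.SubprocessError, OSError):
        # Non-zero exit, timeout, or failure to launch the process.
        return None
    return parse_duration(result.stdout)
```

Returning `None` instead of raising keeps the caller free to decide what an unknown duration should mean.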

Test plan

  • Tested with web display on macOS
  • Verified voice recording works with browser microphone
  • Verified OpenAI ASR correctly transcribes WebM audio
  • Confirmed fallback works when ffprobe is not available
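
The last item concerns the case where no duration can be determined at all. One way such a fallback could be structured is sketched below in Python. Both thresholds and the name `should_recognize` are hypothetical and not taken from this PR; the actual values used by the change are not stated here.

```python
import os
from typing import Optional

# Hypothetical thresholds for illustration; not taken from the PR.
MIN_DURATION_SECONDS = 1.0
MIN_BYTES_WITHOUT_DURATION = 4096


def should_recognize(path: str, duration: Optional[float]) -> bool:
    """Decide whether a recording is worth sending to speech recognition.

    duration is the parsed length in seconds, or None when no parser
    (for example ffprobe) could determine it.
    """
    if duration is not None:
        return duration >= MIN_DURATION_SECONDS
    # Duration unknown: accept the file if it clearly has content,
    # rather than rejecting every recording as "too short".
    return os.path.getsize(path) >= MIN_BYTES_WITHOUT_DURATION
```

Under these assumptions, a recording with real content on a host without ffprobe is still passed on to recognition, while an empty or near-empty file is skipped.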

🤖 Generated with Claude Code

…ding

When using the web display with WEB_AUDIO_ENABLED=true, browsers record
audio in WebM format. However, these files are saved with .mp3 extension,
causing the duration check to fail (mp3Duration can't parse WebM).

This resulted in all voice recordings being rejected as "too short".

This change:
- Detects WebM files by checking the magic number (0x1a45dfa3)
- Uses ffprobe as fallback for WebM and unknown formats
- Adds graceful fallback for files with content but unparseable duration

Requires ffmpeg to be installed for WebM duration detection, which is
already documented as a requirement for web audio in .env.template.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>