This feature automatically detects when no audio is being captured during a transcription session and pauses transcription after 4 consecutive silent chunks, alerting the MCP client (Cursor or Claude Desktop) about the issue.
Added isSilentAudio() function that:
- Analyzes PCM audio buffers for amplitude
- Uses configurable threshold (default: 100)
- Samples every 100th value for efficiency
- Handles both positive and negative amplitudes
- Returns true if audio is below threshold
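The checks listed above can be sketched roughly as follows. This is an illustration, not the project's actual implementation: it assumes 16-bit signed little-endian PCM with any WAV header already stripped, it assumes "below threshold" means no sampled value exceeds the threshold (a peak check rather than an average), and the `stride` parameter is a hypothetical name for the every-100th-sample step.

```typescript
// Hypothetical sketch; the real isSilentAudio() may differ in signature and details.
// Assumes 16-bit signed little-endian PCM with no WAV header in the buffer.
function isSilentAudio(
  pcm: Uint8Array,
  threshold: number = 100,
  stride: number = 100, // inspect every 100th sample (assumed parameter name)
): boolean {
  const view = new DataView(pcm.buffer, pcm.byteOffset, pcm.byteLength);
  const sampleCount = Math.floor(pcm.byteLength / 2);
  const step = Math.max(1, Math.floor(stride)); // guard against a zero or negative step

  for (let i = 0; i < sampleCount; i += step) {
    const sample = view.getInt16(i * 2, true); // true = little-endian
    // Math.abs covers negative half-waves as well as positive ones.
    if (Math.abs(sample) > threshold) {
      return false;
    }
  }
  // No sampled value exceeded the threshold; an empty buffer also ends up here.
  return true;
}
```

Because only every 100th sample is inspected, a very short burst of sound can fall between sampled positions; that is the trade-off for scanning each chunk cheaply.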
Extended TranscriptionStatus interface with:
- consecutiveSilentChunks?: number - Tracks consecutive silent chunks
- isPaused?: boolean - Indicates if transcription is paused
- warning?: string - Contains warning message when paused
Enhanced TranscriptionSession class with:
- SILENCE_THRESHOLD = 4 - Number of consecutive silent chunks before pausing
- SILENCE_AMPLITUDE_THRESHOLD = 100 - Amplitude threshold for silence detection
- Checks each WAV chunk for silence before transcription
- Increments consecutiveSilentChunks counter for silent chunks
- Resets counter when real audio is detected
- Pauses after threshold is reached
- Sets descriptive warning message
- resume() - Resumes transcription after pause, resets counters and warning
- Skips transcription of silent chunks (saves API calls)
- Automatically pauses when silence threshold is reached
- Logs all silence events to debug log
- Resumes automatically when audio is detected (if not manually paused)
- Allows manual resume after auto-pause
- Validates session state
- Returns success/error message
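The counting, pausing, and resuming behaviour described above can be modelled as a small state machine. The sketch below is a simplified stand-in, not the actual TranscriptionSession code: the class name `SilenceTracker` and the methods `onChunk` and `status` are hypothetical, and it assumes that any non-silent chunk lifts an automatic pause.

```typescript
// Hypothetical model of the silence bookkeeping; names are illustrative only.
class SilenceTracker {
  private readonly silenceThreshold: number;
  private consecutiveSilentChunks = 0;
  private isPaused = false;
  private warning: string | undefined = undefined;

  constructor(silenceThreshold: number = 4) {
    this.silenceThreshold = silenceThreshold;
  }

  /** Records one chunk; returns true if the chunk should be sent for transcription. */
  onChunk(silent: boolean): boolean {
    if (!silent) {
      // Real audio: reset the counter and lift any automatic pause.
      this.resume();
      return true;
    }
    this.consecutiveSilentChunks += 1;
    if (!this.isPaused && this.consecutiveSilentChunks >= this.silenceThreshold) {
      this.isPaused = true;
      this.warning =
        `Audio capture appears to be inactive. No audio detected for ` +
        `${this.silenceThreshold} consecutive chunks. Transcription paused. ` +
        `Please check your audio input device and routing.`;
    }
    return false; // silent chunks are skipped, which saves API calls
  }

  /** Manual resume: clears the pause flag, the counter, and the warning. */
  resume(): void {
    this.consecutiveSilentChunks = 0;
    this.isPaused = false;
    this.warning = undefined;
  }

  status(): { consecutiveSilentChunks: number; isPaused: boolean; warning: string | undefined } {
    return {
      consecutiveSilentChunks: this.consecutiveSilentChunks,
      isPaused: this.isPaused,
      warning: this.warning,
    };
  }
}
```

Keeping this bookkeeping separate from the transcription call makes it straightforward to unit-test the pause and resume transitions without any audio hardware.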
The get_status response now includes:
{
"isRunning": true,
"isPaused": false,
"consecutiveSilentChunks": 0,
"warning": "Audio capture appears to be inactive...",
"chunksProcessed": 10,
"errors": 0,
"outputFile": "/path/to/transcript.md"
}
Test coverage includes:
- Empty buffer detection
- Silent audio (all zeros)
- Low amplitude detection
- High amplitude detection
- Mixed audio (silence + loud samples)
- Custom threshold handling
- Negative amplitude handling
- Session status tracking
- Resume functionality
- Type validation
Clients should poll get_status periodically to check:
const status = await mcpClient.callTool("get_status", {});
if (status.isPaused && status.warning) {
// Alert user about audio capture issue
console.error(status.warning);
// Suggest troubleshooting steps
}
Once audio input is fixed:
const result = await mcpClient.callTool("resume_transcription", {});
// result: { success: true, message: "Transcription resumed..." }
When paused, the warning message is:
Audio capture appears to be inactive. No audio detected for 4 consecutive chunks.
Transcription paused. Please check your audio input device and routing.
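One way a client might structure the polling described above is sketched here. The interface mirrors the fields in the example status payload; `pauseAlert`, `pollUntilPaused`, `getStatus`, and the interval and poll-count defaults are hypothetical names and values chosen for illustration, and `getStatus` stands in for however a given client calls get_status and parses the result.

```typescript
// Field names follow the example status payload; which of the older fields are
// optional is an assumption.
interface TranscriptionStatus {
  isRunning: boolean;
  chunksProcessed: number;
  errors: number;
  outputFile?: string;
  consecutiveSilentChunks?: number;
  isPaused?: boolean;
  warning?: string;
}

// Hypothetical helper: returns the warning to show the user, or null if all is well.
function pauseAlert(status: TranscriptionStatus): string | null {
  if (status.isPaused && status.warning) {
    return status.warning;
  }
  return null;
}

// Hypothetical polling loop; resolves with the warning once a pause is seen,
// or with null after maxPolls checks without one.
async function pollUntilPaused(
  getStatus: () => Promise<TranscriptionStatus>,
  intervalMs: number = 5000,
  maxPolls: number = 12,
): Promise<string | null> {
  for (let i = 0; i < maxPolls; i++) {
    const alert = pauseAlert(await getStatus());
    if (alert !== null) {
      return alert;
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  return null;
}
```

Separating the decision (`pauseAlert`) from the timing loop keeps the alert logic testable without timers or a live server.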
- Cost Savings: Skips transcription of silent chunks
- User Awareness: Alerts when audio isn't being captured
- Automatic Recovery: Resumes when audio is detected
- Debug Support: Logs all silence events for troubleshooting
- Configurable: Threshold and amplitude settings are constants that can be tuned
To adjust sensitivity, modify in TranscriptionSession:
private readonly SILENCE_THRESHOLD = 4; // Increase for more tolerance
private readonly SILENCE_AMPLITUDE_THRESHOLD = 100; // Increase to treat louder audio as silence
All silence events are logged to ~/.audio-transcription-mcp-debug.log:
- Silent chunk detections with counter
- Pause events with warning message
- Resume events
- Audio detection after silence
- User starts transcription
- Audio device misconfigured or no audio playing
- First silent chunk detected: consecutiveSilentChunks = 1
- Second silent chunk: consecutiveSilentChunks = 2
- Third silent chunk: consecutiveSilentChunks = 3
- Fourth silent chunk: consecutiveSilentChunks = 4 → PAUSED
  - isPaused = true, warning message set
- MCP client polls status, sees warning, alerts user
- User fixes audio routing
- Either:
  - User calls resume_transcription manually, OR
  - Transcription automatically resumes when audio is detected (counter resets)
- ✅ Works with Cursor MCP support
- ✅ Works with Claude Desktop
- ✅ Works with any MCP-compatible client
- ✅ Backward compatible (new fields are optional)