feat: add speech-to-text transcription with OpenAI Whisper #2
Add voice input support to AI Chat using a dual-mode approach:

- Chrome: Web Speech API (real-time, no backend)
- Firefox/Safari: MediaRecorder + OpenAI Whisper via the /api/transcribe endpoint

Also adds dev server configs (.claude/launch.json), removes the broken device-utils test, and documents speech input in the README.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
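The dual-mode split described above could be driven by feature detection. The following is a minimal sketch, not the PR's actual logic: `SpeechMode`, `SpeechEnv`, and `pickSpeechMode` are hypothetical names, and how the PR really distinguishes browsers is not shown here. Note that detection by feature alone may classify some browsers differently from the Chrome versus Firefox/Safari split in the description, since a browser can expose a prefixed `webkitSpeechRecognition`.

```typescript
// Hypothetical names; the PR's real detection code is not shown in this page.
type SpeechMode = "web-speech" | "recorder-whisper" | "unsupported";

// The subset of browser globals the decision depends on.
interface SpeechEnv {
  SpeechRecognition?: unknown;
  webkitSpeechRecognition?: unknown;
  MediaRecorder?: unknown;
}

// Prefer the in-browser Web Speech API; otherwise fall back to recording
// audio and sending it to a backend transcription endpoint.
function pickSpeechMode(env: SpeechEnv): SpeechMode {
  if (env.SpeechRecognition || env.webkitSpeechRecognition) {
    return "web-speech";
  }
  if (env.MediaRecorder) {
    return "recorder-whisper";
  }
  return "unsupported";
}
```

Taking the environment as a parameter, rather than reading globals directly, keeps the decision testable outside a browser; in a page it would be called with the global object.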
Caution: Review failed. The pull request is closed.

📝 Walkthrough

This PR introduces a speech-to-text feature for AI Chat with dual-browser support, adds comprehensive codebase planning documentation covering architecture, conventions, concerns, integrations, stack and structure, establishes development launch configurations for Bun, and removes a test file.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Summary
- `POST /api/transcribe` endpoint using AI SDK's `experimental_transcribe()` with the `whisper-1` model: accepts audio FormData, returns transcribed text
- `onAudioRecorded` callback in the chat page to connect the `SpeechInput` component to the backend transcription service
- Codebase planning docs (`.planning/codebase/`), dev server configs (`.claude/launch.json`), and README documentation for the speech input feature
- Remove `device-utils.test.ts`, which imported non-existent exports

Test plan
`/api/transcribe` → text appears in input

🤖 Generated with Claude Code
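As an illustration of the request flow summarized above (audio FormData in, transcribed text out), here is a rough sketch of a handler built on the standard `Request`/`Response` APIs. It is not the PR's implementation: `handleTranscribe`, `jsonResponse`, and `Transcriber` are hypothetical names, the `audio` form field and the `{ text }` response shape are assumptions, and the injected `transcribe` function stands in for the call to `experimental_transcribe()` with `whisper-1` that the summary mentions.

```typescript
// Hypothetical stand-in for the real transcription call (assumption).
type Transcriber = (audio: Uint8Array) => Promise<string>;

// Small helper that builds a JSON response with a status code.
function jsonResponse(body: unknown, status = 200): Response {
  return new Response(JSON.stringify(body), {
    status,
    headers: { "Content-Type": "application/json" },
  });
}

async function handleTranscribe(req: Request, transcribe: Transcriber): Promise<Response> {
  if (req.method !== "POST") {
    return jsonResponse({ error: "Method not allowed" }, 405);
  }

  let form: FormData;
  try {
    form = await req.formData();
  } catch {
    return jsonResponse({ error: "Expected form data" }, 400);
  }

  // The "audio" field name is an assumption, not taken from the PR.
  const file = form.get("audio");
  if (!(file instanceof Blob) || file.size === 0) {
    return jsonResponse({ error: "Missing audio file" }, 400);
  }

  // Hand the raw bytes to the injected transcriber and return its text.
  const text = await transcribe(new Uint8Array(await file.arrayBuffer()));
  return jsonResponse({ text });
}
```

Injecting the transcriber keeps the validation logic independent of any provider SDK, so the same handler can be exercised with a stub in tests.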
Summary by CodeRabbit
Release Notes