feat: add basic audio support with voice recording and TTS by PaulLampe · Pull Request #20 · wahl-chat/wahl-chat-app

PaulLampe · 2025-12-29T14:52:14Z

Summary

Add voice recording to send audio messages that get transcribed
Add text-to-speech (TTS) playback for assistant messages
Integrate audio controls into chat input and message components

Additional changes

Applied formatting to all existing files
Added pre-commit hook via Husky for formatting and linting
Minor import organization fixes

vercel · 2025-12-29T14:52:18Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Review	Updated (UTC)
web	Ready	Preview, Comment	Dec 29, 2025 2:54pm

Copilot

Pull request overview

This PR adds comprehensive audio support to the chat application, enabling users to send voice messages that get automatically transcribed and receive text-to-speech playback of assistant responses. The implementation includes client-side voice recording, WebSocket-based audio streaming, Firebase persistence, and integrated UI controls.

Key changes:

Voice recording with microphone input and automatic transcription via WebSocket
Text-to-speech (TTS) playback system for assistant messages with play/pause controls
Enhanced message ID generation to maintain consistency between client state and Firebase

Reviewed changes

Copilot reviewed 29 out of 29 changed files in this pull request and generated 23 comments.

File	Description
lib/stores/chat-store.types.ts	Added types for TTS state, voice transcription status, and helper function for TTS key generation
lib/stores/chat-store.ts	Integrated new audio-related actions into the chat store
lib/stores/actions/voice-transcription-actions.ts	Implements state management for voice transcription lifecycle (pending, transcribed, error)
lib/stores/actions/tts-actions.ts	Manages TTS state transitions and WebSocket requests for audio generation
lib/stores/actions/send-voice-message.ts	Handles sending voice messages with Firebase persistence and WebSocket communication
lib/stores/actions/complete-streaming-message.ts	Updated to accept optional message ID from backend for message tracking
lib/stores/actions/chat-add-user-message.ts	Enhanced to use consistent message IDs between client and Firebase
lib/socket.types.ts	Added payload types for TTS requests/responses and voice transcription
lib/hooks/use-voice-recorder.ts	Custom hook for MediaRecorder API integration with permission handling
lib/hooks/use-tts-audio.ts	Custom hook managing TTS audio playback lifecycle and state
lib/firebase/firebase.ts	Added functions for voice transcription updates and message ID generation
lib/chat-socket.ts	Extended socket events for TTS requests and voice transcription
components/sticky-input.tsx	Integrated voice recording button with conditional rendering logic
components/providers/socket-provider.tsx	Added handlers for TTS and voice transcription WebSocket events
components/home/home-input.tsx	Implements voice message capture with sessionStorage handoff to chat page
components/dynamic-rate-limit-sticky-input.tsx	Propagates voice message callback through component hierarchy
components/chat/voice-record-button.tsx	UI components for voice recording button and recording indicator
components/chat/chat-view.tsx	Added voice message flag support for SSR
components/chat/chat-view-ssr.tsx	Passes voice message flag to client components
components/chat/chat-tts-button.tsx	TTS control button component with loading and playing states
components/chat/chat-single-user-message.tsx	Enhanced to display voice transcription status with pending/error/success states
components/chat/chat-single-message.tsx	Propagates voice transcription status to child components
components/chat/chat-single-message-actions.tsx	Integrates TTS button into message action bar
components/chat/chat-messages-view.tsx	Processes pending voice messages from sessionStorage on page load
components/chat/chat-grouped-messages.tsx	Passes voice transcription status to message renderer
components/chat/chat-input.tsx	Adds voice recording capability to main chat input
app/session/page.tsx	Handles voice message URL parameter and metadata generation fix
app/(with-header)/share/page.tsx	Fixed metadata generation to handle null/undefined cases
.eslintrc.json	Applied formatting to ESLint configuration

Copilot · 2026-01-06T10:34:36Z

lib/stores/actions/send-voice-message.ts

+      return;
+    }
+
+    const voiceMessagePlaceholder = '[Sprachnachricht]';


The placeholder message '[Sprachnachricht]' uses hardcoded German text with square brackets. This creates inconsistency with the voice transcription status UI which properly indicates pending status. Consider using a more consistent approach or internationalized text.

Suggested change

-    const voiceMessagePlaceholder = '[Sprachnachricht]';
+    const voiceMessagePlaceholder = '';

Copilot · 2026-01-06T10:34:36Z

components/chat/chat-messages-view.tsx

+  useEffect(() => {
+    if (
+      !hasPendingVoiceMessage ||
+      hasProcessedVoiceMessage.current ||
+      !isSocketConnected
+    )
+      return;
+
+    const pendingAudioBase64 = sessionStorage.getItem(
+      PENDING_VOICE_MESSAGE_KEY,
+    );
+    if (pendingAudioBase64) {
+      sessionStorage.removeItem(PENDING_VOICE_MESSAGE_KEY);
+      hasProcessedVoiceMessage.current = true;
+      // Convert base64 back to Uint8Array
+      const binaryString = atob(pendingAudioBase64);
+      const audioBytes = new Uint8Array(binaryString.length);
+      for (let i = 0; i < binaryString.length; i++) {
+        audioBytes[i] = binaryString.charCodeAt(i);
+      }
+      sendVoiceMessage(audioBytes);
+    }
+  }, [hasPendingVoiceMessage, sendVoiceMessage, isSocketConnected]);


Potential race condition when processing the pending voice message. The useEffect checks hasProcessedVoiceMessage.current but another component or re-render could also process the same message if isSocketConnected changes multiple times. Consider using a more robust state management approach, such as moving this flag into the chat store state rather than a component-level ref.
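
A minimal sketch of one way to make the hand-off idempotent no matter how often the effect runs: treat reading and removing the stored value as a single claim step. The helper name `claimPendingVoiceMessage` and the `StorageLike` interface are hypothetical and not part of this PR. Because Web Storage access is synchronous, only the first caller in a tab receives the payload.

```typescript
// Minimal subset of the Web Storage API, declared here so the helper can be
// exercised without a browser. sessionStorage satisfies this shape.
interface StorageLike {
  getItem(key: string): string | null;
  removeItem(key: string): void;
}

// Hypothetical helper: returns the stored value once and removes it in the
// same synchronous step, so any later caller gets null.
function claimPendingVoiceMessage(
  storage: StorageLike,
  key: string,
): string | null {
  const value = storage.getItem(key);
  if (value !== null) {
    storage.removeItem(key);
  }
  return value;
}
```

In the effect this would presumably be called as `claimPendingVoiceMessage(sessionStorage, PENDING_VOICE_MESSAGE_KEY)`, with `sendVoiceMessage` invoked only when the result is not null.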

Copilot · 2026-01-06T10:34:37Z

lib/hooks/use-voice-recorder.ts

+    } catch (err) {
+      if (err instanceof DOMException && err.name === 'NotAllowedError') {
+        setPermissionStatus('denied');
+        setError('Mikrofonzugriff wurde verweigert.');
+      } else {
+        setError('Fehler beim Starten der Aufnahme.');
+      }
+    }


The MediaStream tracks are not stopped when an error occurs during recording setup. If getUserMedia succeeds but MediaRecorder creation fails, the microphone will remain active. Consider calling stream.getTracks().forEach(track => track.stop()) in the error handler to release the media resources (the stream reference must be in scope there).
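
The cleanup described above can be sketched as follows. `openRecorder`, `TrackLike`, and `StreamLike` are hypothetical names introduced for illustration; in the hook the stream would come from `navigator.mediaDevices.getUserMedia` and the recorder from `new MediaRecorder(...)`.

```typescript
// Minimal shapes of the browser objects involved, declared here so the sketch
// is self-contained. A real MediaStream satisfies StreamLike.
interface TrackLike {
  stop(): void;
}
interface StreamLike {
  getTracks(): TrackLike[];
}

// Hypothetical helper: acquires a stream, builds a recorder from it, and stops
// every track again if building the recorder throws.
async function openRecorder<S extends StreamLike, R>(
  getStream: () => Promise<S>,
  createRecorder: (stream: S) => R,
): Promise<{ stream: S; recorder: R }> {
  const stream = await getStream();
  try {
    return { stream, recorder: createRecorder(stream) };
  } catch (err) {
    // Release the microphone before reporting the failure to the caller.
    stream.getTracks().forEach((track) => track.stop());
    throw err;
  }
}
```

The existing catch block in the hook could then keep its permission and error handling unchanged, since the error is rethrown after the tracks are stopped.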

Copilot · 2026-01-06T10:34:37Z

lib/hooks/use-tts-audio.ts

+    } else if (ttsState.status === 'ready' && ttsState.audioBase64) {
+      const audio = new Audio(`data:audio/mp3;base64,${ttsState.audioBase64}`);
+      audioRef.current = audio;
+      audio.play().catch(console.error);


The audio play error is only logged via console.error (line 47), so users won't know if audio playback fails. Consider providing user feedback when audio fails to play, such as displaying a toast notification or error message.
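
A minimal sketch of surfacing the failure, assuming the caller supplies a notification callback. `playWithFeedback` and `PlayableLike` are hypothetical names, and the message text is a placeholder. `HTMLMediaElement.play()` returns a promise that rejects when playback cannot start, which is what the catch branch relies on.

```typescript
// Minimal shape of an audio element for this sketch; HTMLAudioElement fits it.
interface PlayableLike {
  play(): Promise<void>;
}

// Hypothetical helper: starts playback and resolves to true on success.
// On failure it logs the error, notifies the user, and resolves to false.
async function playWithFeedback(
  audio: PlayableLike,
  onError: (message: string) => void,
): Promise<boolean> {
  try {
    await audio.play();
    return true;
  } catch (err) {
    console.error('TTS playback failed:', err);
    onError('Audio playback failed. Please try again.');
    return false;
  }
}
```

The callback could be wired to `toast.error`, which send-voice-message.ts in this PR already uses, and the playing state would only be set when the helper resolves to true.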

Copilot · 2026-01-06T10:34:37Z

lib/hooks/use-tts-audio.ts

+    } else if (ttsState.status === 'ready' && ttsState.audioBase64) {
+      const audio = new Audio(`data:audio/mp3;base64,${ttsState.audioBase64}`);
+      audioRef.current = audio;
+      audio.play().catch(console.error);
+      setTtsPlaying(partyId, messageId);


Multiple Audio instances can be created for the same message. Each time the play function is called with status 'ready', a new Audio object is created (line 45) without cleaning up the previous one. This can lead to:

- Memory leaks from unreleased Audio objects
- Multiple audio tracks playing simultaneously if play is called multiple times quickly

Consider checking if audioRef.current already exists before creating a new Audio instance, or cleaning up the previous instance first.
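
The "clean up the previous instance first" option might look like the sketch below. `createSingleAudioPlayer` and `AudioLike` are hypothetical names; the factory argument stands in for `new Audio(src)` so the logic can run outside a browser.

```typescript
// Minimal shape of an audio element for this sketch; HTMLAudioElement fits it.
interface AudioLike {
  play(): Promise<void>;
  pause(): void;
}

// Hypothetical holder that keeps at most one active audio object: the previous
// one is paused and dropped before a replacement is created.
function createSingleAudioPlayer(create: (src: string) => AudioLike) {
  let current: AudioLike | null = null;

  const stop = (): void => {
    if (current) {
      current.pause();
      current = null;
    }
  };

  const play = (src: string): Promise<void> => {
    stop(); // never leave an earlier instance playing
    current = create(src);
    return current.play();
  };

  return { play, stop };
}
```

In the hook, something like `createSingleAudioPlayer((src) => new Audio(src))` could back `audioRef`, with `stop` also called from the effect cleanup.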

Copilot · 2026-01-06T10:34:40Z

components/chat/chat-messages-view.tsx

+      const binaryString = atob(pendingAudioBase64);
+      const audioBytes = new Uint8Array(binaryString.length);
+      for (let i = 0; i < binaryString.length; i++) {
+        audioBytes[i] = binaryString.charCodeAt(i);
+      }
+      sendVoiceMessage(audioBytes);


The base64 encoding/decoding of audio data could fail with large files or non-standard characters, but there's no error handling. If atob() fails (line 95) due to invalid base64, it will throw an uncaught exception. Consider wrapping the conversion in a try-catch block and showing an appropriate error message to the user.

Suggested change

-      const binaryString = atob(pendingAudioBase64);
-      const audioBytes = new Uint8Array(binaryString.length);
-      for (let i = 0; i < binaryString.length; i++) {
-        audioBytes[i] = binaryString.charCodeAt(i);
-      }
-      sendVoiceMessage(audioBytes);
+      try {
+        const binaryString = atob(pendingAudioBase64);
+        const audioBytes = new Uint8Array(binaryString.length);
+        for (let i = 0; i < binaryString.length; i++) {
+          audioBytes[i] = binaryString.charCodeAt(i);
+        }
+        sendVoiceMessage(audioBytes);
+      } catch (error) {
+        // Handle invalid or corrupted base64 audio data gracefully
+        console.error('Failed to decode pending voice message audio data:', error);
+        if (typeof window !== 'undefined') {
+          window.alert?.('Unable to process the pending voice message. Please try recording again.');
+        }
+      }

Copilot · 2026-01-06T10:34:41Z

components/home/home-input.tsx

+    const base64 = btoa(String.fromCharCode(...audioBytes));
+    sessionStorage.setItem(PENDING_VOICE_MESSAGE_KEY, base64);


The base64 encoding of audio bytes using btoa(String.fromCharCode(...audioBytes)) spreads every byte into a single function call. The byte values themselves are safe (each is at most 255), but a long recording can exceed the engine's argument limit and throw a RangeError, so the message would never be stored. Consider encoding in chunks or using a dedicated base64 encoding library; Buffer.from(audioBytes).toString('base64') only works where Node's Buffer is available.
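
A minimal sketch of a browser-safe alternative, assuming `audioBytes` is a `Uint8Array` as on the chat page. It builds the binary string one byte at a time, so no call receives more than one argument, and it relies only on `btoa` and `atob`. The function names are hypothetical.

```typescript
// Encodes bytes to base64 without spreading them into a single function call.
function bytesToBase64(bytes: Uint8Array): string {
  let binary = '';
  bytes.forEach((byte) => {
    binary += String.fromCharCode(byte);
  });
  return btoa(binary);
}

// Inverse of bytesToBase64; mirrors the decoding loop used on the chat page.
function base64ToBytes(base64: string): Uint8Array {
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}
```

For example, `bytesToBase64(new Uint8Array([72, 105]))` yields `'SGk='`. Note that sessionStorage has its own size quota, so very long recordings may still fail at `setItem`.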

Copilot · 2026-01-06T10:34:41Z

lib/firebase/firebase.ts

    messages: [
      {
-        id: generateUuid(),
+        id: options?.messageId ?? generateMessageId(sessionId),


The code calls generateMessageId for the inner message ID (line 258), even though this function is meant to generate a grouped message ID, not an inner message ID. The function comment on lines 233-234 states that it generates a Firebase document ID for a message in a chat session, but it is used both for grouped messages and for the individual messages within those groups. This could lead to confusion or incorrect ID generation. Consider creating separate functions such as generateGroupedMessageId and generateInnerMessageId for clarity, or update the documentation to state that it works for both levels.

Copilot · 2026-01-06T10:34:41Z

lib/hooks/use-voice-recorder.ts

+      const mediaRecorder = new MediaRecorder(stream, {
+        mimeType: 'audio/webm;codecs=opus',
+        audioBitsPerSecond: 32000,


Hardcoded MIME type 'audio/webm;codecs=opus' may not be supported on all browsers, particularly Safari which has limited WebM support. Consider adding fallback MIME type detection or using a more universally supported format. You could check MediaRecorder.isTypeSupported() to verify browser compatibility before using this format.
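
The fallback detection could be sketched as below. `pickSupportedMimeType` and the candidate list are hypothetical; the candidate order is an assumption for illustration, not a tested compatibility matrix. The support check is passed in as a predicate so the function also runs outside a browser; in the hook it would wrap the static `MediaRecorder.isTypeSupported()` method.

```typescript
// Returns the first candidate the predicate accepts, or undefined so the
// caller can fall back to the browser's default recording format.
function pickSupportedMimeType(
  candidates: readonly string[],
  isSupported: (mimeType: string) => boolean,
): string | undefined {
  return candidates.find((candidate) => isSupported(candidate));
}

// Assumed candidate order: the PR's current format first, then fallbacks.
const RECORDING_MIME_CANDIDATES: readonly string[] = [
  'audio/webm;codecs=opus',
  'audio/webm',
  'audio/mp4',
];
```

A call such as `pickSupportedMimeType(RECORDING_MIME_CANDIDATES, (type) => MediaRecorder.isTypeSupported(type))` would then supply the `mimeType` option, which can be omitted when the result is undefined. The transcription backend would also need to accept whichever format is chosen.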

Copilot · 2026-01-06T10:34:41Z

lib/stores/actions/send-voice-message.ts

+      toast.error('wahl.chat ist nicht verbunden.');
+      return;
+    }
+
+    if (!userId) {
+      toast.error('Benutzer nicht authentifiziert.');


The hardcoded error messages are in German. These should be internationalized or use a consistent language approach with the rest of the codebase. For consistency, consider using a key-based approach or ensure all user-facing messages follow the same internationalization pattern.
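
A key-based approach might look like the sketch below. The catalog, its keys, the English wording, and the `translate` helper are all hypothetical; whether the codebase already has an internationalization pattern is not visible in this diff. Only the two German strings are taken from the PR.

```typescript
// Hypothetical message catalog: one entry per locale, identical keys in each.
const MESSAGES = {
  de: {
    notConnected: 'wahl.chat ist nicht verbunden.',
    notAuthenticated: 'Benutzer nicht authentifiziert.',
  },
  en: {
    notConnected: 'wahl.chat is not connected.',
    notAuthenticated: 'User is not authenticated.',
  },
} as const;

type Locale = keyof typeof MESSAGES;
type MessageKey = keyof (typeof MESSAGES)['de'];

// Looks up a user-facing message by key for the given locale. The compiler
// rejects unknown keys, so a typo cannot reach the UI.
function translate(locale: Locale, key: MessageKey): string {
  return MESSAGES[locale][key];
}
```

The action would then call something like `toast.error(translate(locale, 'notConnected'))` instead of embedding the string.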

RobinFrasch and others added 12 commits November 16, 2025 21:28

Merge pull request #2 from wahl-chat/develop

a73a220

add: Contributions welcome to readme

Merge pull request #3 from wahl-chat/develop

8e5ceee

Develop

Merge pull request #4 from wahl-chat/develop

bb3e189

fix: forgotten description in root layout metadata

Merge pull request #8 from wahl-chat/develop

18bf773

Develop

chore: fix linting and formatting

078f4f6

chore: add .idea to gitignore

8132e73

chore: add pre-commit hook for formatting and linter

de159d0

feat: add basic audio support with voice recording and TTS

92814cd

- Add voice recording functionality - Add text-to-speech (TTS) playback for messages

chore: reorder imports

8c74d24

Merge branch 'feat/formatting' into feat/basic-audio-support

3281b14

fix: unify chat and voice messages and switch to byte arrays for voic…

4ed954b

…e recordings

chore: move firebase updates for transcription to nextjs app

6d4ae3c

vercel bot deployed to Preview – web December 29, 2025 14:54 View deployment

PaulLampe requested review from Antonwy and romman8 January 4, 2026 21:57

romman8 changed the base branch from feat/formatting to develop January 6, 2026 08:33

vercel bot deployed to Preview – embed January 6, 2026 08:39 View deployment

vercel bot had a problem deploying to Preview – web-embedding-example January 6, 2026 08:40 Failure

romman8 assigned PaulLampe Jan 6, 2026

romman8 requested a review from Copilot January 6, 2026 10:27

romman8 added the enhancement New feature or request label Jan 6, 2026

Copilot started reviewing on behalf of romman8 January 6, 2026 10:28 View session

romman8 mentioned this pull request Jan 6, 2026

feat: add basic audio support with voice recording and TTS #10

Closed

Copilot AI reviewed Jan 6, 2026

PaulLampe wants to merge 12 commits into develop from feat/basic-audio-support

5 participants