@unlox775-code-dot-org

@unlox775-code-dot-org unlox775-code-dot-org commented Aug 27, 2025

Summary

This PR fixes the issue where Whispering's audio feedback sounds hijack macOS media controls. Previously, when Whispering played completion sounds, pressing the play/pause button on your keyboard would repeat the Whispering sound instead of controlling your original media (Spotify, Apple Music, etc.).

Type of Change

  • feat: New feature
  • fix: Bug fix
  • docs: Documentation update
  • refactor: Code refactoring (no functional changes)
  • perf: Performance improvement
  • test: Test additions or changes
  • chore: Maintenance tasks
  • style: Code style changes

Related Issue

Closes #722

Changes Made

Problem

When Whispering plays audio feedback sounds (like the "ta-da" completion sound), it hijacks the system's media controls. This means pressing the play/pause button on your keyboard repeats the Whispering sound instead of controlling your original media (Spotify, Apple Music, etc.).

Root Cause

HTML5 <audio> elements automatically register with macOS's media control system when they play, making the web app the "now playing" application.

Solution

Replaced the HTML5 audio elements with the Web Audio API, which plays audio without registering with the system media controls.

Before:

const audio = new Audio('sound.mp3');
audio.play(); // Hijacks media controls!

After:

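const context = new AudioContext();
// Fetch the sound file and decode it into an AudioBuffer.
const response = await fetch('sound.mp3');
const audioBuffer = await context.decodeAudioData(await response.arrayBuffer());
// Route a one-shot buffer source to the speakers and play it.
const source = context.createBufferSource();
source.buffer = audioBuffer;
source.connect(context.destination);
source.start(); // No media control interference!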

Implementation

  • New Web Audio API service (src/lib/services/sound/web-audio.ts)
  • Updated sound service to use Web Audio API instead of HTML5 audio
  • Audio caching for better performance
  • No framework dependencies - pure web standards
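The audio caching mentioned above could look roughly like the following. This is a minimal sketch, not the code in web-audio.ts: `createSoundCache` and the `Loader` type are hypothetical names, and the loader is injected so the cache itself has no browser dependency.

```typescript
// Hypothetical helper: caches loaded sounds by URL so each file is
// fetched and decoded at most once.
type Loader<T> = (url: string) => Promise<T>;

function createSoundCache<T>(load: Loader<T>) {
  // Store the promise, not the result, so concurrent requests share one load.
  const cache = new Map<string, Promise<T>>();
  return {
    get(url: string): Promise<T> {
      let pending = cache.get(url);
      if (!pending) {
        pending = load(url).catch((error) => {
          cache.delete(url); // do not keep failed loads around
          throw error;
        });
        cache.set(url, pending);
      }
      return pending;
    },
    clear(): void {
      cache.clear();
    },
  };
}
```

In a browser, the loader would typically fetch the file and pass the resulting ArrayBuffer to `decodeAudioData` on an `AudioContext`, so the cached value is an `AudioBuffer`.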

Testing

Desktop App Testing

  • Tested on macOS
  • Tested on Windows
  • Tested on Linux
  • Not applicable (web-only change)

General Testing

  • Tested with multiple API providers (if applicable)
  • Verified no API keys are exposed in logs or storage
  • Checked for console errors
  • Tested on different screen sizes (if UI change)

Feature-Specific Testing

Verified that:

  • Audio feedback sounds work correctly
  • Media controls remain functional
  • User experience is unchanged
  • Works consistently across scenarios

Checklist

  • My code follows the project's coding standards (see CONTRIBUTING.md)
  • I've used type instead of interface in TypeScript
  • I've used absolute imports where applicable
  • I've tested my changes thoroughly
  • I've added/updated tests for my changes (if applicable)
  • My changes don't break existing functionality
  • I've updated documentation (if needed)

Screenshots/Recordings

No UI changes - this is a backend audio implementation change that fixes media control hijacking without affecting the user interface.

Additional Notes

Benefits

  • ✅ Audio feedback no longer hijacks media controls
  • ✅ Better performance with cached audio buffers
  • ✅ Cleaner architecture using standard web APIs
  • ✅ Cross-platform compatibility
  • ✅ No complex framework integration to maintain

Files Changed

  • apps/whispering/src/lib/services/sound/web-audio.ts - New Web Audio API implementation
  • apps/whispering/src/lib/services/sound/desktop.ts - Updated to use Web Audio API
  • docs/specs/20250121T143000-macos-media-control-integration.md - Updated specification

@unlox775-code-dot-org unlox775-code-dot-org marked this pull request as ready for review August 28, 2025 00:30
@unlox775-code-dot-org
Author

I found an issue: when you leave the Whispering app running for a long time, the audio system eventually stops working. The app still runs fine when you hit the shortcut key to record and finish the recording, but it doesn't play any audio until you close and reopen the app. I am working to debug this before I request a review again.

@unlox775-code-dot-org
Author

I've worked out the few bugs that I have been able to find. I don't have a Windows or Linux machine to test on, but I think that this is ready to go.

The issue I ran into was that the Web Audio context I was using would lose its connection to the output speakers over time. It would work for a couple of hours, but as soon as you put your computer to sleep or woke it up, it could no longer play sound. Everything else still ran fine; it just did not output sound. This is fixed now: I create a new Web Audio context each time before playing and tear it down once the sound has finished playing.
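The per-playback context described in this comment can be sketched as follows. This is only an illustration under stated assumptions, not the code in this PR: `playOnce`, `SourceLike`, and `ContextLike` are hypothetical names, the structural types cover just the Web Audio members used here so the sketch can run outside a browser, and closing the context after playback is an assumed reading of the fix.

```typescript
// Minimal structural types for the few Web Audio members this sketch needs.
type SourceLike<B> = {
  buffer: B | null;
  onended: ((event: any) => void) | null;
  connect(destination: unknown): unknown;
  start(): void;
};

type ContextLike<B> = {
  destination: unknown;
  createBufferSource(): SourceLike<B>;
  close(): Promise<void>;
};

// Hypothetical helper: plays one buffer on a throwaway context, then closes
// the context so no long-lived audio connection can go stale (assumption).
async function playOnce<B>(
  createContext: () => ContextLike<B>,
  buffer: B,
): Promise<void> {
  const context = createContext();
  try {
    await new Promise<void>((resolve) => {
      const source = context.createBufferSource();
      source.buffer = buffer;
      source.connect(context.destination);
      source.onended = () => resolve(); // fires when playback finishes
      source.start();
    });
  } finally {
    await context.close(); // runs even if start() throws
  }
}
```

In a browser, `createContext` would be something like `() => new AudioContext()` and `buffer` a decoded `AudioBuffer`. The trade-off is a small setup cost on every sound in exchange for never reusing a context that has survived a sleep/wake cycle.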

@unlox775-code-dot-org unlox775-code-dot-org marked this pull request as ready for review October 22, 2025 21:09
@unlox775-code-dot-org unlox775-code-dot-org changed the title from "Fix to make played sounds not interrupt the MacOS media center" to "fix(sound): fix audio feedback hijacking macOS media controls" Oct 22, 2025
@Techie5879

^ This is a really persistent problem that makes me unable to use Whispering. Is there a known temporary fix, or can a fix be pushed for this?

@unlox775-code-dot-org
Author

I have been running with this change locally on my machine for weeks and it works flawlessly. It still plays the start and end recording sounds. As an extra bonus, I added a feature that pauses whatever other music is playing and then automatically resumes it once it's complete. Check it out: #911

I am just waiting for the maintainers to review this PR and merge it in.

Development

Successfully merging this pull request may close these issues.

Make the audio bop-pop sound not connect to the system's play and pause media framework

2 participants