watchOS Compatibility for AIProxy #144


Open · wants to merge 3 commits into base: main
Conversation

richarddas


Overview

This PR restores watchOS compatibility to AIProxy while maintaining full functionality on other platforms. The main focus was getting the library to compile and function on watchOS by addressing platform-specific limitations around audio capture and device information access.

Key Changes

1. Audio Session Configuration in AudioPCMPlayer

  • Modified the audio session configuration for different platforms
  • Added watchOS-specific implementation without .defaultToSpeaker option (not available on watchOS)
  • Added conditional Bluetooth support for watchOS 11+ using availability checks
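A minimal sketch of what this platform split could look like. This is not the PR's actual code: the mode, the specific Bluetooth option, and the watchOS 11 availability threshold are illustrative assumptions based on the bullet points above.

```swift
import AVFoundation

func configureAudioSession() throws {
    let session = AVAudioSession.sharedInstance()
#if os(watchOS)
    // .defaultToSpeaker is unavailable on watchOS, so it is omitted here.
    if #available(watchOS 11.0, *) {
        // Conditionally enable Bluetooth routing on newer watchOS releases.
        try session.setCategory(.playAndRecord, mode: .voiceChat, options: [.allowBluetooth])
    } else {
        try session.setCategory(.playAndRecord, mode: .voiceChat)
    }
#else
    try session.setCategory(.playAndRecord, mode: .voiceChat, options: [.defaultToSpeaker])
#endif
    try session.setActive(true)
}
```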

2. Device Information in ClientLibErrorLogger

  • Replaced UIDevice usage on watchOS with generic values
  • Used "Apple Watch" for device model and "watchOS" for system name
  • Maintains the same functionality while avoiding reliance on unavailable APIs
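A hypothetical sketch of the substitution described above. The type and property names are illustrative; the watchOS branch simply hardcodes generic values since `UIDevice` is unavailable there.

```swift
#if canImport(UIKit) && !os(watchOS)
import UIKit
#endif

enum DeviceInfo {
    static var model: String {
#if os(watchOS)
        // UIDevice is unavailable on watchOS; use a generic value instead.
        return "Apple Watch"
#elseif canImport(UIKit)
        return UIDevice.current.model
#else
        return "Mac"
#endif
    }

    static var systemName: String {
#if os(watchOS)
        return "watchOS"
#elseif canImport(UIKit)
        return UIDevice.current.systemName
#else
        return "macOS"
#endif
    }
}
```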

3. Microphone Implementation for watchOS

  • Added conditional import for AudioToolbox (not available on watchOS)
  • Implemented an AVAudioEngine-based version of MicrophonePCMSampleVendor for watchOS
  • Ensured the same API interface for both implementations to maintain compatibility
  • Maintained the same output format (24000Hz PCM16 mono) for OpenAI compatibility
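A rough sketch of what an `AVAudioEngine`-based vendor along these lines could look like. The class shape and method names are hypothetical, not the PR's actual implementation, and the review discussion below notes that this kind of naive per-buffer sample-rate conversion with `AVAudioConverter` can introduce audio artifacts.

```swift
import AVFoundation
#if !os(watchOS)
import AudioToolbox  // not available on watchOS
#endif

#if os(watchOS)
final class MicrophonePCMSampleVendor {
    private let engine = AVAudioEngine()
    // Output format matching the other platforms: 24000Hz, PCM16, mono.
    private let targetFormat = AVAudioFormat(
        commonFormat: .pcmFormatInt16,
        sampleRate: 24_000,
        channels: 1,
        interleaved: false
    )!

    func start(onSample: @escaping (AVAudioPCMBuffer) -> Void) throws {
        let input = engine.inputNode
        let inputFormat = input.outputFormat(forBus: 0)
        guard let converter = AVAudioConverter(from: inputFormat, to: targetFormat) else { return }
        input.installTap(onBus: 0, bufferSize: 4096, format: inputFormat) { buffer, _ in
            // Size the output buffer proportionally to the sample-rate ratio.
            let capacity = AVAudioFrameCount(
                self.targetFormat.sampleRate / inputFormat.sampleRate * Double(buffer.frameLength)
            )
            guard let out = AVAudioPCMBuffer(pcmFormat: self.targetFormat,
                                             frameCapacity: capacity) else { return }
            var consumed = false
            converter.convert(to: out, error: nil) { _, outStatus in
                // Hand the converter exactly one input buffer per callback.
                if consumed {
                    outStatus.pointee = .noDataNow
                    return nil
                }
                consumed = true
                outStatus.pointee = .haveData
                return buffer
            }
            onSample(out)
        }
        try engine.start()
    }

    func stop() {
        engine.inputNode.removeTap(onBus: 0)
        engine.stop()
    }
}
#endif
```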

Testing

I've successfully integrated this fork into a watchOS app that communicates with OpenAI's API. The token protection functionality works correctly, and the microphone implementation compiles properly on watchOS, though audio capture functionality itself has not been tested.

Future Improvements

  • More comprehensive testing across all supported platforms
  • Potential optimizations for the watchOS audio implementation
  • Consider more robust error handling for watchOS-specific limitations

Motivation

As more developers use AIProxy for protecting API tokens in watchOS apps, these changes make it possible to use the library without complicated workarounds. This PR aims to expand the reach of AIProxy to watchOS while maintaining its security features.

I've made a first pass to get it compiling and working on watchOS, but welcome any feedback or requests for additional tests/changes before merging.

```diff
@@ -66,7 +66,7 @@ open class AudioPCMPlayer {
         self.inputFormat = _inputFormat
         self.playableFormat = _playableFormat

-#if !os(macOS)
+#if os(iOS) || os(visionOS)
```


What about tvOS?

Author

This could be implemented as:

```swift
#if os(iOS) || os(tvOS) || os(visionOS)
```

but at the risk of enumerating every platform, it could also be implemented as:

```swift
#if !os(macOS) && !os(watchOS)
```

Although the second is harder to parse.

I haven't actually tested this on tvOS, so that would be the place to start.

lzell (Owner) commented Apr 22, 2025

Thanks for this effort! I only skimmed it quickly so far. First question: does the model hear its own playback with this method? One of the reasons I dropped to AudioToolbox in the first place was because I couldn't get the voice processing feature of AVAudioEngine to work reliably, which I needed to prevent the model from hearing itself. I see that you didn't use voice processing in any capacity, so I'm wondering if this somehow isn't an issue w/ watchOS.

(I do have a watch kicking around, hopefully it still takes a charge so I can test this out :))

```swift
    frameCapacity: AVAudioFrameCount(targetFormat.sampleRate / inputFormat.sampleRate * Double(buffer.frameLength))
) else { return }

// Convert the buffer to target sample rate
```
Owner

I think you are going to have some problems at runtime with this. Does the model sometimes seem confused when you are speaking? There may be pops in the audio. This is a whole rabbit hole of a thing, but it comes from trying to convert sample rates using AVAudioConverter. Much of the sample code on the internet is incorrect for converting between sample rates (if the sample rates are the same, then the task is much easier).

I have a method here that has been working well for us:
https://github.com/lzell/AIProxySwift/blob/main/Sources/AIProxy/MicrophonePCMSampleVendor.swift#L304

Can you reuse that?

Author

So to be clear, my initial focus has just been to get this to compile successfully on watchOS (which it now does), and so next I want to move on to testing out the actual audio functionality. Since you're more familiar with the implementation, if you are set up to do that, then you're a couple steps ahead of me.

Author

I think at a high level, the way I've #ifdef'd the different platforms is OK, but the implementation within that is totally up for grabs. So if you've got working code that could be backported into this structure, I think that's the way to go for sure.

lzell (Owner) commented Apr 25, 2025

Wow it is seriously an effort to get Xcode to actually attach to an Apple watch, huh? I have been trying to test this out all morning 🫠

3 participants