watchOS Compatibility for AIProxy #144
base: main
Conversation
@@ -66,7 +66,7 @@ open class AudioPCMPlayer {
     self.inputFormat = _inputFormat
     self.playableFormat = _playableFormat

-    #if !os(macOS)
+    #if os(iOS) || os(visionOS)
What about tvOS?
This could be implemented as:
1. #if os(iOS) || os(tvOS) || os(visionOS)
but, at the risk of enumerating every platform, it could also be implemented as:
2. #if !os(macOS) && !os(watchOS)
Although option 2 is harder to parse.
I haven't actually tested this on tvOS, so that would be the place to start.
Thanks for this effort! I only skimmed it quickly so far. First question: does the model hear its own playback with this method? One of the reasons I dropped down to AudioToolbox in the first place was that I couldn't get the voice processing feature of AVAudioEngine to work reliably, which I needed in order to prevent the model from hearing itself. I see that you didn't use voice processing in any capacity, so I'm wondering if this somehow isn't an issue on watchOS. (I do have a watch kicking around; hopefully it still takes a charge so I can test this out :))
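For reference, and not part of this PR: the AVAudioEngine feature mentioned above is the voice-processing mode on the engine's I/O node. A minimal sketch, assuming a standalone AVAudioEngine instance; whether it behaves reliably is exactly the concern raised in the comment:

import AVFoundation

// Minimal sketch (not part of this PR): put AVAudioEngine's I/O unit into
// voice-processing mode, which applies echo cancellation so microphone capture
// excludes the app's own playback. Availability and reliability vary by
// platform and OS version.
func enableVoiceProcessing(on engine: AVAudioEngine) {
    do {
        // Typically set before the engine starts; the setting applies to the
        // engine's shared input/output audio unit.
        try engine.inputNode.setVoiceProcessingEnabled(true)
    } catch {
        print("Could not enable voice processing: \(error)")
    }
}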
     frameCapacity: AVAudioFrameCount(targetFormat.sampleRate / inputFormat.sampleRate * Double(buffer.frameLength))
 ) else { return }

 // Convert the buffer to target sample rate
I think you are going to have some problems at runtime with this. Does the model sometimes seem confused when you are speaking? There may be pops in the audio. This is a whole rabbit hole of a thing, but it comes from trying to convert sample rates using AVAudioConverter. Much of the sample code on the internet is incorrect for converting between sample rates (if the sample rates are the same, then the task is much easier).
I have a method here that has been working well for us:
https://github.com/lzell/AIProxySwift/blob/main/Sources/AIProxy/MicrophonePCMSampleVendor.swift#L304
Can you reuse that?
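For context, here is a sketch of the general single-buffer pattern, not necessarily identical to the linked method: when the sample rates differ, AVAudioConverter has to be driven through convert(to:error:withInputFrom:), and the input block must hand over the source buffer exactly once and report .noDataNow afterwards. Much of the incorrect sample code online returns the same buffer with .haveData on every callback, which is one source of the pops described above.

import AVFoundation

// Sketch of one correct way to resample a single PCM buffer. Assumes
// `inputFormat` and `targetFormat` are valid PCM formats. For streaming audio
// you would keep the AVAudioConverter alive across calls so it preserves its
// internal filter state between buffers.
func resample(_ buffer: AVAudioPCMBuffer,
              from inputFormat: AVAudioFormat,
              to targetFormat: AVAudioFormat) -> AVAudioPCMBuffer? {
    guard let converter = AVAudioConverter(from: inputFormat, to: targetFormat) else {
        return nil
    }
    let ratio = targetFormat.sampleRate / inputFormat.sampleRate
    let capacity = AVAudioFrameCount(ratio * Double(buffer.frameLength))
    guard let outBuffer = AVAudioPCMBuffer(pcmFormat: targetFormat, frameCapacity: capacity) else {
        return nil
    }

    var consumed = false
    var conversionError: NSError?
    let status = converter.convert(to: outBuffer, error: &conversionError) { _, outStatus in
        // Supply the source buffer once, then report that no more data is
        // available. Returning the same buffer repeatedly makes the converter
        // re-read samples and produces audible artifacts.
        if consumed {
            outStatus.pointee = .noDataNow
            return nil
        }
        consumed = true
        outStatus.pointee = .haveData
        return buffer
    }
    guard status != .error, conversionError == nil else { return nil }
    return outBuffer
}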
So to be clear, my initial focus has just been to get this to compile successfully on watchOS (which it now does); next I want to move on to testing the actual audio functionality. Since you're more familiar with the implementation, if you're set up to do that, then you're a couple of steps ahead of me.
I think at a high level, the way that I've sort of #ifdef'd the different platforms is OK, but the implementation within that is totally up for grabs. So if you've got working code that could be backported into this structure, I think that's the way to go for sure.
Wow, it is seriously an effort to get Xcode to actually attach to an Apple Watch, huh? I have been trying to test this out all morning 🫠
watchOS Compatibility for AIProxy
Overview
This PR restores watchOS compatibility to AIProxy while maintaining full functionality on other platforms. The main focus was getting the library to compile and function on watchOS by addressing platform-specific limitations around audio capture and device information access.
Key Changes
1. Audio Session Configuration in AudioPCMPlayer
- .defaultToSpeaker option (not available on watchOS); see the sketch after this list
2. Device Information in ClientLibErrorLogger
3. Microphone Implementation for watchOS
- MicrophonePCMSampleVendor for watchOS
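As a concrete illustration of item 1, here is a minimal, hypothetical sketch of the kind of platform guard involved; the actual category, mode, and option set used in AudioPCMPlayer may differ:

#if os(iOS) || os(visionOS) || os(watchOS)
import AVFoundation

// Hypothetical sketch: .defaultToSpeaker is unavailable on watchOS, so the
// category options are chosen per platform. Not the exact AudioPCMPlayer code.
func configurePlaybackSession() throws {
    let session = AVAudioSession.sharedInstance()
    #if os(watchOS)
    try session.setCategory(.playAndRecord, mode: .default, options: [])
    #else
    try session.setCategory(.playAndRecord, mode: .default, options: [.defaultToSpeaker])
    #endif
    // On watchOS, activating a record-capable session is typically done with the
    // asynchronous activate(options:completionHandler:) rather than setActive(true).
}
#endif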
Testing
I've successfully integrated this fork into a watchOS app that communicates with OpenAI's API. The token protection functionality works correctly, and the microphone implementation compiles properly on watchOS, though audio capture itself has not been tested.
Future Improvements
Motivation
As more developers want to use AIProxy to protect API tokens in watchOS apps, these changes make it possible to use the library there without complicated workarounds. This PR aims to expand the reach of AIProxy to watchOS while maintaining its security features.
I've made a first pass to get it compiling and working on watchOS, but welcome any feedback or requests for additional tests/changes before merging.