Weird compressed audio while reproducing Chat Completion's AudioOutput generated AudioClip. Am I missing something? #358
Hi! Thank you for this super dense package. Today I was testing the Chat Completion audio features with GPT-4o mini audio. Once I play the returned

Here are the main snippets that manage the chat request.

    public async void Respond()
    {
        var message = inputTextField.text;
        onRequestReceived?.Invoke(message);
        var (answer, audioClip) = await _languageProcessor.RespondTo("User", message);
        onRequestCompleted?.Invoke(answer);
        answerTextField.text = answer;
        // Play the clip returned alongside the text answer
        audioSource.clip = audioClip;
        audioSource.Play();
    }

    public async Task<(string, AudioClip)> RespondTo(string speakerName, string message, AudioFormat audioFormat = AudioFormat.Pcm16)
    {
        Conversation.AppendMessage(new Message(Role.User, message, speakerName));
        var chatRequest = new ChatRequest(Conversation.Messages, model: Model, audioConfig: new AudioConfig(Voice, audioFormat));
        var response = await PerformChatCompletion(chatRequest);
        // AudioOutput is null-checked because the unauthenticated fallback message below carries no audio
        return (response, response.AudioOutput?.AudioClip);
    }

    protected virtual async Task<Message> PerformChatCompletion(ChatRequest chatRequest)
    {
        if (!Authenticated)
        {
            Log($"[{GetType().Name}] LanguageProcessor not ready");
            return new Message(Role.Assistant, "");
        }

        Log($"[{GetType().Name}] Responding...");
        var start = DateTime.Now;
        var response = await Api.ChatEndpoint.GetCompletionAsync(chatRequest);
        var latency = DateTime.Now.Subtract(start).TotalMilliseconds;
        var choice = response.FirstChoice.Message;
        Log($"[{GetType().Name}] Request latency: {latency:0.0}ms | Finish reason: {response.FirstChoice.FinishReason}\nResponse: {choice}");

        // Add new response to the history
        Conversation.AppendMessage(choice);
        return choice;
    }

As you can see, I left the audio format at the default, PCM (I tried using MP3, but the result was only noise, with warnings in the console). I can tell you that on startup my

Am I missing something? I tried to take a look at your two samples, but one is for Realtime models, and the Chat one uses the SpeechGeneration endpoint to generate audio.

Bonus question: is it possible to manage the audio response in the ChatCompletion endpoint even using
Replies: 1 comment 5 replies
Thanks for the kind words!
Yes, I made a deliberate decision to support only PCM, for a number of reasons. Mainly, Unity doesn't support MP3 streaming on all build platform targets. Working with the audio system in Unity can be quite challenging. I ended up writing my own
Yes, but I don't believe that is what I'm doing by default in my Chat demo scene.
Yes, but currently only with models that support speech.
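
For readers following the PCM discussion above, here is a minimal sketch of how raw 16-bit PCM bytes can be turned into the float samples that Unity audio clips work with. It is not the package's actual implementation: `Pcm16Decoder` and `ToFloatSamples` are hypothetical names, and the sketch assumes the data is signed, little-endian, 16-bit PCM.

```csharp
using System;

// Hypothetical helper, not part of the package discussed in this thread.
public static class Pcm16Decoder
{
    // Converts signed 16-bit little-endian PCM bytes into float samples in [-1, 1).
    public static float[] ToFloatSamples(byte[] pcmBytes)
    {
        if (pcmBytes == null)
        {
            throw new ArgumentNullException(nameof(pcmBytes));
        }

        if (pcmBytes.Length % 2 != 0)
        {
            throw new ArgumentException("PCM16 data must contain an even number of bytes.", nameof(pcmBytes));
        }

        var samples = new float[pcmBytes.Length / 2];

        for (var i = 0; i < samples.Length; i++)
        {
            // Low byte first, then high byte; the cast to short restores the sign.
            var value = (short)(pcmBytes[2 * i] | (pcmBytes[2 * i + 1] << 8));
            // Scale the 16-bit integer range down to floats.
            samples[i] = value / 32768f;
        }

        return samples;
    }
}
```

In Unity, samples like these could be loaded with `AudioClip.Create(name, lengthSamples, channels, frequency, stream)` followed by `clip.SetData(samples, 0)`. The `frequency` passed there has to match the rate the audio was generated at (assumed here to be 24 kHz mono for this API's PCM16 output, which is worth checking against the current documentation); a mismatch between the two is one general way playback can end up sounding wrong.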
@HunterProduction should be fixed in