Description
**Is your feature request related to a problem? Please describe.**
OpenAI has just extended its API with real-time support over WebSockets:
https://openai.com/index/introducing-the-realtime-api/?s=09
**Describe the solution you'd like**
LocalAI should support backends with voice capabilities and introduce an API endpoint compatible with OpenAI clients.
Ideally it should also support function calling, as OpenAI does:

> Under the hood, the Realtime API lets you create a persistent WebSocket connection to exchange messages with GPT-4o. The API supports function calling, which makes it possible for voice assistants to respond to user requests by triggering actions or pulling in new context. For example, a voice assistant could place an order on behalf of the user or retrieve relevant customer information to personalize its responses.
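As a rough sketch of what a compatible endpoint would need to accept, the snippet below builds a `session.update` client event (the event type linked in the API docs below) that enables audio and registers a tool for function calling. The tool name `get_customer_info` and all field values are illustrative assumptions, not part of any LocalAI API:

```python
import json

# Hypothetical sketch of a Realtime API "session.update" client event.
# Every value below is illustrative; only the general shape follows
# OpenAI's documented client events.
session_update = {
    "type": "session.update",
    "session": {
        "modalities": ["text", "audio"],
        "voice": "alloy",
        "instructions": "You are a helpful voice assistant.",
        # Function calling: tools the model may invoke mid-conversation,
        # e.g. to pull in customer context (see the quoted paragraph above).
        "tools": [
            {
                "type": "function",
                "name": "get_customer_info",  # illustrative function name
                "description": "Look up customer details to personalize replies.",
                "parameters": {
                    "type": "object",
                    "properties": {"customer_id": {"type": "string"}},
                    "required": ["customer_id"],
                },
            }
        ],
        "tool_choice": "auto",
    },
}

# Over the persistent WebSocket, each client event is sent as one JSON text frame.
frame = json.dumps(session_update)
```

A LocalAI implementation would then need to parse frames like this on the server side and route the tool definitions to the backend, mirroring how function calling already works in the chat endpoint.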
It seems the Chat Completions API is also going to get audio input/output, but the API specs are not available yet:

> Audio in the Chat Completions API will be released in the coming weeks, as a new model `gpt-4o-audio-preview`. With `gpt-4o-audio-preview`, developers can input text or audio into GPT-4o and receive responses in text, audio, or both.
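Since the specs are unpublished, the request body below is purely a guess at what an audio-capable chat completion might look like, extrapolated from the announcement; the `modalities` and `audio` fields are assumptions:

```python
import json

# Hypothetical Chat Completions request with audio output.
# The specs were not published at the time of writing, so the
# "modalities" and "audio" fields here are assumptions based on
# OpenAI's announcement, not a documented schema.
chat_request = {
    "model": "gpt-4o-audio-preview",
    "modalities": ["text", "audio"],  # request both output types
    "audio": {"voice": "alloy", "format": "wav"},  # assumed audio options
    "messages": [
        {"role": "user", "content": "Summarize this in one spoken sentence."}
    ],
}
body = json.dumps(chat_request)
```

If the final schema looks anything like this, LocalAI could reuse the existing chat endpoint and add audio handling behind the same request shape.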
**Describe alternatives you've considered**

**Additional context**
API docs:
- https://platform.openai.com/docs/guides/realtime
- https://platform.openai.com/docs/api-reference/realtime-client-events/session-update

Related tooling (gRPC over WebSocket):
- https://github.com/tmc/grpc-websocket-proxy
- https://github.com/openconfig/grpctunnel

Work-in-progress branch:
- https://github.com/mudler/LocalAI/tree/feat/realtime
Open-source models that can handle real-time speech: