composer.dictation exits recording UI on Safari before transcription #198

## Description

`composer.dictation` is not usable in Safari for our custom-backend ChatKit integration. The microphone button renders in the composer and Safari requests microphone permission. After permission is granted, ChatKit briefly enters the recording UI, then the stop button/waveform disappear after about one second and the input returns to text mode. No usable transcription is produced.

This is different from #179, which covered hosted-backend transcription support. This integration uses a custom backend and implements `input.transcribe`.

## Expected behavior

After tapping the microphone button, the composer should remain in dictation mode with the waveform and stop button visible until the user stops recording or the max duration is reached. The recorded audio should then be sent to `backend.input.transcribe` / `input.transcribe`.

## Actual behavior

On Safari, the UI appears to enter dictation mode briefly, then resets to text mode. From the user's perspective, the waveform and stop button disappear almost immediately and dictation is non-functional.

## Environment

- `@openai/chatkit-react`: `1.5.1`
- Transitive `@openai/chatkit`: `1.7.0`
- ChatKit web component script: `https://cdn.platform.openai.com/deployments/chatkit/chatkit.js`
- Backend mode: custom backend using `api.url` + `domainKey`, not hosted Agent Builder backend
- Dictation enabled with:

```ts
composer: {
  dictation: { enabled: true }
}
```

## Backend support

The custom backend implements `input.transcribe` and accepts the audio payload shape:

```ts
{
  type: "input.transcribe",
  params: {
    audio_base64: string,
    mime_type: string
  }
}
```

The backend accepts common MediaRecorder/Safari MIME types, including `audio/webm`, `audio/ogg`, `audio/mp4`, `audio/m4a`, and `audio/wav`, then forwards the audio to OpenAI for transcription.
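A minimal sketch of that check follows, assuming the backend strips MIME parameters such as `;codecs=opus` before comparing against the accepted list. `normalizeMimeType` and `isAcceptedAudioType` are illustrative names for this issue, not ChatKit APIs.

```ts
// Illustrative sketch only: these names are ours, not part of ChatKit.
const ACCEPTED_AUDIO_TYPES = new Set<string>([
  "audio/webm",
  "audio/ogg",
  "audio/mp4",
  "audio/m4a",
  "audio/wav",
]);

// Drop parameters such as ";codecs=opus" and normalize case and whitespace.
function normalizeMimeType(mimeType: string): string {
  const [base = ""] = mimeType.split(";");
  return base.trim().toLowerCase();
}

// True when the base type (without parameters) is in the accepted list.
function isAcceptedAudioType(mimeType: string): boolean {
  return ACCEPTED_AUDIO_TYPES.has(normalizeMimeType(mimeType));
}
```

With this shape, any of the three candidates ChatKit might send would pass the backend check, so the backend should not be the component rejecting the recording.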

## Bundle investigation

The current CDN bundle appears to hardcode dictation MIME selection roughly as:

```js
["audio/webm;codecs=opus", "audio/mp4", "audio/ogg;codecs=opus"]
  .find(MediaRecorder.isTypeSupported)
```

The recorder path also uses:

- `navigator.mediaDevices.getUserMedia({ audio: true })`
- `new AudioContext()` for waveform analysis
- `new MediaRecorder(stream, { mimeType, audioBitsPerSecond: 24000 })`

There does not appear to be a public `composer.dictation.mimeType` or Safari-specific override in the typed ChatKit options.
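To help narrow this down, a probe along these lines could be run on the host page in Safari to see which candidate is reported as supported. The candidate list mirrors the one that appears in the bundle. `pickRecorderMimeType` is a hypothetical helper, and its `isSupported` parameter stands in for `MediaRecorder.isTypeSupported` so the function can also be exercised outside a browser.

```ts
// Candidate order copied from what the CDN bundle appears to use.
const CANDIDATES: readonly string[] = [
  "audio/webm;codecs=opus",
  "audio/mp4",
  "audio/ogg;codecs=opus",
];

// Hypothetical helper: returns the first supported candidate, or undefined
// when none is supported (or when the support check itself throws).
function pickRecorderMimeType(
  isSupported: (mimeType: string) => boolean,
  candidates: readonly string[] = CANDIDATES,
): string | undefined {
  for (const candidate of candidates) {
    try {
      if (isSupported(candidate)) return candidate;
    } catch {
      // Treat a throwing support check as "unsupported" and keep going.
    }
  }
  return undefined;
}

// In a browser console:
// console.log(pickRecorderMimeType((t) => MediaRecorder.isTypeSupported(t)));
```

Returning `undefined` instead of throwing would let a caller fall back to `new MediaRecorder(stream)` with no `mimeType`, so the browser picks its own default container.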

## Why this matters

Host apps cannot preserve the built-in ChatKit composer dictation UI while working around this externally, because the recorder/waveform UI runs inside ChatKit's iframe. Replacing it with a host-page mic button changes the product UI and loses the built-in waveform/stop behavior.

## Requested help

Can ChatKit do one of the following:

- handle Safari's MediaRecorder/AudioContext behavior more defensively;
- expose a `composer.dictation.mimeType` option or a MIME preference override;
- emit a structured `chatkit.error`/`chatkit.log` event with the underlying recorder failure; or
- document Safari/iOS support limitations for `composer.dictation`?
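As a rough illustration of the first and third options, the sketch below resumes a suspended audio context before wiring up the waveform and turns a recorder failure into a small structured payload. It rests on assumptions: an `AudioContext` can start in the `"suspended"` state when it is not created or resumed during a user gesture, but we have not confirmed that this is what happens inside the iframe. `AudioContextLike`, `RecorderFailure`, `ensureRunning`, and `describeRecorderError` are hypothetical names, and the payload shape is not an existing ChatKit event format.

```ts
// Structural stand-in for the parts of AudioContext used here, so the
// sketch does not depend on DOM typings.
interface AudioContextLike {
  state: string;
  resume(): Promise<void>;
}

// Hypothetical payload shape; ChatKit does not currently define this.
interface RecorderFailure {
  stage: "getUserMedia" | "audioContext" | "mediaRecorder";
  name: string;
  message: string;
}

// Resume the context if the browser created it in the suspended state.
async function ensureRunning(ctx: AudioContextLike): Promise<void> {
  if (ctx.state === "suspended") {
    await ctx.resume();
  }
}

// Convert any thrown value into a serializable description of the failure.
function describeRecorderError(
  stage: RecorderFailure["stage"],
  error: unknown,
): RecorderFailure {
  if (error instanceof Error) {
    return { stage, name: error.name, message: error.message };
  }
  return { stage, name: "UnknownError", message: String(error) };
}
```

Even if the root cause turns out to be something else, a payload like this surfaced through `chatkit.error` or `chatkit.log` would let host apps report the failing stage and error name instead of only observing the UI reset.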

