What does Speech-To-Text do, and can I trust it?

I compared the RealtimeAPI's speech-to-text with the browser's SpeechRecognition capabilities.

Has anyone else had the experience that the RealtimeAPI STT outputs babelfish (gibberish)?
In addition, although I set the language for the RealtimeAPI to `"de"`, it doesn't restrict recognition to German. That is why we get interesting transcriptions in other languages.
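
For reference, a minimal sketch of the kind of language setting meant here. It assumes the RealtimeAPI in question is OpenAI's Realtime API and that transcription is configured through a `session.update` event. The helper name `buildTranscriptionUpdate` is hypothetical, and the field names follow the beta event shape as I understand it, so they may differ in your API version or wrapper library.

```typescript
// Hypothetical helper: builds a `session.update` event that pins input
// transcription to a single language. Field names are an assumption based on
// the beta shape of OpenAI's Realtime API; verify them against your version.
interface TranscriptionSessionUpdate {
  type: "session.update";
  session: {
    input_audio_transcription: {
      model: string;
      language: string; // ISO-639-1 code, e.g. "de"
    };
  };
}

function buildTranscriptionUpdate(
  language: string,
  model = "whisper-1",
): TranscriptionSessionUpdate {
  return {
    type: "session.update",
    session: { input_audio_transcription: { model, language } },
  };
}

// Serialized form that would be sent over an already-open Realtime connection.
const payload = JSON.stringify(buildTranscriptionUpdate("de"));
```

If the transcripts still come back in other languages with a payload like this, it would be worth logging the session event the server echoes back to confirm the language setting was actually applied.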

Here are examples:

Actually said (asking for the weather in German): "Wie ist das Wetter in Berlin?"

> It takes quite a fight.

> That's not how we do it.

>  Pa pa, super spada

> 수고하셨습니다.

>  I don't know what I'm doing with my life.

I'm a bit worried about GPT.


For comparison, here is the browser's SpeechRecognition:

> denkst du, du bist im moment ne?

>  aber thomas anrufen?

> hallo.

Correct wake word

> auch ernsthaft.

Actually said: "Ach, ernsthaft!"
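
For anyone who wants to reproduce the browser side, here is a rough sketch using the Web Speech API with the language set to German. The helper `createGermanRecognizer` and the small interfaces are hypothetical names introduced for illustration; the constructor is passed in so the same code works with `SpeechRecognition` or the prefixed `webkitSpeechRecognition`, depending on the browser.

```typescript
// Minimal structural types so the sketch does not depend on DOM typings.
interface RecognitionEventLike {
  resultIndex: number;
  results: ArrayLike<ArrayLike<{ transcript: string }> & { isFinal: boolean }>;
}

interface MinimalRecognition {
  lang: string;
  continuous: boolean;
  interimResults: boolean;
  onresult: ((event: RecognitionEventLike) => void) | null;
  start(): void;
}

// Hypothetical helper: configures a recognizer for German and reports
// only final transcripts through the callback.
function createGermanRecognizer(
  Ctor: new () => MinimalRecognition,
  onTranscript: (text: string) => void,
): MinimalRecognition {
  const rec = new Ctor();
  rec.lang = "de-DE"; // BCP 47 language tag
  rec.continuous = true; // keep listening across utterances
  rec.interimResults = false; // skip partial hypotheses
  rec.onresult = (event) => {
    const result = event.results[event.resultIndex];
    if (result.isFinal) onTranscript(result[0].transcript);
  };
  return rec;
}

// Only starts in a browser that exposes the Web Speech API.
const g = globalThis as any;
const BrowserRecognition = g.SpeechRecognition ?? g.webkitSpeechRecognition;
if (BrowserRecognition) {
  createGermanRecognizer(BrowserRecognition, (text) => console.log(text)).start();
}
```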

The browser's speech recognition is much closer to what is actually said.
I think the reason the LLM doesn't always understand me is that the STT produces babelfish.

Has anyone else had this experience?
