
[Feature] 🌈 BingAPI: implement the whisper-1 model interface #397

@Harry-zklcdc


Describe the problem related to the feature request

Reverse-engineer Bing's speech recognition service to implement the STT (speech-to-text) interface of the whisper-1 model.

Describe the solution you'd like

Request packets

  1. Bing Copilot STT endpoint: wss://sr.bing.com/opaluqu/speech/recognition/dictation/cognitiveservices/v1
  2. Request packet structure:
Path: speech.config
X-RequestId: <UUID>
X-Timestamp: <Timestamp>
Content-Type: application/json

{"context":{"system":{"name":"SpeechSDK","version":"1.15.0-alpha.0.1","build":"JavaScript","lang":"JavaScript"},"os":{"platform":"Browser/MacIntel","name":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36","version":"5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36"},"audio":{"source":{"bitspersample":16,"channelcount":1,"connectivity":"Unknown","manufacturer":"Speech SDK","model":"默认 - MacBook Pro麦克风 (Built-in)","samplerate":16000,"type":"Microphones"}}},"recognition":"conversation"}
Path: speech.context
X-RequestId: <UUID>
X-Timestamp: <Timestamp>
Content-Type: application/json

{}
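
The two text frames above share one layout: header lines, a blank line, then a JSON body. A minimal sketch of assembling such a frame follows. It assumes CRLF line endings (as the `<0d0a>` bytes in the binary capture suggest), a dash-free UUID for `X-RequestId`, and an ISO 8601 UTC timestamp; the issue does not spell out any of these, and `build_text_message` is a hypothetical helper name.

```python
import json
import uuid
from datetime import datetime, timezone


def utc_timestamp() -> str:
    """ISO 8601 UTC timestamp with milliseconds (assumed format)."""
    now = datetime.now(timezone.utc)
    return now.isoformat(timespec="milliseconds").replace("+00:00", "Z")


def build_text_message(path: str, request_id: str, body: dict) -> str:
    """Assemble a text frame: headers, blank line, JSON body."""
    headers = [
        f"Path: {path}",
        f"X-RequestId: {request_id}",
        f"X-Timestamp: {utc_timestamp()}",
        "Content-Type: application/json",
    ]
    # CRLF separators are an assumption carried over from the binary capture.
    return "\r\n".join(headers) + "\r\n\r\n" + json.dumps(body, ensure_ascii=False)


if __name__ == "__main__":
    request_id = uuid.uuid4().hex  # assumed: 32 hex characters, no dashes
    print(build_text_message("speech.context", request_id, {}))
```

The same request ID would presumably be reused for `speech.config`, `speech.context`, and every audio frame of one recognition turn.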

binary

.~Path: audio
<0d0a>
X-RequestId: <UUID>
<0d0a>
X-Timestamp: <Timestamp>
<0d0a>
Content-Type: audio/x-wav
<0d0a>
RIFF
<0000 0000>
WAVEfmt 
<1000 0000 0100 0100 80>
>
<0000 00>
}
<0000 0200 1000>
data
<0000 0000>
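
The payload of this first binary frame reads as a standard 44-byte PCM WAV header with both size fields left at zero, which is typical for a stream of unknown length. Decoding the hex gives 16 kHz (`80 3e 00 00`), mono, 16-bit, and a byte rate of 32000 (`00 7d 00 00`), matching the `audio.source` values in `speech.config`. A sketch that rebuilds the header, with `wav_header` as a hypothetical helper name:

```python
import struct


def wav_header(sample_rate: int = 16000, channels: int = 1, bits: int = 16) -> bytes:
    """Build a 44-byte streaming WAV header (RIFF and data sizes set to 0)."""
    block_align = channels * bits // 8
    byte_rate = sample_rate * block_align
    fmt_chunk = struct.pack(
        "<IHHIIHH",
        16,           # fmt chunk size
        1,            # audio format: 1 = PCM
        channels,
        sample_rate,
        byte_rate,
        block_align,
        bits,
    )
    return (
        b"RIFF" + struct.pack("<I", 0) + b"WAVE"
        + b"fmt " + fmt_chunk
        + b"data" + struct.pack("<I", 0)
    )


if __name__ == "__main__":
    print(wav_header().hex(" "))
```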

binary; each upload packet is 3296 B

audio..X-RequestId: <UUID>
<0d0a>
X-Timestamp: <Timestamp>
<0d0a>
+ wavBinaryData
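
The binary frames can be sketched as follows. The two bytes in front of `Path: audio` in the first capture (`.~`) look like a big-endian 16-bit header length: with a 32-character request ID and a 24-character timestamp the header block is 126 bytes, which is `0x7E`, i.e. `~`. That reading is an inference from the dump, not something the issue states. `frame_audio` and `iter_audio_frames` are hypothetical names, and `chunk_size` is left to the caller because the issue gives only the total packet size.

```python
import struct
from typing import Callable, Iterator, Optional


def frame_audio(request_id: str, timestamp: str, payload: bytes,
                content_type: Optional[str] = None) -> bytes:
    """Prefix the payload with a 2-byte header length and CRLF-terminated headers."""
    lines = [
        "Path: audio",
        f"X-RequestId: {request_id}",
        f"X-Timestamp: {timestamp}",
    ]
    if content_type is not None:
        lines.append(f"Content-Type: {content_type}")
    header = ("\r\n".join(lines) + "\r\n").encode("ascii")
    # Assumed: unsigned 16-bit big-endian length of the header block.
    return struct.pack(">H", len(header)) + header + payload


def iter_audio_frames(request_id: str, timestamp_fn: Callable[[], str],
                      header: bytes, pcm: bytes, chunk_size: int) -> Iterator[bytes]:
    """Yield the WAV-header frame first, then the PCM data in fixed-size chunks."""
    yield frame_audio(request_id, timestamp_fn(), header, "audio/x-wav")
    for offset in range(0, len(pcm), chunk_size):
        yield frame_audio(request_id, timestamp_fn(), pcm[offset:offset + chunk_size])
```

Each yielded value would be sent as one binary WebSocket message. How the end of the audio stream is signaled is not covered by the captures above.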

Describe alternatives you've considered

No response

Additional context

No response

Labels: enhancement (New feature or request)
