`GrokClient` is primarily backed by generated gRPC protocol clients, but voice features use xAI's documented REST/WebSocket endpoints because there are no generated voice protocol types in `src\xAI.Protocol`.

- Voice REST calls use `GrokClient.HttpHandler` (backed by the `httpHandlers` cache), a plain `SocketsHttpHandler` + Polly pipeline that is separate from the gRPC channel. `ChannelHandler` returns `ChannelBase` only; there is no `.Handler` property on it.
- `AsITextToSpeechClient` returns an `ITextToSpeechClient` implementation that uses `POST /v1/tts` for unary audio and `wss://.../v1/tts` for streaming audio.
- `AsISpeechToTextClient` returns an `ISpeechToTextClient` implementation that uses `POST /v1/stt` for file transcription and `wss://.../v1/stt` for raw-audio streaming transcription.
- TTS defaults follow xAI docs: voice `eve`, language `en` when omitted by `TextToSpeechOptions`, and MP3 output when no codec is specified.
- STT streaming defaults follow xAI docs: encoding `pcm` and sample rate `16000` when omitted; WebSocket input must be raw encoded audio, not MP3/WAV container bytes.
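
The raw-audio requirement for STT streaming means a WAV file cannot be forwarded byte-for-byte, because its RIFF header and chunk metadata would be read as audio. The following is a minimal sketch of one way to pull the sample payload out of a WAV buffer before streaming it. `WavPayload` and `ExtractData` are hypothetical names used only for illustration and are not part of this library. The sketch assumes a well-formed RIFF/WAVE buffer whose `data` chunk holds uncompressed PCM, and it does not check the `fmt ` chunk.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical helper, not part of the library: returns the payload of the
// "data" chunk from a RIFF/WAVE buffer. Assumes the payload is uncompressed PCM.
public static class WavPayload
{
    public static ReadOnlyMemory<byte> ExtractData(ReadOnlyMemory<byte> wav)
    {
        ReadOnlySpan<byte> span = wav.Span;

        // A RIFF/WAVE buffer starts with "RIFF", a 4-byte size, then "WAVE".
        if (span.Length < 12 ||
            !span.Slice(0, 4).SequenceEqual("RIFF"u8) ||
            !span.Slice(8, 4).SequenceEqual("WAVE"u8))
        {
            throw new ArgumentException("Not a RIFF/WAVE buffer.", nameof(wav));
        }

        int offset = 12;

        // Walk the chunk list: each chunk is a 4-byte id, a little-endian
        // 4-byte size, then the payload (padded to an even length).
        while (offset + 8 <= span.Length)
        {
            ReadOnlySpan<byte> id = span.Slice(offset, 4);
            uint size = BinaryPrimitives.ReadUInt32LittleEndian(span.Slice(offset + 4, 4));
            int start = offset + 8;

            if (id.SequenceEqual("data"u8))
            {
                // Clamp in case the header declares more bytes than are present.
                int length = (int)Math.Min(size, (uint)(span.Length - start));
                return wav.Slice(start, length);
            }

            // Skip this chunk, including the pad byte after odd-sized payloads.
            long next = (long)start + size + (size & 1);
            if (next > span.Length)
            {
                break;
            }

            offset = (int)next;
        }

        throw new ArgumentException("No data chunk found.", nameof(wav));
    }
}
```

The returned slice is what would be sent over the streaming connection. The audio's actual encoding and sample rate still have to match the values configured for the stream (`pcm` and `16000` unless overridden), so a file recorded at a different rate would need resampling first, which this sketch does not attempt.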