See Mistral adapter:
def _response_to_dict(obj: ChatCompletionChunk | ChatCompletion) -> dict:
    return obj.to_dict(warnings=False)


async def chat_completion(
    *, request: dict, client: AsyncAzureOpenAI | AsyncOpenAI
) -> AsyncIterator[dict] | dict:
    response: (
        AsyncStream[ChatCompletionChunk] | ChatCompletion
    ) = await call_with_extra_body(client.chat.completions.create, request)

    if isinstance(response, AsyncStream):
        raw_stream = map_stream(_response_to_dict, response)
        return extract_reasoning_content(raw_stream)
    else:
        return extract_reasoning_content(_response_to_dict(response))
Here, the openai library parses the response into model objects, and then we convert those models back to plain dicts.
This round trip could be avoided completely by calling client.chat.completions.with_raw_response.create instead, which doesn't do any parsing.
The same applies to other adapters that do not require response parsing.
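
A minimal sketch of what the non-streaming path could look like with the raw-response accessor. The function name chat_completion_raw is hypothetical, the request is passed straight to create (the adapter's call_with_extra_body helper and extract_reasoning_content are left out), and streaming is not covered here, so treat this as an illustration of the idea rather than a drop-in replacement:

```python
from __future__ import annotations

import json
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only needed for type hints; the sketch itself is duck-typed.
    from openai import AsyncAzureOpenAI, AsyncOpenAI


async def chat_completion_raw(
    *, request: dict, client: AsyncAzureOpenAI | AsyncOpenAI
) -> dict:
    # Hypothetical non-streaming variant: ask the SDK for the raw HTTP
    # response wrapper instead of a parsed ChatCompletion model.
    raw = await client.chat.completions.with_raw_response.create(**request)

    # Decode the JSON body directly. No model object is built unless
    # raw.parse() is called explicitly.
    return json.loads(raw.text)
```

The returned dict has the same shape the API sent, so downstream steps that already operate on dicts should not need the model-to-dict conversion. The streaming branch would need separate handling and is deliberately omitted from this sketch.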