Description
We successfully integrated gpt-4o using azure-ai-openai, and we managed to pass in a JSON schema as described in the docs.
We are now trying to migrate to azure-ai-inference and are running into issues constraining the gpt-4o model's response to a JSON schema.
The client is built like this:
private ChatCompletionsClient createClient(final Connection connection) {
    return new ChatCompletionsClientBuilder()
            .endpoint(connection.getEndpoint())
            .credential(new AzureKeyCredential(connection.getKey()))
            .buildClient();
}
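For context, these are the imports the snippets in this issue rely on (package names written from memory, so they may need double-checking):

import com.azure.ai.inference.ChatCompletionsClient;
import com.azure.ai.inference.ChatCompletionsClientBuilder;
import com.azure.ai.inference.ModelServiceVersion;
import com.azure.ai.inference.models.ChatCompletionsOptions;
import com.azure.ai.inference.models.ChatCompletionsResponseFormatJsonSchema;
import com.azure.ai.inference.models.ChatCompletionsResponseFormatJsonSchemaDefinition;
import com.azure.core.credential.AzureKeyCredential;
import com.azure.core.util.BinaryData;
import org.json.JSONObject;

import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;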
The JSON schema is created like this (side note: it used to be a single BinaryData instance in azure-ai-openai, whereas it is now a Map<String, BinaryData> in azure-ai-inference; the reason for this change, and how the schema is supposed to be generated, e.g. from a JSON string, is not entirely clear to me):
private Map<String, BinaryData> jsonSchemaFromString(String jsonSchemaString) {
    final Map<String, BinaryData> jsonSchema = new LinkedHashMap<>();
    JSONObject jsonObject = new JSONObject(jsonSchemaString);
    // Wrap each top-level schema property ("type", "properties", ...) in its own BinaryData.
    for (final String key : jsonObject.keySet()) {
        BinaryData binaryData = BinaryData.fromObject(jsonObject.get(key));
        jsonSchema.put(key, binaryData);
    }
    return jsonSchema;
}
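For illustration, the helper is called with a plain JSON-schema string; this is a simplified example, not our real schema:

// Example schema string, simplified for this issue.
String jsonSchemaString = """
        {
          "type": "object",
          "properties": {
            "answer": { "type": "string" }
          },
          "required": ["answer"],
          "additionalProperties": false
        }
        """;
Map<String, BinaryData> jsonSchema = jsonSchemaFromString(jsonSchemaString);
// jsonSchema now holds one BinaryData entry per top-level key:
// "type", "properties", "required", "additionalProperties".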
And chat completion options are set like this:
if (jsonSchemaString != null) {
    ChatCompletionsResponseFormatJsonSchemaDefinition jsonSchemaDefinition =
            new ChatCompletionsResponseFormatJsonSchemaDefinition(
                    UUID.randomUUID().toString(),
                    jsonSchemaFromString(jsonSchemaString));
    jsonSchemaDefinition.setStrict(false);
    chatCompletionsOptions.setResponseFormat(new ChatCompletionsResponseFormatJsonSchema(jsonSchemaDefinition));
}
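For completeness, the surrounding request looks roughly like this (message content is a placeholder; I'm using ChatRequestSystemMessage / ChatRequestUserMessage from com.azure.ai.inference.models and client.complete(...) as in the SDK samples):

// Rough sketch of the full request around the snippet above.
// "client" is the ChatCompletionsClient returned by createClient(...) earlier.
List<ChatRequestMessage> chatMessages = new ArrayList<>();
chatMessages.add(new ChatRequestSystemMessage("You are a helpful assistant."));
chatMessages.add(new ChatRequestUserMessage("Answer as JSON matching the provided schema: ..."));

ChatCompletionsOptions chatCompletionsOptions = new ChatCompletionsOptions(chatMessages);
// ... response format set as shown above when jsonSchemaString != null ...
ChatCompletions chatCompletions = client.complete(chatCompletionsOptions);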
The model returns the following error:
Status code 400, "{"error":{"code":"BadRequest","message":"response_format value as json_schema is enabled only for api versions 2024-08-01-preview and later"}}"
If I try to create the client specifying the requested service version:
private ChatCompletionsClient createClient(final Connection connection) {
    return new ChatCompletionsClientBuilder()
            .endpoint(connection.getEndpoint())
            .credential(new AzureKeyCredential(connection.getKey()))
            .serviceVersion(ModelServiceVersion.V2024_08_01_PREVIEW)
            .buildClient();
}
I get a different error:
Status code 404, "{"error":{"code":"404","message": "Resource not found"}}"
Note that the model responds correctly if I don't set the JSON schema response format.
Any hint/help/sample is appreciated.
Thanks