azure-ai-inference and json-schema-response with gpt-4o #45120

Open
@lucabaroffio

Description

We successfully integrated gpt-4o using azure-ai-openai, and we managed to pass in a JSON schema as described in the docs.

We are now trying to migrate to azure-ai-inference, and we are running into issues when constraining the gpt-4o model's response to a JSON schema.

The client is built like this:

private ChatCompletionsClient createClient(final Connection connection) {
	// Synchronous client with key-based authentication; no explicit service version is set.
	return new ChatCompletionsClientBuilder()
			.endpoint(connection.getEndpoint())
			.credential(new AzureKeyCredential(connection.getKey()))
			.buildClient();
}
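
For reference, the snippets above and below assume imports along these lines; the package locations are my best guess from the azure-ai-inference and azure-core layout, so adjust them to your actual versions:

// Assumed imports for the snippets in this issue.
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

import com.azure.ai.inference.ChatCompletionsClient;
import com.azure.ai.inference.ChatCompletionsClientBuilder;
import com.azure.ai.inference.ModelServiceVersion;
import com.azure.ai.inference.models.ChatCompletionsOptions;
import com.azure.ai.inference.models.ChatCompletionsResponseFormatJsonSchema;
import com.azure.ai.inference.models.ChatCompletionsResponseFormatJsonSchemaDefinition;
import com.azure.core.credential.AzureKeyCredential;
import com.azure.core.util.BinaryData;
import org.json.JSONObject;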

The JSON schema is created like this. (Side note: it used to be a single BinaryData instance in azure-ai-openai, whereas it is now a Map<String, BinaryData> in azure-ai-inference; the reason for this change, and how the schema is supposed to be generated, e.g. from a JSON string, is not entirely clear to me.)

private Map<String, BinaryData> jsonSchemaFromString(String jsonSchemaString) {
	// Split the schema into its top-level keys and wrap each value as BinaryData,
	// since the definition now expects a Map<String, BinaryData> rather than a single BinaryData.
	final Map<String, BinaryData> jsonSchema = new LinkedHashMap<>();
	JSONObject jsonObject = new JSONObject(jsonSchemaString);

	for (final String key : jsonObject.keySet()) {
		BinaryData binaryData = BinaryData.fromObject(jsonObject.get(key));
		jsonSchema.put(key, binaryData);
	}
	return jsonSchema;
}
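
For illustration only (the real schema is application-specific), a minimal JSON Schema string passed through this helper would look like the following sketch:

// Illustrative only: each top-level key ("type", "properties", "required", "additionalProperties")
// becomes one entry in the resulting Map<String, BinaryData>.
String schemaString = "{"
		+ "\"type\": \"object\","
		+ "\"properties\": { \"answer\": { \"type\": \"string\" } },"
		+ "\"required\": [\"answer\"],"
		+ "\"additionalProperties\": false"
		+ "}";
Map<String, BinaryData> schema = jsonSchemaFromString(schemaString);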

And the chat completions options are set like this:

if (jsonSchemaString != null) {
	// The definition takes a name (a random UUID here) and the schema map; strict mode is disabled.
	ChatCompletionsResponseFormatJsonSchemaDefinition jsonSchemaDefinition = new ChatCompletionsResponseFormatJsonSchemaDefinition(
			UUID.randomUUID().toString(),
			jsonSchemaFromString(jsonSchemaString));

	jsonSchemaDefinition.setStrict(false);

	chatCompletionsOptions.setResponseFormat(new ChatCompletionsResponseFormatJsonSchema(jsonSchemaDefinition));
}
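
For context, here is a rough sketch of how these options end up being sent. It assumes java.util.List plus ChatRequestMessage, ChatRequestSystemMessage, ChatRequestUserMessage and ChatCompletions from com.azure.ai.inference.models are imported, and it is an outline rather than the exact production code:

// Outline of the full request flow; the response-format block above slots in where indicated.
List<ChatRequestMessage> messages = List.of(
		new ChatRequestSystemMessage("Answer using the provided JSON schema."),
		new ChatRequestUserMessage("What is the capital of France?"));

ChatCompletionsOptions chatCompletionsOptions = new ChatCompletionsOptions(messages);
// ... response format applied here as shown above ...
ChatCompletions completions = client.complete(chatCompletionsOptions);
System.out.println(completions.getChoices().get(0).getMessage().getContent());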

The service returns the following error:

Status code 400, "{"error":{"code":"BadRequest","message":"response_format value as json_schema is enabled only for api versions 2024-08-01-preview and later"}}

If I try to create the client specifying the requested service version:

private ChatCompletionsClient createClient(final Connection connection) {
	// Same as above, but pinning the api-version mentioned in the 400 error.
	return new ChatCompletionsClientBuilder()
			.endpoint(connection.getEndpoint())
			.credential(new AzureKeyCredential(connection.getKey()))
			.serviceVersion(ModelServiceVersion.V2024_08_01_PREVIEW)
			.buildClient();
}

I get a different error:

Status code 404, "{"error":{"code":"404","message": "Resource not found"}}"

Note that the model responds correctly if I don't set the JSON schema response format.

Any hint/help/sample is appreciated.

Thanks

Metadata

Labels

AI Model Inference
customer-reported: Issues that are reported by GitHub users external to the Azure organization.
needs-team-attention: Workflow: This issue needs attention from Azure service team or SDK team
question: The issue doesn't require a change to the product in order to be resolved. Most issues start as that
