Description
We successfully integrated gpt-4o using azure-ai-openai, and we managed to pass in a JSON schema as described in the docs.
We are now trying to migrate to azure-ai-inference and are running into issues constraining the gpt-4o model's response to a JSON schema.
The client is built like this:
private ChatCompletionsClient createClient(final Connection connection) {
    return new ChatCompletionsClientBuilder()
            .endpoint(connection.getEndpoint())
            .credential(new AzureKeyCredential(connection.getKey()))
            .buildClient();
}
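For context, these are the imports the snippets in this issue rely on (package names written from memory, so they may need double-checking):

import com.azure.ai.inference.ChatCompletionsClient;
import com.azure.ai.inference.ChatCompletionsClientBuilder;
import com.azure.ai.inference.ModelServiceVersion;
import com.azure.ai.inference.models.ChatCompletionsOptions;
import com.azure.ai.inference.models.ChatCompletionsResponseFormatJsonSchema;
import com.azure.ai.inference.models.ChatCompletionsResponseFormatJsonSchemaDefinition;
import com.azure.core.credential.AzureKeyCredential;
import com.azure.core.util.BinaryData;
import org.json.JSONObject;

import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;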
The JSON schema is created like this (side note: it used to be a single BinaryData instance in azure-ai-openai, whereas it is now a Map<String, BinaryData> in azure-ai-inference; the reason for this change, and how the schema is supposed to be generated, e.g. from a JSON string, is not entirely clear to me):
private Map<String, BinaryData> jsonSchemaFromString(String jsonSchemaString) {
    final Map<String, BinaryData> jsonSchema = new LinkedHashMap<>();
    JSONObject jsonObject = new JSONObject(jsonSchemaString);
    // Wrap each top-level schema property ("type", "properties", ...) in its own BinaryData.
    for (final String key : jsonObject.keySet()) {
        BinaryData binaryData = BinaryData.fromObject(jsonObject.get(key));
        jsonSchema.put(key, binaryData);
    }
    return jsonSchema;
}
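For illustration, the helper is called with a plain JSON-schema string; this is a simplified example, not our real schema:

// Example schema string, simplified for this issue.
String jsonSchemaString = """
        {
          "type": "object",
          "properties": {
            "answer": { "type": "string" }
          },
          "required": ["answer"],
          "additionalProperties": false
        }
        """;
Map<String, BinaryData> jsonSchema = jsonSchemaFromString(jsonSchemaString);
// jsonSchema now holds one BinaryData entry per top-level key:
// "type", "properties", "required", "additionalProperties".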
And chat completion options are set like this:
if (jsonSchemaString != null) {
    ChatCompletionsResponseFormatJsonSchemaDefinition jsonSchemaDefinition =
            new ChatCompletionsResponseFormatJsonSchemaDefinition(
                    UUID.randomUUID().toString(),
                    jsonSchemaFromString(jsonSchemaString));
    jsonSchemaDefinition.setStrict(false);
    chatCompletionsOptions.setResponseFormat(new ChatCompletionsResponseFormatJsonSchema(jsonSchemaDefinition));
}
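For completeness, the surrounding request looks roughly like this (message content is a placeholder; I'm using ChatRequestSystemMessage / ChatRequestUserMessage from com.azure.ai.inference.models and client.complete(...) as in the SDK samples):

// Rough sketch of the full request around the snippet above.
// "client" is the ChatCompletionsClient returned by createClient(...) earlier.
List<ChatRequestMessage> chatMessages = new ArrayList<>();
chatMessages.add(new ChatRequestSystemMessage("You are a helpful assistant."));
chatMessages.add(new ChatRequestUserMessage("Answer as JSON matching the provided schema: ..."));

ChatCompletionsOptions chatCompletionsOptions = new ChatCompletionsOptions(chatMessages);
// ... response format set as shown above when jsonSchemaString != null ...
ChatCompletions chatCompletions = client.complete(chatCompletionsOptions);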
The model returns the following error:
Status code 400, "{"error":{"code":"BadRequest","message":"response_format value as json_schema is enabled only for api versions 2024-08-01-preview and later"}}"
If I try to create the client specifying the requested service version:
private ChatCompletionsClient createClient(final Connection connection) {
    return new ChatCompletionsClientBuilder()
            .endpoint(connection.getEndpoint())
            .credential(new AzureKeyCredential(connection.getKey()))
            .serviceVersion(ModelServiceVersion.V2024_08_01_PREVIEW)
            .buildClient();
}
I get a different error:
Status code 404, "{"error":{"code":"404","message": "Resource not found"}}"
Note that the model responds correctly if I don't set the JSON schema response format.
Any hint/help/sample is appreciated.
Thanks