Description
When using the AIJsonUtilities.CreateJsonSchema method with OpenAI's ChatResponseFormat.CreateJsonSchemaFormat (with jsonSchemaIsStrict set to true), complex objects that contain nested objects and lists are not handled correctly. The generated JSON schema contains internal $ref pointers to locations that are not defined at the top level, which violates OpenAI's requirements for strict schemas.
Reproduction Steps
- Define the following C# models with nesting and lists:
namespace ConsoleTests.Models;

public class SimpleObject
{
    public string? Name { get; set; }
    public List<SimpleItem>? Items { get; set; }
}

public class SimpleItem
{
    public int Id { get; set; }
    public float? Value { get; set; }
    public SimpleItem SubItem { get; set; }
}
- Call the ChatResponseFormat.CreateJsonSchemaFormat<T> method with the above model, ensuring jsonSchemaIsStrict is set to true:
public static ChatResponseFormat CreateJsonSchemaFormat<T>(
    string jsonSchemaFormatName,
    string? jsonSchemaFormatDescription = null,
    bool? jsonSchemaIsStrict = null)
{
    var formatObjectType = typeof(T);
    var type = formatObjectType.IsGenericType && formatObjectType.GetGenericTypeDefinition() == typeof(Nullable<>)
        ? Nullable.GetUnderlyingType(formatObjectType)!
        : formatObjectType;

    var jsonSchema = AIJsonUtilities.CreateJsonSchema(
        type,
        jsonSchemaFormatDescription,
        serializerOptions: JsonSerializerOptions.Default,
        inferenceOptions: new AIJsonSchemaCreateOptions()
        {
            IncludeSchemaKeyword = false,
            IncludeTypeInEnumSchemas = true,
            DisallowAdditionalProperties = true,
            RequireAllProperties = true
        }).ToString();

    return ChatResponseFormat.CreateJsonSchemaFormat(
        jsonSchemaFormatName,
        jsonSchema: BinaryData.FromString(jsonSchema),
        jsonSchemaFormatDescription: jsonSchemaFormatDescription,
        jsonSchemaIsStrict: jsonSchemaIsStrict
    );
}
- Notice the generated JSON schema contains nested $ref pointers, e.g.:
{
  "type": "object",
  "properties": {
    "Name": { "type": ["string", "null"] },
    "Items": {
      "type": ["array", "null"],
      "items": {
        "type": "object",
        "properties": {
          "Id": { "type": "integer" },
          "Value": { "type": ["number", "null"] },
          "SubItem": {
            "type": "object",
            "properties": {
              "Id": { "type": "integer" },
              "Value": { "type": ["number", "null"] },
              "SubItem": { "$ref": "#/properties/Items/items/properties/SubItem" }
            },
            "additionalProperties": false,
            "required": ["Id", "Value", "SubItem"]
          }
        },
        "additionalProperties": false,
        "required": ["Id", "Value", "SubItem"]
      }
    }
  },
  "additionalProperties": false,
  "required": ["Name", "Items"]
}
- Use the generated schema with the OpenAI SDK's ChatCompletionOptions. The following error is thrown:
System.ClientModel.ClientResultException: 'HTTP 400 (invalid_request_error: ) Parameter: response_format Invalid schema for response_format 'simpleObject': In context=('properties', 'Items', 'type', '0', 'items', 'properties', 'SubItem', 'properties', 'SubItem'), reference can only point to definitions defined at the top level of the schema.'
Expected behavior
The output JSON schema should define complex objects, such as SimpleItem, at the top level (using a $defs section) so that all $ref pointers are valid. For instance, the schema should be structured similarly to:
{
  "type": "object",
  "$defs": {
    "SimpleItem": {
      "type": "object",
      "properties": {
        "Id": { "type": "integer" },
        "Value": {
          "anyOf": [
            { "type": "number" },
            { "type": "null" }
          ]
        },
        "SubItem": {
          "$ref": "#/$defs/SimpleItem"
        }
      },
      "additionalProperties": false,
      "required": ["Id", "Value", "SubItem"]
    }
  },
  "properties": {
    "Name": {
      "anyOf": [
        { "type": "string" },
        { "type": "null" }
      ]
    },
    "Items": {
      "anyOf": [
        {
          "items": { "$ref": "#/$defs/SimpleItem" },
          "type": "array"
        },
        { "type": "null" }
      ]
    }
  },
  "additionalProperties": false,
  "required": ["Name", "Items"]
}
This format correctly places the definitions at the top level and adheres to the restrictions of the OpenAI SDK.
Actual behavior
The generated schema places nested $ref references within the property hierarchy rather than in a top-level definitions section. This leads to an invalid schema error when used with the ChatCompletionOptions in the OpenAI SDK, which only allows $ref pointers that reference definitions at the top level of the schema.
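To catch this before the request is sent, a pre-flight check over the generated schema can help. The following is a minimal sketch using only System.Text.Json.Nodes; SchemaRefCheck and FindNonTopLevelRefs are hypothetical names, not part of any library. It assumes that references of the form "#/$defs/<name>" are accepted, and it also assumes that a bare "#" (root recursion) is accepted, which is not confirmed by the error message above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json.Nodes;

public static class SchemaRefCheck
{
    // Returns every "$ref" target that points somewhere other than "#" or "#/$defs/...".
    // Assumption: those two forms are the only ones the service accepts in strict mode.
    public static List<string> FindNonTopLevelRefs(JsonNode? schema)
    {
        var found = new List<string>();
        Visit(schema, found);
        return found;
    }

    private static void Visit(JsonNode? node, List<string> found)
    {
        switch (node)
        {
            case JsonObject obj:
                foreach (var (key, value) in obj)
                {
                    // A "$ref" keyword carries a string; a property that merely happens
                    // to be named "$ref" carries a schema object and is walked normally.
                    if (key == "$ref" && value is JsonValue v && v.TryGetValue<string>(out var target))
                    {
                        if (target != "#" && !target.StartsWith("#/$defs/", StringComparison.Ordinal))
                        {
                            found.Add(target);
                        }
                    }
                    else
                    {
                        Visit(value, found);
                    }
                }
                break;

            case JsonArray arr:
                foreach (var item in arr)
                {
                    Visit(item, found);
                }
                break;
        }
    }
}
```

For the schema in the reproduction steps, parsing the JSON with JsonNode.Parse and passing it to FindNonTopLevelRefs should report the single pointer #/properties/Items/items/properties/SubItem.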
Regression?
I don't believe this has ever worked.
Known Workarounds
The workaround for this is challenging to implement: developers have to override TransformSchemaNode to rewrite the invalid JSON schema themselves.
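As an illustration of what such a rewrite involves, here is a rough sketch of a different route: a post-processing pass over the already generated schema rather than a TransformSchemaNode callback. It uses only System.Text.Json.Nodes; SchemaRefHoister and HoistRefs are hypothetical names. It moves each subschema targeted by a non-$defs pointer into a top-level $defs entry and rewrites the references. It assumes plain JSON Pointer fragments (no percent-encoding) and names each definition after the last pointer segment, so the reproduction schema yields SubItem rather than SimpleItem. Whether the API accepts the result has not been verified here.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json.Nodes;

public static class SchemaRefHoister
{
    // Mutates and returns the schema: every "$ref" target outside "$defs" is moved
    // into "$defs", and all references to it are rewritten to "#/$defs/<name>".
    public static JsonObject HoistRefs(JsonObject schema)
    {
        var holders = new List<(JsonObject Holder, string Pointer)>();
        Collect(schema, holders);
        if (holders.Count == 0) return schema;

        var defs = schema["$defs"] as JsonObject;
        if (defs is null) { defs = new JsonObject(); schema["$defs"] = defs; }

        // Resolve every distinct pointer to (parent, last segment) before mutating anything.
        var newRefs = new Dictionary<string, string>();
        var taken = new HashSet<string>();
        var moves = new List<(JsonNode Parent, string Segment, string Name)>();
        foreach (var (_, pointer) in holders)
        {
            if (newRefs.ContainsKey(pointer)) continue;
            string[] segments = pointer.Substring(2).Split('/');
            JsonNode parent = schema;
            for (int i = 0; i < segments.Length - 1; i++)
                parent = Child(parent, Unescape(segments[i]))
                    ?? throw new InvalidOperationException("Unresolvable pointer: " + pointer);
            string last = Unescape(segments[^1]);
            if (Child(parent, last) is null)
                throw new InvalidOperationException("Unresolvable pointer: " + pointer);

            // Pick a definition name that collides with nothing already present.
            string name = last;
            int n = 1;
            while (defs.ContainsKey(name) || taken.Contains(name)) name = last + n++;
            taken.Add(name);
            newRefs[pointer] = "#/$defs/" + name;
            moves.Add((parent, last, name));
        }

        // Point every collected reference at its new "$defs" entry.
        foreach (var (holder, pointer) in holders)
            holder["$ref"] = newRefs[pointer];

        // Replace each target with a reference, then re-attach the target under "$defs".
        foreach (var (parent, segment, name) in moves)
        {
            JsonNode target = Child(parent, segment)!;
            SetChild(parent, segment, new JsonObject { ["$ref"] = "#/$defs/" + name });
            defs[name] = target;
        }
        return schema;
    }

    private static void Collect(JsonNode? node, List<(JsonObject Holder, string Pointer)> holders)
    {
        if (node is JsonObject obj)
        {
            foreach (var (key, value) in obj)
            {
                if (key == "$ref" && value is JsonValue v && v.TryGetValue<string>(out var p))
                {
                    if (p.StartsWith("#/", StringComparison.Ordinal)
                        && !p.StartsWith("#/$defs/", StringComparison.Ordinal))
                        holders.Add((obj, p));
                }
                else Collect(value, holders);
            }
        }
        else if (node is JsonArray arr)
        {
            foreach (var item in arr) Collect(item, holders);
        }
    }

    private static JsonNode? Child(JsonNode parent, string segment) => parent switch
    {
        JsonObject o => o[segment],
        JsonArray a when int.TryParse(segment, out int i) && i >= 0 && i < a.Count => a[i],
        _ => null,
    };

    private static void SetChild(JsonNode parent, string segment, JsonNode value)
    {
        if (parent is JsonArray a) a[int.Parse(segment)] = value;
        else parent[segment] = value;
    }

    // JSON Pointer escapes: "~1" stands for "/", "~0" stands for "~".
    private static string Unescape(string s) => s.Replace("~1", "/").Replace("~0", "~");
}
```

One way to apply it to the helper in the reproduction steps would be to parse the generated schema string with JsonNode.Parse(jsonSchema)!.AsObject(), pass it through HoistRefs, and hand the result of ToJsonString() to BinaryData.FromString. The pass moves nodes instead of cloning them, so a target nested inside another target ends up as its own $defs entry rather than being duplicated.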
Configuration
.NET SDK:
Version: 9.0.200
Commit: 90e8b202f2
Workload version: 9.0.200-manifests.a3a1a094
MSBuild version: 17.13.8+cbc39bea8
Runtime Environment:
OS Name: Windows
OS Version: 10.0.26100
OS Platform: Windows
RID: win-x64
Base Path: C:\Program Files\dotnet\sdk\9.0.200\
I don't believe this is OS or .NET version specific.
Other information
No response