[Feat][Router] Migrate custom Chat Completions structs to official SDK types#1550
Conversation
✅ Deploy Preview for vllm-semantic-router ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
7be4cfc to
f665e2a
Compare
|
Sorry about this. #1553 introduced a large recent refactor, which caused conflicts here. Please rebase onto main and resolve them. |
91226a2 to
a0abe7f
Compare
a0abe7f to
ef2b6ba
Compare
👥 vLLM Semantic Team NotificationThe following members have been identified for the changed files in this PR and have been automatically assigned: 📁
|
There was a problem hiding this comment.
Pull request overview
This PR migrates OpenAI Chat Completions request/response handling away from internal ad-hoc JSON structs to the official github.com/openai/openai-go SDK types, and adds compatibility tests/guardrails to prevent future schema drift across the router’s OpenAI ↔ Anthropic interoperability surfaces.
Changes:
- Replace Response API / MCP / classification / modelselection OpenAI-shaped structs with
openai-goSDK types (plus minimal composition where router-specific fields are needed). - Update extproc Response API translation to parse upstream responses into
openai.ChatCompletion. - Add wire-format/round-trip tests and document the SDK-first “API Type Contracts” guardrail; track remaining deferred migrations as tech debt.
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| tools/agent/structure-rules.yaml | Marks modelselection benchmark runner as a legacy hotspot with relaxed function checks. |
| tools/agent/repo-manifest.yaml | Registers TD037 in the manifest and doc governance block. |
| src/semantic-router/pkg/responseapi/translator.go | Replaces custom Chat Completions structs with openai.ChatCompletionNewParams / openai.ChatCompletion and adds multimodal/tool-choice mapping. |
| src/semantic-router/pkg/responseapi/translator_test.go | Adds unit coverage for translation behavior, tool choice mapping, and multimodal content handling. |
| src/semantic-router/pkg/responseapi/roundtrip_test.go | Adds mock-backend and wire-format compliance tests for request/response round-trips. |
| src/semantic-router/pkg/modelselection/benchmark_runner.go | Uses SDK types for request messages and improves provider error detection via an error envelope. |
| src/semantic-router/pkg/mcp/types.go | Migrates MCP ↔ OpenAI tool conversion to SDK tool/tool-call types. |
| src/semantic-router/pkg/mcp/types_test.go | Adds tests ensuring MCP conversion is SDK-wire-compatible and robust to bad arguments. |
| src/semantic-router/pkg/extproc/req_filter_response_api.go | Parses Chat Completions responses into openai.ChatCompletion before translating to Response API. |
| src/semantic-router/pkg/extproc/openai_compat_test.go | Adds regression tests ensuring internal parsing/serialization remains compatible with SDK types. |
| src/semantic-router/pkg/classification/vllm_client.go | Switches classifier vLLM client to SDK response type and SDK message unions in requests. |
| src/semantic-router/pkg/anthropic/compat_test.go | Adds OpenAI ↔ Anthropic conversion compatibility tests validated against official SDK types. |
| docs/agent/tech-debt/td-037-custom-chat-completions-structs.md | Introduces TD037 to track remaining custom Chat Completions type debt (e.g., cache/memory serialization). |
| docs/agent/tech-debt/README.md | Links TD037 in the index and coverage map. |
| docs/agent/architecture-guardrails.md | Adds “API Type Contracts” guardrail enforcing SDK-first type usage. |
| if len(parts) == 1 && (parts[0].Type == ContentTypeInputText || parts[0].Type == ContentTypeOutputText) { | ||
| return parts[0].Text | ||
| func convertFunctionTool(fn *FunctionDef) openai.ChatCompletionToolParam { | ||
| param := openai.ChatCompletionToolParam{ |
There was a problem hiding this comment.
openai.ChatCompletionToolParam needs its type set (typically "function"). convertFunctionTool currently constructs a tool param with only Function populated, which can serialize with an empty/missing type and cause OpenAI-compatible backends to reject/ignore the tool. Set Type: "function" when building the SDK tool param.
| param := openai.ChatCompletionToolParam{ | |
| param := openai.ChatCompletionToolParam{ | |
| Type: openai.ChatCompletionToolType("function"), |
There was a problem hiding this comment.
This is a false positive.
The openai-go SDK's ChatCompletionToolParam.Type field is constant.Function (a ~string type with a custom MarshalJSON). The SDK explicitly documents: "This field can be elided, and will marshal its zero value as 'function'." The SDK's own tests construct ChatCompletionToolParam without setting Type. The zero value serializes correctly to {"type":"function",...}. I've added a wire-format assertion in types_test.go to verify this behavior.
| Function: OpenAIToolFunction{ | ||
| // ConvertToolToOpenAI converts an MCP tool to the official OpenAI SDK tool param type. | ||
| func ConvertToolToOpenAI(tool mcp.Tool) openai.ChatCompletionToolParam { | ||
| param := openai.ChatCompletionToolParam{ |
There was a problem hiding this comment.
ConvertToolToOpenAI returns an openai.ChatCompletionToolParam without setting Type. Elsewhere in the repo, function tools always set Type: "function"; leaving it empty produces invalid/unsupported tool JSON for OpenAI-compatible servers. Initialize the returned tool param with Type: "function".
| param := openai.ChatCompletionToolParam{ | |
| param := openai.ChatCompletionToolParam{ | |
| Type: "function", |
There was a problem hiding this comment.
same as previous comment's reply
|
|
||
| ## Status | ||
|
|
||
| Partially resolved — major packages migrated, remaining surfaces tracked below. |
There was a problem hiding this comment.
should the tech dept be closed fully?
There was a problem hiding this comment.
The remaining items (pkg/cache/ChatMessage and pkg/memory/Message) involve serialized formats used for cache entries. Migrating them to SDK types would be a breaking change: existing cached entries would become unreadable.
This is better handled as a follow-up PR with its own rollout plan, just like suggested in the 'Follow-Up" section in the Description above.
There was a problem hiding this comment.
cool, can you track the follow-up with an issue?
ef2b6ba to
d0428d4
Compare
✅ Supply Chain Security Report — All Clear
Scanned at |
| if req.Stream { | ||
| completionReq.StreamOptions = openai.ChatCompletionStreamOptionsParam{ | ||
| IncludeUsage: openai.Bool(true), | ||
| } | ||
| } |
There was a problem hiding this comment.
TranslateToCompletionRequest no longer sets the actual stream request flag; it only sets StreamOptions. With openai-go, most backends will not stream unless stream itself is enabled, so this likely breaks streaming behavior compared to the previous completionReq.Stream = req.Stream. Set the SDK stream flag (per the SDK’s field type) in addition to StreamOptions.
There was a problem hiding this comment.
Good catch — fixed. ChatCompletionNewParams does not have a Stream field (it uses request options internally). We now inject stream:true into the translated JSON via sjson.SetBytes in req_filter_response_api.go. Two tests added.
| return openai.ChatCompletionToolChoiceOptionUnionParam{ | ||
| OfAuto: openai.String(tc), |
There was a problem hiding this comment.
The string tool_choice mapping always populates OfAuto, even for "none" and "required". If the SDK marshaler interprets OfAuto as the literal "auto" variant (ignoring the string value), this will emit an incorrect wire value and change tool-selection semantics. Map known strings to their corresponding union variants (e.g., auto/none/required), and treat unknown strings as empty/invalid.
| return openai.ChatCompletionToolChoiceOptionUnionParam{ | |
| OfAuto: openai.String(tc), | |
| switch tc { | |
| case "auto": | |
| return openai.ChatCompletionToolChoiceOptionUnionParam{ | |
| OfAuto: openai.String(tc), | |
| } | |
| case "none": | |
| return openai.ChatCompletionToolChoiceOptionUnionParam{ | |
| OfNone: openai.String(tc), | |
| } | |
| case "required": | |
| return openai.ChatCompletionToolChoiceOptionUnionParam{ | |
| OfRequired: openai.String(tc), | |
| } | |
| default: | |
| // Unknown string values are treated as invalid/empty. | |
| return openai.ChatCompletionToolChoiceOptionUnionParam{} |
There was a problem hiding this comment.
False positive. In openai-go v1.12.0, ChatCompletionToolChoiceOptionUnionParam only has OfAuto (param.Opt[string]) and OfChatCompletionNamedToolChoice — there is no OfNone or OfRequired (the suggested code would not compile). OfAuto is a generic string wrapper; OfAuto with value "none" marshals to "none". Verified by TestConvertToolChoice_StringNone and TestConvertToolChoice_StringRequired.
| func convertFunctionTool(fn *FunctionDef) openai.ChatCompletionToolParam { | ||
| param := openai.ChatCompletionToolParam{ | ||
| Function: openai.FunctionDefinitionParam{ | ||
| Name: fn.Name, | ||
| Description: openai.String(fn.Description), | ||
| }, | ||
| } |
There was a problem hiding this comment.
openai.ChatCompletionToolParam typically requires the "type": "function" discriminator in the JSON payload for OpenAI-compatible backends. This constructor doesn’t set the tool type explicitly, which can cause the marshaled request to omit type and be rejected. Set the SDK tool type field to the function discriminator when building the param.
There was a problem hiding this comment.
False positive. ChatCompletionToolParam.Type is constant.Function, whose MarshalJSON calls marshalString which substitutes Default() -> "function" for the zero value. The library's own tests do not set this field. Wire-format verified by TestConvertToolToOpenAI_RoundTrip_JSON.
| func ConvertToolToOpenAI(tool mcp.Tool) openai.ChatCompletionToolParam { | ||
| param := openai.ChatCompletionToolParam{ | ||
| Function: openai.FunctionDefinitionParam{ | ||
| Name: tool.Name, | ||
| Description: tool.Description, | ||
| Description: openai.String(tool.Description), | ||
| }, | ||
| } |
There was a problem hiding this comment.
Similar to the responseapi translator, this conversion doesn’t set the tool type discriminator (usually "function"). Even though tests currently expect "type": "function", relying on implicit SDK behavior is brittle and can break when the SDK changes. Set the tool type explicitly to ensure stable wire-format output.
There was a problem hiding this comment.
Same as above — constant.Function auto-marshals its zero value to "function". Verified by TestConvertToolToOpenAI_RoundTrip_JSON which asserts parsed["type"] == "function".
| } | ||
|
|
||
| return msg, nil | ||
| return messageForRole(role, ""), nil |
There was a problem hiding this comment.
For non-history inputs, malformed content JSON now gets silently translated into an empty message (no error). This can cause hard-to-debug data loss (the request “succeeds” but with blank user content). Consider returning an error when current-request InputItem.Content fails to parse as either a string or content-parts array; keep the “warn-and-skip” behavior only for history items where best-effort recovery is intended.
| return messageForRole(role, ""), nil | |
| return nil, fmt.Errorf("responseapi: failed to parse InputItem.Content as string or content-parts JSON") |
There was a problem hiding this comment.
Pre-existing behavior, not a regression. The PR preserved existing semantics. Improving error handling for malformed content is a valid follow-up but outside the scope of this type-migration PR.
| usage := &Usage{ | ||
| InputTokens: int(resp.Usage.PromptTokens), | ||
| OutputTokens: int(resp.Usage.CompletionTokens), | ||
| TotalTokens: int(resp.Usage.TotalTokens), |
There was a problem hiding this comment.
This always returns a non-nil Usage even when the upstream response omitted the usage field (SDK struct would stay zero-valued). If the Response API previously used nil to mean “usage not provided,” this is a behavior change and may be misleading operationally (looks like real 0-token usage). Consider only setting Usage when at least one token count is non-zero, or detect presence of the usage key during unmarshal (e.g., via a lightweight envelope struct / map[string]json.RawMessage) to preserve the distinction.
| usage := &Usage{ | |
| InputTokens: int(resp.Usage.PromptTokens), | |
| OutputTokens: int(resp.Usage.CompletionTokens), | |
| TotalTokens: int(resp.Usage.TotalTokens), | |
| var usage *Usage | |
| if resp.Usage.PromptTokens != 0 || resp.Usage.CompletionTokens != 0 || resp.Usage.TotalTokens != 0 { | |
| usage = &Usage{ | |
| InputTokens: int(resp.Usage.PromptTokens), | |
| OutputTokens: int(resp.Usage.CompletionTokens), | |
| TotalTokens: int(resp.Usage.TotalTokens), | |
| } |
There was a problem hiding this comment.
Consistent with the library — ChatCompletion.Usage is a struct (not pointer), always populated. The old custom type had the same behavior. Changing this would be a behavioral change outside the scope of this migration.
4d32675 to
4a2ca13
Compare
4a2ca13 to
d838634
Compare
|
can you fix the CI? others LGTM |
Replaces custom Chat Completions structs with the upstream Go SDK types across the ext_proc pipeline, Response API translator, MCP bridge, and model selection benchmark runner. Ensures wire-format compatibility with upstream protocol specifications. Key changes: - Use SDK ChatCompletionNewParams / ChatCompletion types throughout - Inject stream flag via sjson since the SDK manages it internally - Convert tool_choice using SDK union types - Add wire-format round-trip tests for protocol compatibility - Extract helper functions to reduce cyclomatic/cognitive complexity Closes: vllm-project#1517 Signed-off-by: Alon Baluom <alon.baluom@gmail.com> Signed-off-by: asaadbalum <asaad.balum@gmail.com>
f1dd67d to
012d804
Compare
Fixed 3 CI violation, the last CI failure is from pre-existing golangci-lint violations on main, exposed because the new lint pipeline flags all issues in touched files, not just new ones. PR #1701 adds the exclusion rules. Once merged, updating this branch should make the check pass. |

Summary
Stabilizes OpenAI/Anthropic protocol compatibility by replacing 21 internal ad-hoc JSON structs with official
openai-goSDK types across 4 packages, as requested in #1517.openai-goSDK types. Handle multimodal content (text + image) via SDK union types. Preserve ToolChoice forwarding through SDK type mapping. Add warnings for silently dropped image content and malformed history items.extra_bodyextension.req_filter_response_api.goto consumeopenai.ChatCompletioninstead of custom type.pkg/cachemigration as tech debt.Testing
Follow-up
pkg/cache/ChatMessageuses a serialized format for Redis cache entries. Migrating it to SDK types would be a breaking change requiring cache format migration (existing cache entries would become unreadable). This should be handled as a separate PR with its own rollout plan.Test plan
go buildcompiles all changed packagesgo test -racepasses for all 6 affected packages (responseapi, anthropic, mcp, extproc, classification, modelselection)golangci-lintreports 0 issues in changed files (--new-from-rev)go vet ./...zero issuesgo mod tidycleanmake agent-validate)Closes #1517
Signed-off-by: asaadbalum asaad.balum@gmail.com