Commit ef2b6ba

[Feat][Router] Migrate custom Chat Completions structs to official SDK types
- Replace 21 custom Chat Completions structs across pkg/responseapi, pkg/classification, pkg/modelselection, and pkg/mcp with official Go SDK types for protocol compatibility
- Use composition for router-specific extensions (vLLM extra_body, provider error wrapping) per API Type Contracts guardrail
- Handle multimodal content (text + image) via SDK union types
- Preserve ToolChoice forwarding through SDK type mapping
- Fix API error detection in benchmark runner: separate error envelope check from SDK deserialization (ChatCompletion.UnmarshalJSON ignores extra fields)
- Add warnings for silently dropped image content and malformed history
- Add 67 compatibility tests including unit, round-trip mock backend, and wire-format compliance tests across all affected packages
- Add API Type Contracts guardrail to architecture-guardrails.md
- Track deferred pkg/cache migration as tech debt

Signed-off-by: asaadbalum <asaad.balum@gmail.com>
1 parent 269c1c8 commit ef2b6ba

File tree

15 files changed: +2048 −362 lines changed


docs/agent/architecture-guardrails.md

Lines changed: 14 additions & 0 deletions
@@ -56,6 +56,20 @@

- `plugin` performs post-decision or post-selection processing
- `global` carries intentionally cross-cutting behavior

## API Type Contracts

- Use official SDK types for OpenAI and Anthropic request/response handling:
  - `github.com/openai/openai-go` for OpenAI-shaped data
  - `github.com/anthropics/anthropic-sdk-go` for Anthropic-shaped data
- Do not define custom structs that duplicate what the SDK provides
  (e.g., custom `ChatCompletionRequest`, `ChatCompletionResponse`)
- Exceptions: packages that intentionally avoid the SDK dependency for
  isolation (e.g., E2E fixtures, standalone training tools) should document
  the reason in a comment and keep their custom types minimal
- When the SDK type does not cover a field the router needs, extend via
  composition (`type ExtendedReq struct { openai.ChatCompletionNewParams; Extra string }`)
  rather than reimplementing the whole struct

## Avoid

- giant managers

docs/agent/tech-debt/README.md

Lines changed: 2 additions & 0 deletions
@@ -90,6 +90,7 @@ Keep the numeric index unique within `docs/agent/tech-debt/`.

- [TD034 Runtime and Dashboard State Surfaces Still Lack a Coherent Durability, Recovery, and Telemetry Contract](td-034-runtime-and-dashboard-state-durability-and-telemetry-contract.md)
- [TD035 Projection Partition Default Coverage Contract Is No Longer Declarative Only](td-035-signal-group-default-coverage-contract-gap.md)
- [TD036 Decision Tree Authoring Cannot Round-Trip Through Runtime Config](td-036-decision-tree-authoring-roundtrip-gap.md)
- [TD037 Custom Chat Completions Structs Duplicate Official OpenAI SDK Types](td-037-custom-chat-completions-structs.md)

## Architecture Review Coverage Map

@@ -101,6 +102,7 @@ Use this map when turning scale-out architecture findings into debt work. Reuse

- extproc request and response phase collapse: [TD023](td-023-extproc-request-pipeline-phase-collapse.md), [TD029](td-029-extproc-response-pipeline-phase-collapse.md)
- restart-sensitive runtime state and control-plane telemetry semantics: [TD034](td-034-runtime-and-dashboard-state-durability-and-telemetry-contract.md)
- remaining hotspot-ratchet debt across router and binding hotspots: [TD006](td-006-structural-rule-target-vs-legacy-hotspots.md)
- remaining custom Chat Completions struct consolidation: [TD037](td-037-custom-chat-completions-structs.md)
- Dashboard frontend and backend
- frontend route shell, editor control plane, and large UI containers: [TD030](td-030-dashboard-frontend-config-and-interaction-slice-collapse.md)
- dashboard backend training, evaluation, and model-research contract seams: [TD032](td-032-training-evaluation-artifact-contract-drift.md)
docs/agent/tech-debt/td-037-custom-chat-completions-structs.md

Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@

# TD037: Custom Chat Completions Structs Duplicate Official OpenAI SDK Types

## Status

Partially resolved — major packages migrated, remaining surfaces tracked below.

## Scope

`src/semantic-router/pkg/responseapi`, `pkg/classification`, `pkg/modelselection`,
`pkg/mcp`, `pkg/extproc`, `pkg/anthropic`, `pkg/cache`, `pkg/memory`

## Summary

Several packages defined their own `ChatCompletionRequest`, `ChatCompletionResponse`,
`ChatMessage`, `Choice`, `Usage`, and similar types instead of using the official
`openai-go` SDK types. This duplication risks schema drift against the upstream
OpenAI API and creates maintenance burden when fields are added or changed.

## Evidence

Packages that **have been migrated** (this PR):

| Package | Removed Structs | Now Uses |
|---------|-----------------|----------|
| `pkg/responseapi` | 8 structs (ChatCompletionRequest, ChatMessage, ToolCall, FunctionCall, ChatTool, ChatCompletionResponse, Choice, CompletionUsage) | `openai.ChatCompletionNewParams`, `openai.ChatCompletion` |
| `pkg/classification` | 6 structs (ChatCompletionRequest, ChatMessage, ChatCompletionResponse, Choice, Message, Usage) | `openai.ChatCompletion`, `openai.ChatCompletionMessageParamUnion` with composition for `ExtraBody` |
| `pkg/modelselection` | 3 structs (ChatCompletionRequest, ChatMessage, ChatCompletionResponse) | `openai.ChatCompletion`, `openai.ChatCompletionMessageParamUnion` with composition for error fields |
| `pkg/mcp` | 4 structs (OpenAITool, OpenAIToolFunction, OpenAIToolCall, OpenAIToolCallFunction) | `openai.ChatCompletionToolParam`, `openai.ChatCompletionMessageToolCall` |

Packages with **remaining custom types** (lower priority):

| Package | Custom Types | Notes |
|---------|--------------|-------|
| `pkg/cache` | Message, Usage structs | Used for cache serialization; migration deferred to avoid cache format break |
| `pkg/memory` | Message struct | Used for memory store serialization |

## Why It Matters

- Schema drift: custom structs miss new fields added to the OpenAI API
- Maintenance burden: changes must be replicated across multiple struct definitions
- Testing gap: custom types can silently diverge from what clients actually send
- PR #1070 reviewer explicitly flagged this as needed work

## Desired End State

All OpenAI-shaped request/response handling uses `openai-go` SDK types. Custom
structs exist only where composition is needed for router-specific extensions
(e.g., vLLM `extra_body`, provider error wrapping), documented with a comment
explaining why the extension is necessary.

## Exit Criteria

- [ ] `pkg/cache` serialization types migrated or documented as intentional exceptions
- [ ] `pkg/memory` serialization types migrated or documented as intentional exceptions
- [ ] Zero custom `ChatCompletion*` type definitions remain outside documented exceptions
- [ ] Compatibility tests cover all conversion paths
(new wire-format compliance test file in `pkg/anthropic`)

Lines changed: 279 additions & 0 deletions
@@ -0,0 +1,279 @@

```go
package anthropic

import (
	"encoding/json"
	"testing"

	"github.com/anthropics/anthropic-sdk-go"
	"github.com/openai/openai-go"
	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

// These tests verify that the OpenAI ↔ Anthropic conversion layer produces
// JSON output that is wire-compatible with the respective official SDKs.
// They serve as a regression guard against schema drift when either SDK is
// upgraded or when internal conversion logic is refactored.

func TestRoundTrip_SimpleRequest(t *testing.T) {
	orig := &openai.ChatCompletionNewParams{
		Model: "claude-sonnet-4-5",
		Messages: []openai.ChatCompletionMessageParamUnion{
			{OfUser: &openai.ChatCompletionUserMessageParam{
				Content: openai.ChatCompletionUserMessageParamContentUnion{
					OfString: openai.String("Explain photosynthesis."),
				},
			}},
		},
		Temperature: openai.Float(0.5),
	}

	anthropicBody, err := ToAnthropicRequestBody(orig)
	require.NoError(t, err)

	var parsed anthropic.MessageNewParams
	require.NoError(t, json.Unmarshal(anthropicBody, &parsed))

	assert.Equal(t, anthropic.Model("claude-sonnet-4-5"), parsed.Model)
	assert.Equal(t, DefaultMaxTokens, parsed.MaxTokens)
	require.Len(t, parsed.Messages, 1)
	assert.Equal(t, anthropic.MessageParamRoleUser, parsed.Messages[0].Role)
}

func TestRoundTrip_SystemSeparation(t *testing.T) {
	orig := &openai.ChatCompletionNewParams{
		Model: "claude-sonnet-4-5",
		Messages: []openai.ChatCompletionMessageParamUnion{
			{OfSystem: &openai.ChatCompletionSystemMessageParam{
				Content: openai.ChatCompletionSystemMessageParamContentUnion{
					OfString: openai.String("Be concise."),
				},
			}},
			{OfUser: &openai.ChatCompletionUserMessageParam{
				Content: openai.ChatCompletionUserMessageParamContentUnion{
					OfString: openai.String("Hi"),
				},
			}},
		},
	}

	body, err := ToAnthropicRequestBody(orig)
	require.NoError(t, err)

	var parsed anthropic.MessageNewParams
	require.NoError(t, json.Unmarshal(body, &parsed))

	require.Len(t, parsed.System, 1)
	assert.Equal(t, "Be concise.", parsed.System[0].Text)
	require.Len(t, parsed.Messages, 1, "system must not appear in messages array")
}

func TestRoundTrip_MaxTokensVariants(t *testing.T) {
	tests := []struct {
		name     string
		req      *openai.ChatCompletionNewParams
		expected int64
	}{
		{
			name: "MaxCompletionTokens takes priority",
			req: &openai.ChatCompletionNewParams{
				Model:               "claude-sonnet-4-5",
				MaxCompletionTokens: openai.Int(512),
				MaxTokens:           openai.Int(1024),
				Messages:            simpleUserMsg("hi"),
			},
			expected: 512,
		},
		{
			name: "fallback to MaxTokens",
			req: &openai.ChatCompletionNewParams{
				Model:     "claude-sonnet-4-5",
				MaxTokens: openai.Int(2048),
				Messages:  simpleUserMsg("hi"),
			},
			expected: 2048,
		},
		{
			name: "default when neither set",
			req: &openai.ChatCompletionNewParams{
				Model:    "claude-sonnet-4-5",
				Messages: simpleUserMsg("hi"),
			},
			expected: DefaultMaxTokens,
		},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			body, err := ToAnthropicRequestBody(tt.req)
			require.NoError(t, err)

			var parsed anthropic.MessageNewParams
			require.NoError(t, json.Unmarshal(body, &parsed))
			assert.Equal(t, tt.expected, parsed.MaxTokens)
		})
	}
}

func TestResponse_StopReasonMapping(t *testing.T) {
	tests := []struct {
		anthropicReason anthropic.StopReason
		expectedOpenAI  string
	}{
		{anthropic.StopReasonEndTurn, "stop"},
		{anthropic.StopReasonMaxTokens, "length"},
		{anthropic.StopReasonToolUse, "tool_calls"},
		{"unknown_future_reason", "stop"},
	}

	for _, tt := range tests {
		t.Run(string(tt.anthropicReason), func(t *testing.T) {
			resp := anthropic.Message{
				ID:         "msg_test",
				Content:    []anthropic.ContentBlockUnion{{Type: "text", Text: "ok"}},
				StopReason: tt.anthropicReason,
				Usage:      anthropic.Usage{InputTokens: 1, OutputTokens: 1},
			}

			raw, err := json.Marshal(resp)
			require.NoError(t, err)

			out, err := ToOpenAIResponseBody(raw, "claude-sonnet-4-5")
			require.NoError(t, err)

			var oai openai.ChatCompletion
			require.NoError(t, json.Unmarshal(out, &oai))
			assert.Equal(t, tt.expectedOpenAI, oai.Choices[0].FinishReason)
		})
	}
}

func TestResponse_UsageMapping(t *testing.T) {
	resp := anthropic.Message{
		ID:         "msg_usage",
		Content:    []anthropic.ContentBlockUnion{{Type: "text", Text: "hi"}},
		StopReason: anthropic.StopReasonEndTurn,
		Usage:      anthropic.Usage{InputTokens: 42, OutputTokens: 17},
	}

	raw, err := json.Marshal(resp)
	require.NoError(t, err)

	out, err := ToOpenAIResponseBody(raw, "test-model")
	require.NoError(t, err)

	var oai openai.ChatCompletion
	require.NoError(t, json.Unmarshal(out, &oai))

	assert.Equal(t, int64(42), oai.Usage.PromptTokens)
	assert.Equal(t, int64(17), oai.Usage.CompletionTokens)
	assert.Equal(t, int64(59), oai.Usage.TotalTokens)
}

func TestResponse_OutputIsValidOpenAIJSON(t *testing.T) {
	resp := anthropic.Message{
		ID:   "msg_valid",
		Role: "assistant",
		Content: []anthropic.ContentBlockUnion{
			{Type: "text", Text: "Hello world"},
		},
		StopReason: anthropic.StopReasonEndTurn,
		Usage:      anthropic.Usage{InputTokens: 5, OutputTokens: 3},
	}
	raw, err := json.Marshal(resp)
	require.NoError(t, err)

	out, err := ToOpenAIResponseBody(raw, "claude-sonnet-4-5")
	require.NoError(t, err)

	var oai openai.ChatCompletion
	require.NoError(t, json.Unmarshal(out, &oai),
		"output must unmarshal cleanly into the official openai.ChatCompletion type")

	assert.Equal(t, "chat.completion", string(oai.Object))
	assert.Equal(t, "msg_valid", oai.ID)
	assert.Equal(t, "claude-sonnet-4-5", oai.Model)
	require.Len(t, oai.Choices, 1)
	assert.Equal(t, "assistant", string(oai.Choices[0].Message.Role))
	assert.Equal(t, "Hello world", oai.Choices[0].Message.Content)
	assert.NotZero(t, oai.Created)
}

func TestRequest_EmptyMessagesRejected(t *testing.T) {
	req := &openai.ChatCompletionNewParams{
		Model:    "claude-sonnet-4-5",
		Messages: nil,
	}

	body, err := ToAnthropicRequestBody(req)
	require.NoError(t, err)

	var parsed anthropic.MessageNewParams
	require.NoError(t, json.Unmarshal(body, &parsed))
	assert.Empty(t, parsed.Messages)
}

func TestResponse_EmptyContentBlocks(t *testing.T) {
	resp := anthropic.Message{
		ID:         "msg_empty",
		Content:    []anthropic.ContentBlockUnion{},
		StopReason: anthropic.StopReasonEndTurn,
		Usage:      anthropic.Usage{InputTokens: 1, OutputTokens: 0},
	}
	raw, err := json.Marshal(resp)
	require.NoError(t, err)

	out, err := ToOpenAIResponseBody(raw, "model")
	require.NoError(t, err)

	var oai openai.ChatCompletion
	require.NoError(t, json.Unmarshal(out, &oai))
	assert.Empty(t, oai.Choices[0].Message.Content)
}

func TestRequest_StopSequences(t *testing.T) {
	t.Run("string array", func(t *testing.T) {
		req := &openai.ChatCompletionNewParams{
			Model:    "claude-sonnet-4-5",
			Messages: simpleUserMsg("hi"),
			Stop: openai.ChatCompletionNewParamsStopUnion{
				OfStringArray: []string{"END", "DONE"},
			},
		}

		body, err := ToAnthropicRequestBody(req)
		require.NoError(t, err)

		var parsed anthropic.MessageNewParams
		require.NoError(t, json.Unmarshal(body, &parsed))
		assert.Equal(t, []string{"END", "DONE"}, parsed.StopSequences)
	})

	t.Run("single string", func(t *testing.T) {
		req := &openai.ChatCompletionNewParams{
			Model:    "claude-sonnet-4-5",
			Messages: simpleUserMsg("hi"),
			Stop: openai.ChatCompletionNewParamsStopUnion{
				OfString: openai.String("STOP"),
			},
		}

		body, err := ToAnthropicRequestBody(req)
		require.NoError(t, err)

		var parsed anthropic.MessageNewParams
		require.NoError(t, json.Unmarshal(body, &parsed))
		assert.Equal(t, []string{"STOP"}, parsed.StopSequences)
	})
}

// simpleUserMsg builds a minimal message slice for tests.
func simpleUserMsg(text string) []openai.ChatCompletionMessageParamUnion {
	return []openai.ChatCompletionMessageParamUnion{
		{OfUser: &openai.ChatCompletionUserMessageParam{
			Content: openai.ChatCompletionUserMessageParamContentUnion{
				OfString: openai.String(text),
			},
		}},
	}
}
```
