Skip to content

Allow custom HTTP headers on the OpenAI embedding client#5704

Open
gabrielcosi wants to merge 1 commit into
stacklok:mainfrom
gabrielcosi:optimizer-embedding-headers
Open

Allow custom HTTP headers on the OpenAI embedding client#5704
gabrielcosi wants to merge 1 commit into
stacklok:mainfrom
gabrielcosi:optimizer-embedding-headers

Conversation

@gabrielcosi

Copy link
Copy Markdown
Contributor

Summary

Follow-up to #5633. The gateways this embedding path targets often key
behavior off a request header (request routing, scoping a response cache),
but the OpenAI client only sent Content-Type and Authorization, so there
was no way to add one.

  • Add an optional optimizer.embeddingHeaders map that the OpenAI client
    sends with every embedding request. An absent field keeps today's
    behavior, and the TEI path is untouched.
  • Reject bad configs at admission: CEL requires the openai provider and
    valid, non-reserved RFC 7230 header names; OpenAPI constraints on the
    value type bound length and forbid control characters.
  • The same checks run in GetAndValidateConfig (using the shared httpval
    helpers), so non-Kubernetes config sources fail at load instead of at
    request time.
  • The client clones the header map at construction and sets Content-Type
    and Authorization last, so custom headers can never override them.

Closes #5700

Type of change

  • New feature

Test plan

  • Unit tests (task test)
  • Linting (task lint-fix)

Also ran the operator envtest suite (task operator-test-integration):
admission rejects reserved names, non-token names, empty/control-character/
overlong values, and the tei-plus-headers combination, and accepts uncommon
but valid token names (x-key'name`x), matching the runtime check.

API Compatibility

  • This PR does not break the v1beta1 API (adds one optional field to
    OptimizerConfig).

Does this introduce a user-facing change?

Yes. spec.config.optimizer.embeddingHeaders sends extra HTTP headers with
every request the optimizer makes to an OpenAI-compatible embedding service.
The field is optional and existing configurations are unchanged.

Special notes for reviewers

  • Value constraints live in the OpenAPI schema (minLength/maxLength/
    pattern on the new EmbeddingHeaderValue type) rather than CEL: a CEL
    rule iterating unbounded value strings exceeded the apiserver's cost
    budget by a factor of 6, which envtest catches at CRD install. Schema
    validation has no cost budget.
  • The CEL name regex uses RE2 hex escapes (\x27, \x60) for the ' and
    ` token characters, since quote characters can't appear literally
    inside the marker string.
  • Authorization is rejected on purpose: the API key stays in
    OPENAI_API_KEY (see Add OpenAI-compatible embedding client to vmcp optimizer #5633), and header values are stored in plain text
    in the spec and generated ConfigMap.

Assisted by Claude Code. Design, code,
and tests were directed, reviewed, and manually verified by the author.

Gateways in front of OpenAI-compatible embedding services often key
behavior (routing, response-cache scoping) off a request header, but
the optimizer's OpenAI client only sent Content-Type and Authorization.

Add an optional spec.config.optimizer.embeddingHeaders map applied to
every embedding request. Admission rejects non-openai providers,
reserved names (Authorization, Content-Type), and invalid RFC 7230
header names via CEL, and bounds values through OpenAPI schema
constraints; config load mirrors the checks for non-Kubernetes config
sources. The client sets its protocol headers last so custom headers
can never override them.

Closes stacklok#5700
@github-actions github-actions Bot added the size/M Medium PR: 300-599 lines changed label Jul 1, 2026
@codecov

codecov Bot commented Jul 1, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 70.62%. Comparing base (2aca563) to head (1268b82).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #5704   +/-   ##
=======================================
  Coverage   70.61%   70.62%           
=======================================
  Files         667      667           
  Lines       67607    67638   +31     
=======================================
+ Hits        47743    47767   +24     
- Misses      16410    16413    +3     
- Partials     3454     3458    +4     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/M Medium PR: 300-599 lines changed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

vmcp optimizer: allow custom HTTP headers on the OpenAI embedding client

1 participant