Replace model-not-found Service with Envoy direct_response EnvoyFilter #60

Merged
techworldhello merged 1 commit into kaito-project:main from rambohe-ch:imporve-model-not-found
May 13, 2026

Conversation

rambohe-ch (Collaborator) commented May 12, 2026

  • charts/modelharness: add envoyfilter-not-found.yaml that patches a catch-all 404 (OpenAI-compatible JSON) onto each per-namespace Gateway via Envoy direct_response.
  • charts/modelharness: delete httproute-not-found.yaml and referencegrant.yaml; remove modelNotFound block from values.yaml.
  • hack/e2e/scripts/install-components.sh: drop install_model_not_found function, its phase1-base invocation, and the unused MANIFESTS_DIR.
  • hack/e2e/scripts/validate-components.sh: drop the cluster-shared model-not-found Pod readiness check.
  • test/e2e/utils: remove ModelNotFoundNamespace/PodLabel constants and the unused HTTPRouteGVK/ReferenceGrantGVK; refresh setup.go and helm.go doc comments.
  • test/e2e/model_routing_test.go: rewrite 'Model-specific route wins over catch-all' to assert HTTP 200 + matching response body model (the old nginx access-log probe no longer applies); drop unused countNginxAccessLogs helper and kubernetes import.
  • test/e2e/gpu_mocker_test.go: refresh comments for the unknown-model 404 spec to describe the EnvoyFilter direct_response.
  • docs: update test/e2e/README.md and production-stack-E2E-test-scenarios.md to reflect the new design.

Reason for Change:

Requirements

  • added unit tests and e2e tests (if applicable).

Issue Fixed:

Notes for Reviewers:

rambohe-ch force-pushed the imporve-model-not-found branch 2 times, most recently from a70a8d2 to 62afd8a, on May 12, 2026 13:49
techworldhello merged commit 6c1b4d3 into kaito-project:main on May 13, 2026
7 checks passed