Summary
Reconciler.getIntelligentRoute (src/semantic-router/pkg/k8s/reconciler.go:222-237) returns an error when more than one IntelligentRoute exists in the watched namespace. Its caller, reconcile(), short-circuits on this error and never reaches the validation code path that calls updateRouteStatus. The result: when a second IntelligentRoute is applied alongside a valid one, neither CR gets a status update for as long as both exist.
This was surfaced while attempting to write an end-to-end test that exercises the embedding-modality validator landed in #1895 against a real cluster. The unit test in src/semantic-router/pkg/k8s/reconciler_embedding_modality_test.go (also from #1895) hand-feeds the validator a single CR at a time via controller-runtime's fake client, so it passes. On a live cluster, the constraint blocks the same shape of test.
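The short-circuit can be modelled in a few lines. This is a minimal sketch, not the code in reconciler.go: the Route type, getSingleRoute, and the reconcile signature are hypothetical stand-ins, and only the "multiple IntelligentRoutes found" wording is taken from the observed log.

```go
package main

import "fmt"

// Route is a hypothetical stand-in for the IntelligentRoute CR.
// An empty Status models the "status: {}" seen on the cluster.
type Route struct {
	Name   string
	Status string
}

// getSingleRoute models the "expected exactly 1" guard. The zero-route
// message is illustrative; only the multiple-route wording comes from the log.
func getSingleRoute(namespace string, routes []*Route) (*Route, error) {
	if len(routes) == 0 {
		return nil, fmt.Errorf("no IntelligentRoute found in namespace %s", namespace)
	}
	if len(routes) > 1 {
		return nil, fmt.Errorf("multiple IntelligentRoutes found in namespace %s, expected exactly 1", namespace)
	}
	return routes[0], nil
}

// reconcile models the early return: a status is written only after the
// guard passes, so a guard failure leaves every route untouched.
func reconcile(namespace string, routes []*Route) error {
	route, err := getSingleRoute(namespace, routes)
	if err != nil {
		// Short-circuit: no validation and no status update for any CR.
		return fmt.Errorf("failed to get IntelligentRoute: %w", err)
	}
	route.Status = "Ready=True"
	return nil
}

func main() {
	solo := &Route{Name: "good-route"}
	fmt.Println(reconcile("default", []*Route{solo}), solo.Status)

	first := &Route{Name: "good-route"}
	second := &Route{Name: "bad-audio-route"}
	fmt.Println(reconcile("default", []*Route{first, second}))
	fmt.Printf("%q %q\n", first.Status, second.Status)
}
```

With two routes in the slice, the guard fails before any status write happens, which is the behaviour the fake-client unit test never exercises because it only ever sees one CR.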
Reproduction
- Apply a valid IntelligentRoute to namespace default. It reconciles to Ready=True.
- Apply a second IntelligentRoute to the same namespace. (The test fixture used a queryModality: audio rule, but the constraint is independent of CR contents - any second CR triggers it.)
- Observe the reconciler's watchLoop. Every 5 seconds, reconcile() returns the same error.
Observed log output
reconciler.go:159 "Reconciliation check: failed to get IntelligentRoute: multiple IntelligentRoutes found in namespace default, expected exactly 1"
That message repeats indefinitely while both CRs exist. The reconciler does not attempt to validate either route; it does not call updateRouteStatus; both CRs sit with empty status: {}.
Why the e2e test gets stuck
The embedding-signal-modality-validation testcase in the draft of #1881 polls for Ready=False, Reason=ValidationFailed on the bad CR within a 60-second window. The reconciler can't reach the validator while the good CR is also present, so the bad CR's status remains empty for the full polling window and the test times out.
Receipts from a kind run on 2026-05-14:
Test FAIL: embedding-signal-modality-validation
timed out (1m0s) waiting for Ready=False+Reason=ValidationFailed on intelligentroute/default/bad-audio-route
last observed status="" reason=""
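The polling behaviour behind that timeout can be sketched as follows. This is an illustration under assumptions, not the code in the e2e testcase: Status, waitForStatus, and the stubbed getter are hypothetical names, and the timeout is shortened so the sketch finishes quickly.

```go
package main

import (
	"fmt"
	"time"
)

// Status is a hypothetical, flattened view of the Ready condition on a CR.
type Status struct {
	Ready  string
	Reason string
}

// waitForStatus polls get until it returns want or the timeout elapses.
// On timeout it returns the last observed status alongside the error.
func waitForStatus(get func() Status, want Status, timeout, interval time.Duration) (Status, error) {
	deadline := time.Now().Add(timeout)
	for {
		last := get()
		if last == want {
			return last, nil
		}
		if time.Now().After(deadline) {
			return last, fmt.Errorf("timed out (%s) waiting for Ready=%s+Reason=%s", timeout, want.Ready, want.Reason)
		}
		time.Sleep(interval)
	}
}

func main() {
	// The stub always reports an empty status, modelling a CR whose
	// status is never written by the reconciler.
	neverUpdated := func() Status { return Status{} }
	want := Status{Ready: "False", Reason: "ValidationFailed"}

	last, err := waitForStatus(neverUpdated, want, 100*time.Millisecond, 10*time.Millisecond)
	fmt.Println(err)
	fmt.Printf("last observed status=%q reason=%q\n", last.Ready, last.Reason)
}
```

Because the getter never observes a written status, no choice of timeout makes the wait succeed; the only way out is for the reconciler to reach the validator.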
Suggested direction
Make Reconciler reconcile each IntelligentRoute in the namespace independently rather than asserting exactly-one. Concretely:
- getIntelligentRoute becomes listIntelligentRoutes (or similar) returning the full slice.
- The reconcile loop iterates the slice and calls validateAndUpdate per CR, so each gets its own Ready/ValidationFailed status.
- The "exactly one" assumption appears to be load-bearing only inside validateAndUpdate where a single canonical config is emitted. That step would need a decision about what to do when multiple valid CRs disagree: merge, last-wins, deny-on-conflict, or only-one-active-at-a-time-with-an-explicit-selector. None of those decisions are made here - that's the design conversation this issue exists to start.
IntelligentPool is in the same shape today (line 216 has the equivalent "expected exactly 1" check). The same lift would apply if/when multi-pool composition becomes a real use case.
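A rough sketch of the per-CR loop for IntelligentRoute, under stated assumptions: Route, reconcileAll, and toyValidator are hypothetical names, the validator rule is a toy that mirrors the bad-audio fixture rather than the real validator, and the question of how to combine multiple valid CRs is deliberately left unanswered.

```go
package main

import (
	"errors"
	"fmt"
)

// Route is a hypothetical stand-in for the IntelligentRoute CR.
type Route struct {
	Name          string
	QueryModality string
	Ready         bool
	Reason        string
}

// toyValidator is an assumption for illustration only: it rejects the
// "audio" modality, mirroring the bad fixture, not the real validator.
func toyValidator(r *Route) error {
	if r.QueryModality == "audio" {
		return errors.New("unsupported queryModality: audio")
	}
	return nil
}

// reconcileAll validates every route independently and records a status
// on each one. It returns the routes that passed validation; how those
// are combined into a single config (merge, last-wins, deny-on-conflict,
// explicit selector) is outside this sketch.
func reconcileAll(routes []*Route, validate func(*Route) error) []*Route {
	var valid []*Route
	for _, r := range routes {
		if err := validate(r); err != nil {
			r.Ready, r.Reason = false, "ValidationFailed"
			continue // one bad CR no longer blocks the others
		}
		r.Ready, r.Reason = true, ""
		valid = append(valid, r)
	}
	return valid
}

func main() {
	routes := []*Route{
		{Name: "good-route", QueryModality: "text"},
		{Name: "bad-audio-route", QueryModality: "audio"},
	}
	valid := reconcileAll(routes, toyValidator)
	for _, r := range routes {
		fmt.Printf("%s ready=%t reason=%q\n", r.Name, r.Ready, r.Reason)
	}
	fmt.Println("valid routes:", len(valid))
}
```

The key property is that the status write happens inside the loop for every CR, so a failing route gets Ready=False with a reason instead of leaving all routes with an empty status.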
What this unblocks
Once per-CR reconcile is in, the embedding-signal-modality-validation e2e testcase in the draft branch multimodal-routing-e2e-readd can land cleanly. The shape is in e2e/testcases/embedding_signal_modality_validation.go and the bad fixture is in e2e/profiles/multimodal-routing/crds/intelligentroute-bad-audio.yaml on that branch (currently being dropped from #1881 ahead of merge for the reason above).
The validator coverage today is the controller-runtime fake-client unit test in src/semantic-router/pkg/k8s/reconciler_embedding_modality_test.go, which exercises all six validator branches. That coverage is sufficient until live-cluster validation becomes feasible.
Related
- docs/agent/tech-debt/td-040-reconcile-path-skips-family-validators.md (broader tech-debt write-up)