Skip to content

feat: move OpenShift Route and CRB management into kserve behind distro build tags#1404

Open
VedantMahabaleshwarkar wants to merge 5 commits intoopendatahub-io:masterfrom
VedantMahabaleshwarkar:rhoaieng-56729
Open

feat: move OpenShift Route and CRB management into kserve behind distro build tags#1404
VedantMahabaleshwarkar wants to merge 5 commits intoopendatahub-io:masterfrom
VedantMahabaleshwarkar:rhoaieng-56729

Conversation

@VedantMahabaleshwarkar
Copy link
Copy Markdown

@VedantMahabaleshwarkar VedantMahabaleshwarkar commented Apr 13, 2026

What this PR does / why we need it:

Moves OpenShift Route creation and auth-delegation ClusterRoleBinding management for raw InferenceServices from odh-model-controller into kserve behind //go:build distro tags using three hook pairs (_default.go / _ocp.go):

  1. Factory hook — selects RawOCPRouteReconciler instead of RawIngressReconciler in distro builds (when Gateway API is disabled)
  2. Controller setup hook — registers Route API scheme and Owns(&routev1.Route{}) watch
  3. Platform permissions hook — manages system:auth-delegator ClusterRoleBindings for auth-enabled ISVCs

Reverts shared files (kube_ingress_reconciler.go, utils.go) to match upstream kserve/master, eliminating ODH-specific modifications that caused rebase friction.

Adapts rawkube_controller_test.go for the distro pattern by splitting platform-specific ingress assertions into build-tagged test helper files. Moves ODH-only auth tests to the distro test file. All shared tests use higher-level helpers that are no-ops in distro mode for cluster-local ISVCs (which don't create Routes).

Which issue(s) this PR fixes:
Ref: RHOAIENG-56270

Feature/Issue validation/testing:

  • 10 unit tests for RawOCPRouteReconciler (exposed/cluster-local, auth/no-auth, stopped ISVC, label removal, transformer routing, timeout, route update, not-admitted requeue)
  • 8 unit tests for reconcilePlatformPermissions (CRB create/update/delete lifecycle, SA variants, deleting-ISVC exclusion)
  • 2 integration tests for ODH auth flow (full controller flow with SAR configMap, oauth-proxy sidecar, Route admission simulation)
  • All 113 non-distro tests pass (make test GOTAGS="")
  • All distro tests pass (make test GOTAGS="distro")
  • Both build modes compile cleanly

make test GOTAGS=""
make test GOTAGS="distro"

Special notes for your reviewer:

  • The distro integration test stubs (assertPlatformIngressExists, etc.) in rawkube_platform_ocp_test.go are currently no-ops for shared tests that create cluster-local ISVCs (which don't produce Routes in distro mode). Full Route assertion coverage for non-auth exposed ISVCs is planned as follow-up work.
  • CRB permissions (clusterrolebindings CRUD) are only in the distro overlay RBAC, not in the upstream config/rbac/role.yaml.

Companion PR (odh-model-controller):

The corresponding removal of the now-redundant KserveRawRouteReconciler and KserveRawClusterRoleBindingReconciler from odh-model-controller is in opendatahub-io/odh-model-controller#803. Both PRs must ship in the same ODH release to avoid either:

  • Dual-controller conflicts (if kserve merges first without the odh-mc removal): infinite reconciliation loop from Route label fights (isvc.Labels vs {"inferenceservice-name": isvc.Name})
  • No controller managing Routes/CRBs (if odh-mc merges first without this PR): existing resources persist but new ISVCs won't get Routes/CRBs

Upgrade safety: Existing Routes (owned by the ISVC via SetControllerReference) and CRBs (cluster-scoped, identical naming <ns>-<sa>-auth-delegator) survive the transition and are adopted by the new kserve distro reconciler.

Checklist:

  • Have you added unit/e2e tests that prove your fix is effective or that this feature works?
  • Has code been commented, particularly in hard-to-understand areas?
  • Have you made corresponding changes to the documentation?
  • Have you linked the JIRA issue(s) to this PR?

Release note:

Move OpenShift Route creation and auth-delegation ClusterRoleBinding management for raw InferenceServices into kserve behind distro build tags, eliminating cross-controller race conditions with odh-model-controller and reducing upstream rebase friction.

Summary by CodeRabbit

  • New Features

    • OpenShift Route support to expose InferenceServices with TLS and auth-aware routing.
    • Automatic creation and management of auth-delegation ClusterRoleBindings and an accompanying distro RBAC role manifest.
  • Behavior Changes

    • Ingress handling now distinguishes exposed vs. cluster-local modes and updates status URLs/addresses accordingly.
  • Tests

    • Added platform-specific tests covering Route behavior, auth flows, and CRB reconciliation.
  • Documentation

    • New plan/status doc detailing Route & CRB distro work and test progress.

@openshift-ci
Copy link
Copy Markdown

openshift-ci Bot commented Apr 13, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci
Copy link
Copy Markdown

openshift-ci Bot commented Apr 13, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: VedantMahabaleshwarkar

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:
  • OWNERS [VedantMahabaleshwarkar]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 13, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds distro-gated OpenShift support for InferenceService: new RawOCPRouteReconciler (distro) to manage Route lifecycle (exposed vs cluster-local, auth-driven TLS/port, admission handling) and a distro resolver to select that reconciler. Introduces build-tagged hooks to register Route scheme and Owns Routes, and platform-permissions reconciliation that creates/updates/deletes system:auth-delegator ClusterRoleBindings per InferenceService auth annotation. Removes shared Route URL utility, shifts ingress URL generation to ingress-config-driven logic, adds kubebuilder RBAC markers and a generated ClusterRole for distro, and introduces many distro/non-distro test helpers and Makefile RBAC generation changes.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: moving OpenShift Route and CRB management into kserve behind distro build tags, which is the core objective of this PR.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@VedantMahabaleshwarkar VedantMahabaleshwarkar marked this pull request as ready for review April 14, 2026 15:03
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 11

🧹 Nitpick comments (9)
config/overlays/odh/rbac/inferenceservice/role.yaml (1)

6-18: Broad clusterrolebindings permissions warrant documentation.

Full CRUD on clusterrolebindings without resourceNames restriction grants broad cluster-wide RBAC manipulation capability. While this is necessary for dynamic CRB management per-ISVC (auth-delegator pattern), consider:

  1. Documenting the security rationale in the generated role or an accompanying ADR
  2. The controller SA with these permissions becomes a high-value target; ensure it runs with minimal attack surface

No bind or escalate verbs present, so direct privilege escalation to arbitrary roles isn't possible, but a compromised controller could still manipulate existing CRBs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@config/overlays/odh/rbac/inferenceservice/role.yaml` around lines 6 - 18, The
role grants full CRUD on clusterrolebindings (rules -> resources:
clusterrolebindings with verbs: create, delete, get, list, patch, update,
watch), which is broad and should be documented and risk-mitigated; update the
role manifest by adding a clear security rationale (e.g., as an annotation on
the Role/ClusterRole or a linked ADR) explaining why dynamic CRB management is
required for the inferenceservice controller, and harden the controller service
account by reducing scope where possible (restrict verbs and/or add
resourceNames if you can target specific CRBs, ensure the controller runs with
minimal RBAC/Node capabilities, and note these mitigations in the same
annotation or ADR).
pkg/controller/v1beta1/inferenceservice/rawkube_platform_ocp_test.go (2)

199-405: Large inline expected Deployment spec reduces test maintainability.

This 200+ line struct literal is difficult to review and update. Consider extracting a helper function or using a fixture file. Changes to deployment defaults will require updating this block.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/v1beta1/inferenceservice/rawkube_platform_ocp_test.go` around
lines 199 - 405, The test embeds a huge inline Deployment struct in
expectedDeployment which harms maintainability; extract that literal into a
helper (e.g., makeExpectedDeployment or loadExpectedDeploymentFixture) that
accepts the dynamic inputs used in the test (predictorDeploymentKey, serviceKey,
isvc, replicas, defaultResource, gracePeriod, revisionHistory,
progressDeadlineSeconds) and returns *appsv1.Deployment, then replace the inline
expectedDeployment with a call to that helper or load from a small YAML/JSON
fixture; ensure the helper constructs the same fields referenced in assertions
(metadata name/namespace, pod template labels/annotations, containers including
InferenceServiceContainerName and KubeRbacContainerName, volumes, probes,
resources, and Strategy) so future changes to defaults can be updated in one
place.

104-105: Inconsistent context usage: context.TODO() vs context.Background().

The test defines ctx := context.Background() at line 145, but uses context.TODO() at lines 104, 139, 193, 417, 461, 502, 526, 562, 617, 636, 656. Pick one for consistency—ctx is already scoped correctly.

Also applies to: 139-140, 193-194

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/v1beta1/inferenceservice/rawkube_platform_ocp_test.go` around
lines 104 - 105, Replace all uses of context.TODO() in this test with the
existing ctx variable (context.Background()) to ensure consistent context usage;
for example change k8sClient.Create(context.TODO(), configMap) and
k8sClient.Delete(context.TODO(), configMap) as well as other k8sClient calls
that pass context.TODO() to use ctx instead, and update any helper calls within
the test that currently pass context.TODO() to accept ctx so the same context is
used throughout.
pkg/controller/v1beta1/inferenceservice/platform_permissions_ocp_test.go (2)

36-41: Scheme registration errors silently ignored (same pattern as route reconciler tests).

Handle or panic on scheme errors
 func permTestScheme() *runtime.Scheme {
 	s := runtime.NewScheme()
-	_ = v1beta1.AddToScheme(s)
-	_ = rbacv1.AddToScheme(s)
+	if err := v1beta1.AddToScheme(s); err != nil {
+		panic(fmt.Sprintf("failed to add v1beta1 to scheme: %v", err))
+	}
+	if err := rbacv1.AddToScheme(s); err != nil {
+		panic(fmt.Sprintf("failed to add rbacv1 to scheme: %v", err))
+	}
 	return s
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/v1beta1/inferenceservice/platform_permissions_ocp_test.go`
around lines 36 - 41, permTestScheme currently ignores errors from
runtime.NewScheme() registrations; update it to handle or fail fast on
registration errors by checking the returned errors from v1beta1.AddToScheme(s)
and rbacv1.AddToScheme(s) inside permTestScheme(), and either return the error
to the caller or panic/log.Fatalf with a clear message including the failing
function name (e.g., v1beta1.AddToScheme, rbacv1.AddToScheme) so scheme
registration failures are not silently suppressed.

108-134: Missing test for RoleRef change scenario.

TestReconcilePlatformPermissions_AuthEnabled_CRBDiffers only tests subject drift. The production code at platform_permissions_ocp.go:99-107 has a distinct delete+recreate path when RoleRef differs. Add a test case where the existing CRB has a different RoleRef to cover that branch.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/v1beta1/inferenceservice/platform_permissions_ocp_test.go`
around lines 108 - 134, Add a new case to
TestReconcilePlatformPermissions_AuthEnabled_CRBDiffers that simulates a CRB
with a differing RoleRef: create existingCRB with the same Name
("ns1-default-auth-delegator") but a RoleRef.Name that is NOT
"system:auth-delegator" (and any Subjects), call reconcilePlatformPermissions,
then fetch the CRB and assert the RoleRef was replaced to the expected RoleRef
(APIGroup "rbac.authorization.k8s.io", Kind "ClusterRole", Name
"system:auth-delegator") and that Subjects/other expected fields were updated;
this will exercise the delete+recreate path in platform_permissions_ocp.go
(around the RoleRef diff logic).
pkg/controller/v1beta1/inferenceservice/platform_permissions_ocp.go (2)

161-173: Subject comparison is order-sensitive; may cause unnecessary updates.

equalCRB compares subjects positionally. If an external actor reorders the Subjects slice (same content, different order), this returns false and triggers an update. While currently only one subject is added, this could cause reconciliation churn if the pattern changes.

Consider sorting or using a set-based comparison
 func equalCRB(a, b *rbacv1.ClusterRoleBinding) bool {
 	if a.RoleRef != b.RoleRef {
 		return false
 	}
 	if len(a.Subjects) != len(b.Subjects) {
 		return false
 	}
+	// For single-subject case this is fine, but for future-proofing:
 	for i := range a.Subjects {
 		if a.Subjects[i] != b.Subjects[i] {
 			return false
 		}
 	}
 	return true
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/v1beta1/inferenceservice/platform_permissions_ocp.go` around
lines 161 - 173, equalCRB currently does an order-sensitive, positional
comparison of ClusterRoleBinding Subjects which can trigger needless updates if
the slice is reordered; modify equalCRB to compare Subjects in an
order-insensitive way by either sorting the Subjects before comparison or
building a set/map keyed by stable identity (e.g., Subject.Kind +
Subject.APIGroup + Subject.Namespace + Subject.Name) and comparing entries,
while still comparing RoleRef as before; update equalCRB to use that
set-based/sorted comparison so identical subject lists in different orders are
treated as equal.

99-107: Delete+Create race window for RoleRef changes.

Between Delete and Create, the CRB doesn't exist. If the controller restarts or another reconciliation runs concurrently, the auth-delegator permission is temporarily missing. Consider using a finalizer or ensuring idempotency through naming.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/v1beta1/inferenceservice/platform_permissions_ocp.go` around
lines 99 - 107, The current logic deletes and recreates the ClusterRoleBinding
when existing.RoleRef != desired.RoleRef which creates a race window (crbName,
cl.Delete, cl.Create, existing.RoleRef, desired.RoleRef); instead, modify the
reconcile to perform an in-place update (fetch existing, deep-copy, set
existing.RoleRef = desired.RoleRef and call cl.Update or use a server-side
apply/patch) wrapped in a retry-on-conflict loop (e.g., RetryOnConflict) so
concurrent reconciles safely reconcile RoleRef without removing the binding;
handle update errors and fall back to recreate only if update consistently fails
after retries.
pkg/controller/v1beta1/inferenceservice/rawkube_platform_default_test.go (1)

53-54: Unused parameter _ *v1beta1.InferenceService suggests potential signature drift.

The OCP counterpart at rawkube_platform_ocp_test.go:54 also has this unused parameter. If neither build tag variant uses it, consider removing it from both to keep the interface minimal. If the OCP variant needs it in the future, the signature divergence will cause build failures in the other mode.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/v1beta1/inferenceservice/rawkube_platform_default_test.go`
around lines 53 - 54, The test helper assertPlatformIngressSpec currently
declares an unused parameter `_ *v1beta1.InferenceService`; remove that unused
parameter from the function signature of assertPlatformIngressSpec and from its
counterpart in the OCP variant so both build-tag variants stay in sync, then
update any call sites to stop passing the extra argument (or alternatively, if
the InferenceService will be needed, use the parameter inside
assertPlatformIngressSpec in both variants); reference the symbol
assertPlatformIngressSpec and the unused parameter `_ *v1beta1.InferenceService`
when making the change.
pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/ocp_route_reconciler_test.go (1)

173-199: Test TestRawOCPRoute_ExposedWithAuth pre-creates route with wrong TLS termination.

Line 180 sets route.Spec.TLS.Termination = routev1.TLSTerminationReencrypt, but admittedRoute helper already sets TLSTerminationEdge at line 126. The test doesn't verify reconciler corrects the TLS termination—it only checks ISVC status. Consider asserting the route spec is updated to match auth-enabled expectations.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/ocp_route_reconciler_test.go`
around lines 173 - 199, The test TestRawOCPRoute_ExposedWithAuth pre-sets the
route TLS termination to routev1.TLSTerminationReencrypt even though
admittedRoute already sets TLSTerminationEdge, and the test never asserts that
the reconciler updates the Route resource to the expected TLS when auth is
enabled; update the test to (1) ensure the initial route state is clear (use
admittedRoute as-is or explicitly set TLS to the scenario you want), (2) after
calling newOCPReconciler(...).Reconcile(...), fetch or inspect the
reconciler-updated Route object and assert route.Spec.TLS.Termination equals the
expected value for auth-enabled flows, and (3) keep the existing ISVC status
assertions (isvc.Status.URL and isvc.Status.Address checks) while adding the new
assertion referencing admittedRoute, TestRawOCPRoute_ExposedWithAuth,
route.Spec.TLS.Termination, newOCPReconciler, and constants.OauthProxyPort so
the test verifies the Route is corrected by the reconciler.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@charts/kserve-resources/files/kserve/resources.yaml`:
- Around line 234-245: The kserve-manager-role in files/kserve/resources.yaml
grants unconstrained CRUD on clusterrolebindings to all deployments even though
reconcilePlatformPermissions (implemented as a no-op in
platform_permissions_default.go for non-distro builds) never uses it; remove or
constrain that rule: either delete the clusterrolebindings rule from
files/kserve/resources.yaml and let distro overlays add it, or modify the rule
to restrict resourceNames to kserve-inferencegraph-auth-verifiers (matching the
existing constrained rule at lines 246–256), or add a conditional patch to strip
the rule for non-distro builds so only distro deployments retain the elevated
RBAC.

In `@config/rbac/role.yaml`:
- Around line 176-187: The ClusterRole currently grants broad write verbs on the
cluster-wide resource "clusterrolebindings" (see the resources: -
clusterrolebindings and verbs: - create - patch - update - delete), which allows
ServiceAccount compromise to escalate to cluster-admin; restrict this by
removing write verbs for "clusterrolebindings" from this base ClusterRole,
replace them with either (a) no access here and move any needed creation/update
of ClusterRoleBindings into a distro-specific overlay
ClusterRole/ClusterRoleBinding, or (b) if absolutely required, tighten to
resourceNames to only specific binding names and/or change to a namespaced
Role/RoleBinding where possible; update the resources/verbs block for the
functions that reference clusterrolebindings and ensure tests/overlays are
updated accordingly.

In `@hack/setup/quick-install/kserve-knative-mode-full-install-with-manifests.sh`:
- Around line 37922-37933: The RBAC rule currently grants broad write access to
ClusterRoleBinding objects (resource "clusterrolebindings" with verbs
create/update/delete), enabling privilege escalation; restrict it by removing
create/update/delete and instead allow only the minimal verbs (e.g.,
get/list/watch and only patch if absolutely required) or restrict write
operations to specific resourceNames (explicit ClusterRoleBinding names) or
namespace-scoped RoleBinding resources used by the controller. Locate the RBAC
rule referencing "clusterrolebindings" in the manifest and modify it to (a) drop
create/delete/update verbs, or (b) add a resourceNames list limiting which
ClusterRoleBinding(s) can be modified, or (c) change the resource to
"rolebindings" scoped to the controller's namespace if cluster-wide binding is
not needed; ensure any remaining patch/write permission is justified and
documented.

In
`@hack/setup/quick-install/kserve-standard-mode-full-install-with-manifests.sh`:
- Around line 37508-37519: The current RBAC rule grants broad mutation rights on
the "clusterrolebindings" resource (verbs: create, delete, patch, update) which
allows a compromised installer/service account to tamper with cluster-wide
bindings; restrict this by removing destructive verbs (delete, patch, update) or
scoping resourceNames to only the KServe-managed ClusterRoleBindings (e.g.,
names prefixed with your controller's name), or replace the ClusterRole rule
with a Role that targets only the namespace and RoleBindings if possible; locate
the RBAC block that lists "apiGroups: - rbac.authorization.k8s.io" and
"resources: - clusterrolebindings" and change the verbs/resourceNames
accordingly to limit mutation to only the specific KServe clusterrolebinding
names or drop mutation entirely.

In `@pkg/controller/v1beta1/inferenceservice/controller.go`:
- Around line 393-397: When reconcilePlatformPermissions fails inside the
deploymentMode branch (when deploymentMode == constants.Standard ||
deploymentMode == constants.LegacyRawDeployment), call r.updateStatus(ctx, isvc,
deploymentMode) to persist the latest status before returning the wrapped error;
i.e., update the status in the same way other error paths do (see ingress and
modelconfig error handling) and then return reconcile.Result{},
errors.Wrapf(err, "fails to reconcile platform permissions") so transient
CRB/permission failures don't leave stale status on the InferenceService.

In `@pkg/controller/v1beta1/inferenceservice/platform_permissions_ocp.go`:
- Around line 52-54: The crbName built as isvc.Namespace + "-" + saName +
"-auth-delegator can collide (e.g. "a"+"b-c" vs "a-b"+"c"); replace this plain
concatenation in the CRB name construction (the crbName variable that uses
isvc.Namespace and saName) with a deterministic, safe name: keep a short
readable prefix (e.g., isvc.Name or "isvc-auth-delegator") and append a hash
(e.g., SHA256/MD5 truncated hex) computed over the tuple isvc.Namespace + "|" +
saName (the pipe is only for hashing, not part of the final name) so the final
crbName is deterministic, unique, and avoids separator ambiguity while
respecting k8s name length limits.

In `@pkg/controller/v1beta1/inferenceservice/rawkube_controller_test.go`:
- Line 1753: Tests now skip validating Route creation/spec/deletion and external
status URL because OCP helpers (used by assertPlatformIngressExists and related
helpers) are no-ops; update the OCP-specific helpers in
pkg/controller/v1beta1/inferenceservice/rawkube_platform_ocp_test.go so that
assertPlatformIngressExists, assertPlatformIngressDeleted, and
expectedIsvcStatusURL actually verify routev1.Route existence, its spec
(host/target), deletion, and return the external Route host instead of
cluster-local host, or alternatively add distro-only assertions in the
rawkube_controller_test helpers to explicitly assert route
creation/spec/deletion and the Route-backed status URL for RawOCPRouteReconciler
scenarios (reference the helper names assertPlatformIngressExists,
assertPlatformIngressDeleted, expectedIsvcStatusURL and the
RawOCPRouteReconciler behavior).

In
`@pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/ocp_route_reconciler.go`:
- Around line 86-102: When forceStopRuntime is true we delete the Route and set
IngressReady false but never clear isvc.Status.URL, leaving a stale host; inside
the forceStopRuntime block (after r.client.Delete and before returning) clear
the URL on the InferenceService status (set isvc.Status.URL to empty/nil) and
persist the change via the status client (r.client.Status().Update(ctx, isvc) or
the project’s status update helper) so callers that only inspect isvc.Status.URL
don’t see the old Route host; ensure this happens even when the Route was
already not found and keep the existing condition update (IngressReady /
StoppedISVCReason).
- Around line 317-334: The setRouteTargetPort function silently falls back to
the first port when no port is named "http"/"https", which can misroute traffic;
update setRouteTargetPort to (1) if svc.Spec.Ports is empty return the existing
error, (2) if there is exactly one port return it but emit a warning log
mentioning svc.Namespace/svc.Name and the chosen port, and (3) if there are
multiple ports and none match desiredName return an explicit error instead of
silently picking the first. Use the reconciler's logger (or existing logging
facility) to record the warning and reference setRouteTargetPort,
svc.Spec.Ports, and desiredName when implementing the change.
- Around line 202-228: In setAddress, the ClusterIPNone port append is lost when
authEnabled is true because host is recomputed and overwrites the earlier
change; fix by using a single baseHost (getRawServiceHost(isvc)) and conditional
logic: detect entryPointSvc.Spec.ClusterIP == corev1.ClusterIPNone first, then
set host based on authEnabled — if authEnabled set host = baseHost + ":" +
strconv.Itoa(constants.OauthProxyPort) and addrScheme = "https", else if
ClusterIPNone set host = baseHost + ":" +
constants.InferenceServiceDefaultHttpPort and addrScheme = "http"; update
isvc.Status.Address accordingly in setAddress so the ClusterIPNone handling
(constants.InferenceServiceDefaultHttpPort) is not overridden by the authEnabled
branch.
- Around line 362-366: semanticRouteEquals currently DeepEqual's full
Annotations which causes reconciliation loops because OpenShift's router adds
unmanaged annotations; change equality logic to compare only managed annotation
keys (the ones set by buildDesiredRoute such as
"haproxy.router.openshift.io/timeout") by filtering both desired.Annotations and
existing.Annotations to that managed-key subset before comparing. Update
semanticRouteEquals to use a helper that extracts only managed keys from the
annotation maps and compare those, and ensure code that applies annotations
(where existingRoute.Annotations is overwritten) only sets the managed keys
instead of replacing the entire map.

---

Nitpick comments:
In `@config/overlays/odh/rbac/inferenceservice/role.yaml`:
- Around line 6-18: The role grants full CRUD on clusterrolebindings (rules ->
resources: clusterrolebindings with verbs: create, delete, get, list, patch,
update, watch), which is broad and should be documented and risk-mitigated;
update the role manifest by adding a clear security rationale (e.g., as an
annotation on the Role/ClusterRole or a linked ADR) explaining why dynamic CRB
management is required for the inferenceservice controller, and harden the
controller service account by reducing scope where possible (restrict verbs
and/or add resourceNames if you can target specific CRBs, ensure the controller
runs with minimal RBAC/Node capabilities, and note these mitigations in the same
annotation or ADR).

In `@pkg/controller/v1beta1/inferenceservice/platform_permissions_ocp_test.go`:
- Around line 36-41: permTestScheme currently ignores errors from
runtime.NewScheme() registrations; update it to handle or fail fast on
registration errors by checking the returned errors from v1beta1.AddToScheme(s)
and rbacv1.AddToScheme(s) inside permTestScheme(), and either return the error
to the caller or panic/log.Fatalf with a clear message including the failing
function name (e.g., v1beta1.AddToScheme, rbacv1.AddToScheme) so scheme
registration failures are not silently suppressed.
- Around line 108-134: Add a new case to
TestReconcilePlatformPermissions_AuthEnabled_CRBDiffers that simulates a CRB
with a differing RoleRef: create existingCRB with the same Name
("ns1-default-auth-delegator") but a RoleRef.Name that is NOT
"system:auth-delegator" (and any Subjects), call reconcilePlatformPermissions,
then fetch the CRB and assert the RoleRef was replaced to the expected RoleRef
(APIGroup "rbac.authorization.k8s.io", Kind "ClusterRole", Name
"system:auth-delegator") and that Subjects/other expected fields were updated;
this will exercise the delete+recreate path in platform_permissions_ocp.go
(around the RoleRef diff logic).

In `@pkg/controller/v1beta1/inferenceservice/platform_permissions_ocp.go`:
- Around line 161-173: equalCRB currently does an order-sensitive, positional
comparison of ClusterRoleBinding Subjects which can trigger needless updates if
the slice is reordered; modify equalCRB to compare Subjects in an
order-insensitive way by either sorting the Subjects before comparison or
building a set/map keyed by stable identity (e.g., Subject.Kind +
Subject.APIGroup + Subject.Namespace + Subject.Name) and comparing entries,
while still comparing RoleRef as before; update equalCRB to use that
set-based/sorted comparison so identical subject lists in different orders are
treated as equal.
- Around line 99-107: The current logic deletes and recreates the
ClusterRoleBinding when existing.RoleRef != desired.RoleRef which creates a race
window (crbName, cl.Delete, cl.Create, existing.RoleRef, desired.RoleRef);
instead, modify the reconcile to perform an in-place update (fetch existing,
deep-copy, set existing.RoleRef = desired.RoleRef and call cl.Update or use a
server-side apply/patch) wrapped in a retry-on-conflict loop (e.g.,
RetryOnConflict) so concurrent reconciles safely reconcile RoleRef without
removing the binding; handle update errors and fall back to recreate only if
update consistently fails after retries.

In `@pkg/controller/v1beta1/inferenceservice/rawkube_platform_default_test.go`:
- Around line 53-54: The test helper assertPlatformIngressSpec currently
declares an unused parameter `_ *v1beta1.InferenceService`; remove that unused
parameter from the function signature of assertPlatformIngressSpec and from its
counterpart in the OCP variant so both build-tag variants stay in sync, then
update any call sites to stop passing the extra argument (or alternatively, if
the InferenceService will be needed, use the parameter inside
assertPlatformIngressSpec in both variants); reference the symbol
assertPlatformIngressSpec and the unused parameter `_ *v1beta1.InferenceService`
when making the change.

In `@pkg/controller/v1beta1/inferenceservice/rawkube_platform_ocp_test.go`:
- Around line 199-405: The test embeds a huge inline Deployment struct in
expectedDeployment which harms maintainability; extract that literal into a
helper (e.g., makeExpectedDeployment or loadExpectedDeploymentFixture) that
accepts the dynamic inputs used in the test (predictorDeploymentKey, serviceKey,
isvc, replicas, defaultResource, gracePeriod, revisionHistory,
progressDeadlineSeconds) and returns *appsv1.Deployment, then replace the inline
expectedDeployment with a call to that helper or load from a small YAML/JSON
fixture; ensure the helper constructs the same fields referenced in assertions
(metadata name/namespace, pod template labels/annotations, containers including
InferenceServiceContainerName and KubeRbacContainerName, volumes, probes,
resources, and Strategy) so future changes to defaults can be updated in one
place.
- Around line 104-105: Replace all uses of context.TODO() in this test with the
existing ctx variable (context.Background()) to ensure consistent context usage;
for example change k8sClient.Create(context.TODO(), configMap) and
k8sClient.Delete(context.TODO(), configMap) as well as other k8sClient calls
that pass context.TODO() to use ctx instead, and update any helper calls within
the test that currently pass context.TODO() to accept ctx so the same context is
used throughout.

In
`@pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/ocp_route_reconciler_test.go`:
- Around line 173-199: The test TestRawOCPRoute_ExposedWithAuth pre-sets the
route TLS termination to routev1.TLSTerminationReencrypt even though
admittedRoute already sets TLSTerminationEdge, and the test never asserts that
the reconciler updates the Route resource to the expected TLS when auth is
enabled; update the test to (1) ensure the initial route state is clear (use
admittedRoute as-is or explicitly set TLS to the scenario you want), (2) after
calling newOCPReconciler(...).Reconcile(...), fetch or inspect the
reconciler-updated Route object and assert route.Spec.TLS.Termination equals the
expected value for auth-enabled flows, and (3) keep the existing ISVC status
assertions (isvc.Status.URL and isvc.Status.Address checks) while adding the new
assertion referencing admittedRoute, TestRawOCPRoute_ExposedWithAuth,
route.Spec.TLS.Termination, newOCPReconciler, and constants.OauthProxyPort so
the test verifies the Route is corrected by the reconciler.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro Plus

Run ID: 66777641-ee1a-43bc-a102-9ba618bce980

📥 Commits

Reviewing files that changed from the base of the PR and between 94b5e69 and af145d0.

📒 Files selected for processing (24)
  • Makefile.overrides.mk
  • charts/kserve-resources/files/kserve/resources.yaml
  • config/overlays/odh/rbac/inferenceservice/role.yaml
  • config/rbac/role.yaml
  • docs/superpowers/plans/2026-04-13-route-crb-distro-hooks-progress.md
  • hack/setup/quick-install/kserve-knative-mode-full-install-with-manifests.sh
  • hack/setup/quick-install/kserve-standard-mode-full-install-with-manifests.sh
  • pkg/controller/v1beta1/inferenceservice/controller.go
  • pkg/controller/v1beta1/inferenceservice/controller_setup_default.go
  • pkg/controller/v1beta1/inferenceservice/controller_setup_ocp.go
  • pkg/controller/v1beta1/inferenceservice/distro/controller_rbac_ocp.go
  • pkg/controller/v1beta1/inferenceservice/platform_permissions_default.go
  • pkg/controller/v1beta1/inferenceservice/platform_permissions_ocp.go
  • pkg/controller/v1beta1/inferenceservice/platform_permissions_ocp_test.go
  • pkg/controller/v1beta1/inferenceservice/rawkube_controller_test.go
  • pkg/controller/v1beta1/inferenceservice/rawkube_platform_default_test.go
  • pkg/controller/v1beta1/inferenceservice/rawkube_platform_ocp_test.go
  • pkg/controller/v1beta1/inferenceservice/reconcilers/factory.go
  • pkg/controller/v1beta1/inferenceservice/reconcilers/factory_default.go
  • pkg/controller/v1beta1/inferenceservice/reconcilers/factory_ocp.go
  • pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/kube_ingress_reconciler.go
  • pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/ocp_route_reconciler.go
  • pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/ocp_route_reconciler_test.go
  • pkg/controller/v1beta1/inferenceservice/utils/utils.go
💤 Files with no reviewable changes (1)
  • pkg/controller/v1beta1/inferenceservice/utils/utils.go

Comment thread charts/kserve-resources/files/kserve/resources.yaml Outdated
Comment thread config/rbac/role.yaml Outdated
Comment thread hack/setup/quick-install/kserve-knative-mode-full-install-with-manifests.sh Outdated
Comment thread hack/setup/quick-install/kserve-standard-mode-full-install-with-manifests.sh Outdated
Comment thread pkg/controller/v1beta1/inferenceservice/controller.go
Comment thread pkg/controller/v1beta1/inferenceservice/rawkube_controller_test.go
@bartoszmajsak
Copy link
Copy Markdown

/group-test
/retest-required

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 15, 2026

Caution

Failed to replace (edit) comment. This is likely due to insufficient permissions or the comment being deleted.

Error details
{}

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
pkg/controller/v1beta1/inferenceservice/controller.go (1)

746-748: Minor inconsistency: fmt.Errorf vs errors.Wrapf.

This segment uses fmt.Errorf("...: %w", err) while the rest of the file uses errors.Wrapf(err, "...") from github.com/pkg/errors (e.g., lines 141, 374, 383). Both work, but mixing styles reduces grep-ability and stack trace consistency.

Proposed alignment with existing style
 	if err := extendControllerSetup(mgr, ctrlBuilder); err != nil {
-		return fmt.Errorf("failed to extend controller setup: %w", err)
+		return errors.Wrap(err, "failed to extend controller setup")
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/v1beta1/inferenceservice/controller.go` around lines 746 -
748, Replace the fmt.Errorf usage with the project's standard errors.Wrapf style
to keep error-wrapping and stack traces consistent; specifically update the
return in the extendControllerSetup call to use errors.Wrapf(err, "failed to
extend controller setup") (the call referencing extendControllerSetup, mgr and
ctrlBuilder) and ensure github.com/pkg/errors is imported if not already.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/ocp_route_reconciler.go`:
- Around line 79-84: When reconciling Routes (existingRoute retrieved via
r.client.Get for the InferenceService name), if a Route already exists but is
not owned by the current InferenceService (check owner references/ControllerRef
on existingRoute versus isvc UID), do not update or adopt it; instead reject the
reconcile by returning an error or requeue (e.g., fmt.Errorf("route %s/%s exists
and is not owned by InferenceService", existingRoute.Namespace,
existingRoute.Name)). Apply the same ownership guard wherever you create/update
Routes in this reconciler (the blocks around the initial Get and the
create/update logic referenced later in the file) so you only set
OwnerReferences or mutate Routes that are already controlled by the
InferenceService.
- Around line 126-136: When a Route is created or observed as not admitted (in
the branch handling routeNotFound and the similar branch around lines 154-161),
clear the stale external endpoint by resetting isvc.Status.URL to nil/empty
before setting the IngressReady condition and requeuing; update the code paths
around the creation logic that reference desiredRoute and the
isvc.Status.SetCondition(v1beta1.IngressReady, ...) call to explicitly remove
any previous isvc.Status.URL so the service stops publishing an out-of-date
ingress URL while the Route is not admitted.
- Around line 139-147: The code currently only merges desiredRoute.Annotations
into existingRoute.Annotations which means removed keys (e.g.
haproxy.router.openshift.io/timeout) never get deleted; update the
reconciliation in the block guarded by semanticRouteEquals (where
existingRoute.Spec/Labels are assigned) to fully reconcile annotations instead
of only merging: ensure existingRoute.Annotations contains exactly the
keys/values present in desiredRoute.Annotations for managed keys by removing any
keys present on existingRoute but absent from desiredRoute (or replace
existingRoute.Annotations with a shallow copy of desiredRoute.Annotations if no
other non-managed annotations must be preserved); apply the same change to the
other analogous reconciliation sites that use the same merge pattern.

---

Nitpick comments:
In `@pkg/controller/v1beta1/inferenceservice/controller.go`:
- Around line 746-748: Replace the fmt.Errorf usage with the project's standard
errors.Wrapf style to keep error-wrapping and stack traces consistent;
specifically update the return in the extendControllerSetup call to use
errors.Wrapf(err, "failed to extend controller setup") (the call referencing
extendControllerSetup, mgr and ctrlBuilder) and ensure github.com/pkg/errors is
imported if not already.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro Plus

Run ID: e117d6be-d67b-425a-88a0-6f900926b51c

📥 Commits

Reviewing files that changed from the base of the PR and between af145d0 and 7fff852.

📒 Files selected for processing (3)
  • Makefile
  • pkg/controller/v1beta1/inferenceservice/controller.go
  • pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/ocp_route_reconciler.go

Comment on lines +79 to +84
existingRoute := &routev1.Route{}
getErr := r.client.Get(ctx, types.NamespacedName{Name: isvc.Name, Namespace: isvc.Namespace}, existingRoute)
routeNotFound := apierr.IsNotFound(getErr)
if getErr != nil && !routeNotFound {
return ctrl.Result{}, fmt.Errorf("failed to get existing Route: %w", getErr)
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Guard against same-name Route takeover.

If a Route named isvc.Name already exists but is not controlled by this InferenceService, this code still updates it and can mark the ISVC ready from that foreign object's admission state. Reject the reconcile instead of mutating or trusting an unowned Route.

Suggested fix
 existingRoute := &routev1.Route{}
 getErr := r.client.Get(ctx, types.NamespacedName{Name: isvc.Name, Namespace: isvc.Namespace}, existingRoute)
 routeNotFound := apierr.IsNotFound(getErr)
 if getErr != nil && !routeNotFound {
 	return ctrl.Result{}, fmt.Errorf("failed to get existing Route: %w", getErr)
 }
+if !routeNotFound {
+	controller := metav1.GetControllerOf(existingRoute)
+	if controller == nil || controller.UID != isvc.UID {
+		return ctrl.Result{}, fmt.Errorf(
+			"route %s/%s already exists and is not controlled by InferenceService %s/%s",
+			existingRoute.Namespace, existingRoute.Name, isvc.Namespace, isvc.Name,
+		)
+	}
+}

As per coding guidelines, "Set OwnerReferences on all child resources".

Also applies to: 126-175

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/ocp_route_reconciler.go`
around lines 79 - 84, When reconciling Routes (existingRoute retrieved via
r.client.Get for the InferenceService name), if a Route already exists but is
not owned by the current InferenceService (check owner references/ControllerRef
on existingRoute versus isvc UID), do not update or adopt it; instead reject the
reconcile by returning an error or requeue (e.g., fmt.Errorf("route %s/%s exists
and is not owned by InferenceService", existingRoute.Namespace,
existingRoute.Name)). Apply the same ownership guard wherever you create/update
Routes in this reconciler (the blocks around the initial Get and the
create/update logic referenced later in the file) so you only set
OwnerReferences or mutate Routes that are already controlled by the
InferenceService.

@bartoszmajsak
Copy link
Copy Markdown

/retest-required

@bartoszmajsak
Copy link
Copy Markdown

/group-test

VedantMahabaleshwarkar added a commit to VedantMahabaleshwarkar/odh-model-controller that referenced this pull request Apr 22, 2026
KServe PR opendatahub-io/kserve#1404 moves OpenShift Route creation and
system:auth-delegator ClusterRoleBinding management for raw
InferenceServices into kserve behind distro build tags. This removes the
now-redundant reconcilers from odh-model-controller to avoid
dual-controller conflicts (infinite reconciliation loops from label
fights) once both ship in the same ODH release.

Removed:
- KserveRawRouteReconciler and KserveRawClusterRoleBindingReconciler
- Supporting handlers (RouteHandler, ClusterRoleBindingHandler)
- Comparators (route, clusterrolebinding)
- SetOpenshiftRouteTimeoutForIsvc utility
- Integration tests for Route/CRB lifecycle
- Owns watches for Route and ClusterRoleBinding
- Orphaned constants (KserveNetworkVisibility, LabelEnableKserveRawRoute,
  KserveServiceAccountName, DefaultOpenshiftRouteTimeout,
  RouteTimeoutAnnotationKey)

Retained:
- RBAC markers for routes and clusterrolebindings (cleanup deferred)
- Metrics, ServiceMonitor, Dashboard, and KEDA reconcilers
- Finalizer and namespace cleanup orchestration
- Scheme registration for routev1

Ref: RHOAIENG-56270

Signed-off-by: Vedant Mahabaleshwarkar <vmahabal@redhat.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/ocp_route_reconciler.go (1)

79-84: ⚠️ Potential issue | 🟠 Major

Route ownership not validated before update path (CWE-284: Improper Access Control).

When a Route already exists (line 139), the code updates it without verifying ownership. If a Route with the same name exists but is owned by another controller/resource, this reconciler will overwrite it, potentially disrupting other workloads.

Past review flagged this — still unaddressed.

Add ownership check before update
 existingRoute := &routev1.Route{}
 getErr := r.client.Get(ctx, types.NamespacedName{Name: isvc.Name, Namespace: isvc.Namespace}, existingRoute)
 routeNotFound := apierr.IsNotFound(getErr)
 if getErr != nil && !routeNotFound {
 	return ctrl.Result{}, fmt.Errorf("failed to get existing Route: %w", getErr)
 }
+if !routeNotFound {
+	if controller := metav1.GetControllerOf(existingRoute); controller == nil || controller.UID != isvc.UID {
+		return ctrl.Result{}, fmt.Errorf("route %s/%s exists but is not owned by InferenceService %s",
+			existingRoute.Namespace, existingRoute.Name, isvc.Name)
+	}
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/ocp_route_reconciler.go`
around lines 79 - 84, The reconciler updates an existing Route (existingRoute)
without verifying ownership, so before performing any update/patch (the code
path after r.client.Get where existingRoute is modified) check that the Route is
owned by the InferenceService (isvc) by inspecting existingRoute.OwnerReferences
(or using metav1.IsControlledBy / controllerutil.ReferencesEqual) and matching
the owner UID/Kind/Name to isvc; if the Route is not controlled by isvc, do not
update it — instead return a clear error or requeue with no-op to avoid
overwriting someone else’s resource. Ensure this ownership check occurs
immediately after fetching existingRoute and before any r.client.Update /
r.client.Patch calls.
🧹 Nitpick comments (4)
pkg/controller/v1beta1/inferenceservice/rawkube_platform_ocp_test.go (2)

407-409: Assertion message will show raw diff on failure.

cmp.Diff output is passed to Equal() matcher's diff message, but on equality the diff is empty string. Consider using a custom matcher or separate assertion for clearer failure output.

Clearer assertion pattern
-		Expect(actualDeployment.Spec).To(Equal(expectedDeployment.Spec),
-			cmp.Diff(expectedDeployment.Spec, actualDeployment.Spec, cmpopts.SortMaps(func(a, b string) bool { return a < b })))
+		diff := cmp.Diff(expectedDeployment.Spec, actualDeployment.Spec, cmpopts.SortMaps(func(a, b string) bool { return a < b }))
+		Expect(diff).To(BeEmpty(), "Deployment spec mismatch")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/v1beta1/inferenceservice/rawkube_platform_ocp_test.go` around
lines 407 - 409, The test currently calls
Expect(actualDeployment.Spec).To(Equal(expectedDeployment.Spec), cmp.Diff(...))
which hides the diff on success; replace this with an assertion that explicitly
checks the cmp.Diff result so failures show a meaningful diff: compute diff :=
cmp.Diff(expectedDeployment.Spec, actualDeployment.Spec, cmpopts.SortMaps(...))
and assert Expect(diff).To(BeEmpty(), "deployment spec mismatch:\n%s", diff) (or
alternatively assert Expect(diff).To(Equal(""), ...)). Update references to
actualDeployment.Spec, expectedDeployment.Spec, cmp.Diff, and cmpopts.SortMaps
accordingly.

104-105: Inconsistent context usage.

Line 104 uses context.TODO() while the surrounding test uses ctx := context.Background() (line 145). Use consistent context throughout for clearer test debugging.

Use consistent context
-		Expect(k8sClient.Create(context.TODO(), configMap)).NotTo(HaveOccurred())
-		defer k8sClient.Delete(context.TODO(), configMap)
+		Expect(k8sClient.Create(ctx, configMap)).NotTo(HaveOccurred())
+		defer k8sClient.Delete(ctx, configMap)

Note: ctx is declared later at line 145. Either move the ctx declaration earlier or use context.Background() consistently.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/v1beta1/inferenceservice/rawkube_platform_ocp_test.go` around
lines 104 - 105, The test uses context.TODO() for k8sClient.Create/Delete but
the rest of the test uses ctx := context.Background(); make this consistent by
using the same ctx variable for those calls: either move the ctx :=
context.Background() declaration earlier so k8sClient.Create(context.TODO(),
configMap) and defer k8sClient.Delete(...) become k8sClient.Create(ctx,
configMap) and defer k8sClient.Delete(ctx, configMap), or replace the later ctx
declaration with context.Background() and keep the original calls; update the
k8sClient.Create and defer k8sClient.Delete invocations accordingly and ensure
the unique identifiers k8sClient.Create, k8sClient.Delete and ctx are used
consistently.
pkg/controller/v1beta1/inferenceservice/platform_permissions_ocp.go (2)

161-173: Subject comparison is order-sensitive.

equalCRB compares subjects by index position. Currently safe since each CRB has exactly one subject, but if multi-subject scenarios are introduced, this will cause unnecessary updates when subjects are reordered.

Optional: Use set-based comparison for future-proofing
 func equalCRB(a, b *rbacv1.ClusterRoleBinding) bool {
 	if a.RoleRef != b.RoleRef {
 		return false
 	}
 	if len(a.Subjects) != len(b.Subjects) {
 		return false
 	}
-	for i := range a.Subjects {
-		if a.Subjects[i] != b.Subjects[i] {
+	subjectSet := make(map[rbacv1.Subject]struct{}, len(a.Subjects))
+	for _, s := range a.Subjects {
+		subjectSet[s] = struct{}{}
+	}
+	for _, s := range b.Subjects {
+		if _, ok := subjectSet[s]; !ok {
 			return false
 		}
 	}
 	return true
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/v1beta1/inferenceservice/platform_permissions_ocp.go` around
lines 161 - 173, The equalCRB function currently compares
rbacv1.ClusterRoleBinding.RoleRef by value and compares a.Subjects to b.Subjects
by index, which is order-sensitive; change equalCRB to compare subjects as an
order-independent set (e.g., build a map keyed by each Subject's canonical
identity such as Kind/Name/Namespace/APIGroup, verify both maps have same size
and same keys) so reorders don't trigger updates; keep the existing RoleRef
comparison but replace the indexed loop over a.Subjects/b.Subjects with the
set/map-based membership checks.

67-83: ClusterRoleBinding lacks OwnerReference — manual cleanup required on ISVC deletion.

ClusterRoleBindings are cluster-scoped and cannot have namespace-scoped OwnerReferences. The current implementation relies on otherISVCsNeedAuthCRB during reconciliation, but if all ISVCs in a namespace are deleted simultaneously or the controller is unavailable, orphaned CRBs may persist.

Consider adding a finalizer to the InferenceService to ensure CRB cleanup during deletion, or document the orphan risk. As per coding guidelines, "Set OwnerReferences on all child resources" — this is a known limitation for cluster-scoped resources.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/v1beta1/inferenceservice/platform_permissions_ocp.go` around
lines 67 - 83, The ClusterRoleBinding created in platform_permissions_ocp.go
(constructed as desired with crbName, Subjects using saName and isvc.Namespace,
RoleRef to system:auth-delegator) is cluster-scoped and cannot use namespace
OwnerReferences, so add a deletion safeguard: implement a finalizer on the
InferenceService (e.g., add/remove a finalizer string in the Reconcile flow) and
update the deletion handling to explicitly delete the ClusterRoleBinding when
the InferenceService is being deleted (use the same crbName lookup and ensure
otherISVCsNeedAuthCRB is consulted before finalizer removal), or alternatively
document the orphan risk; ensure you set/clear the finalizer in create/update
paths and only remove it after successfully deleting the ClusterRoleBinding.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/ocp_route_reconciler.go`:
- Around line 139-147: The update path currently only merges
desiredRoute.Annotations into existingRoute.Annotations which causes removed
annotations to persist; change the logic so existingRoute.Annotations are
synchronized bidirectionally with desiredRoute.Annotations (i.e., replace or
prune existing keys not present in desired and handle nil desired annotations by
clearing existing annotations), and update semanticRouteEquals to check for
exact equality of annotation sets (both presence and values) rather than only
ensuring desired keys exist in existing; use the existingRoute and desiredRoute
variables and the semanticRouteEquals function to locate and apply this fix.
- Around line 126-136: When the Route is created but not yet admitted
(routeNotFound branch creating desiredRoute) and when admission fails (the block
setting IngressReady to False with Reason "RouteNotAdmitted"), clear the stale
isvc.Status.URL before returning; update the same places where you set
isvc.Status.SetCondition(..., Reason: "RouteNotAdmitted") to also set
isvc.Status.URL = nil or zero-value so previously-admitted URLs aren't left
visible. Locate these changes around the route creation branch (references:
routeNotFound, desiredRoute, routeAdmissionRequeueDelay) and the
admission-failure branch (references: IngressReady condition, Reason
"RouteNotAdmitted") and ensure the status URL is cleared prior to returning with
the requeue result.

---

Duplicate comments:
In
`@pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/ocp_route_reconciler.go`:
- Around line 79-84: The reconciler updates an existing Route (existingRoute)
without verifying ownership, so before performing any update/patch (the code
path after r.client.Get where existingRoute is modified) check that the Route is
owned by the InferenceService (isvc) by inspecting existingRoute.OwnerReferences
(or using metav1.IsControlledBy / controllerutil.ReferencesEqual) and matching
the owner UID/Kind/Name to isvc; if the Route is not controlled by isvc, do not
update it — instead return a clear error or requeue with no-op to avoid
overwriting someone else’s resource. Ensure this ownership check occurs
immediately after fetching existingRoute and before any r.client.Update /
r.client.Patch calls.

---

Nitpick comments:
In `@pkg/controller/v1beta1/inferenceservice/platform_permissions_ocp.go`:
- Around line 161-173: The equalCRB function currently compares
rbacv1.ClusterRoleBinding.RoleRef by value and compares a.Subjects to b.Subjects
by index, which is order-sensitive; change equalCRB to compare subjects as an
order-independent set (e.g., build a map keyed by each Subject's canonical
identity such as Kind/Name/Namespace/APIGroup, verify both maps have same size
and same keys) so reorders don't trigger updates; keep the existing RoleRef
comparison but replace the indexed loop over a.Subjects/b.Subjects with the
set/map-based membership checks.
- Around line 67-83: The ClusterRoleBinding created in
platform_permissions_ocp.go (constructed as desired with crbName, Subjects using
saName and isvc.Namespace, RoleRef to system:auth-delegator) is cluster-scoped
and cannot use namespace OwnerReferences, so add a deletion safeguard: implement
a finalizer on the InferenceService (e.g., add/remove a finalizer string in the
Reconcile flow) and update the deletion handling to explicitly delete the
ClusterRoleBinding when the InferenceService is being deleted (use the same
crbName lookup and ensure otherISVCsNeedAuthCRB is consulted before finalizer
removal), or alternatively document the orphan risk; ensure you set/clear the
finalizer in create/update paths and only remove it after successfully deleting
the ClusterRoleBinding.

In `@pkg/controller/v1beta1/inferenceservice/rawkube_platform_ocp_test.go`:
- Around line 407-409: The test currently calls
Expect(actualDeployment.Spec).To(Equal(expectedDeployment.Spec), cmp.Diff(...))
which hides the diff on success; replace this with an assertion that explicitly
checks the cmp.Diff result so failures show a meaningful diff: compute diff :=
cmp.Diff(expectedDeployment.Spec, actualDeployment.Spec, cmpopts.SortMaps(...))
and assert Expect(diff).To(BeEmpty(), "deployment spec mismatch:\n%s", diff) (or
alternatively assert Expect(diff).To(Equal(""), ...)). Update references to
actualDeployment.Spec, expectedDeployment.Spec, cmp.Diff, and cmpopts.SortMaps
accordingly.
- Around line 104-105: The test uses context.TODO() for k8sClient.Create/Delete
but the rest of the test uses ctx := context.Background(); make this consistent
by using the same ctx variable for those calls: either move the ctx :=
context.Background() declaration earlier so k8sClient.Create(context.TODO(),
configMap) and defer k8sClient.Delete(...) become k8sClient.Create(ctx,
configMap) and defer k8sClient.Delete(ctx, configMap), or replace the later ctx
declaration with context.Background() and keep the original calls; update the
k8sClient.Create and defer k8sClient.Delete invocations accordingly and ensure
the unique identifiers k8sClient.Create, k8sClient.Delete and ctx are used
consistently.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro Plus

Run ID: 1f512a26-c5ad-4c82-a070-975ebcfffaab

📥 Commits

Reviewing files that changed from the base of the PR and between 7fff852 and 9e2dc5b.

📒 Files selected for processing (21)
  • Makefile
  • Makefile.overrides.mk
  • config/overlays/odh/rbac/inferenceservice/role.yaml
  • docs/superpowers/plans/2026-04-13-route-crb-distro-hooks-progress.md
  • pkg/controller/v1beta1/inferenceservice/controller.go
  • pkg/controller/v1beta1/inferenceservice/controller_setup_default.go
  • pkg/controller/v1beta1/inferenceservice/controller_setup_ocp.go
  • pkg/controller/v1beta1/inferenceservice/distro/controller_rbac_ocp.go
  • pkg/controller/v1beta1/inferenceservice/platform_permissions_default.go
  • pkg/controller/v1beta1/inferenceservice/platform_permissions_ocp.go
  • pkg/controller/v1beta1/inferenceservice/platform_permissions_ocp_test.go
  • pkg/controller/v1beta1/inferenceservice/rawkube_controller_test.go
  • pkg/controller/v1beta1/inferenceservice/rawkube_platform_default_test.go
  • pkg/controller/v1beta1/inferenceservice/rawkube_platform_ocp_test.go
  • pkg/controller/v1beta1/inferenceservice/reconcilers/factory.go
  • pkg/controller/v1beta1/inferenceservice/reconcilers/factory_default.go
  • pkg/controller/v1beta1/inferenceservice/reconcilers/factory_ocp.go
  • pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/kube_ingress_reconciler.go
  • pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/ocp_route_reconciler.go
  • pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/ocp_route_reconciler_test.go
  • pkg/controller/v1beta1/inferenceservice/utils/utils.go
💤 Files with no reviewable changes (1)
  • pkg/controller/v1beta1/inferenceservice/utils/utils.go
✅ Files skipped from review due to trivial changes (9)
  • pkg/controller/v1beta1/inferenceservice/platform_permissions_default.go
  • pkg/controller/v1beta1/inferenceservice/controller_setup_default.go
  • pkg/controller/v1beta1/inferenceservice/reconcilers/factory_default.go
  • pkg/controller/v1beta1/inferenceservice/controller.go
  • config/overlays/odh/rbac/inferenceservice/role.yaml
  • pkg/controller/v1beta1/inferenceservice/distro/controller_rbac_ocp.go
  • pkg/controller/v1beta1/inferenceservice/platform_permissions_ocp_test.go
  • pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/ocp_route_reconciler_test.go
  • pkg/controller/v1beta1/inferenceservice/rawkube_controller_test.go
🚧 Files skipped from review as they are similar to previous changes (4)
  • Makefile
  • pkg/controller/v1beta1/inferenceservice/reconcilers/factory_ocp.go
  • pkg/controller/v1beta1/inferenceservice/rawkube_platform_default_test.go
  • pkg/controller/v1beta1/inferenceservice/reconcilers/factory.go

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
pkg/controller/v1beta1/inferenceservice/rawkube_platform_default_test.go (1)

56-58: Avoid single-shot spec comparison after fetch; assert spec convergence with polling

Line 108 performs a one-time equality check after only waiting for existence. In async reconciliation flows, this can flake when object creation precedes final spec update. Poll the spec equality itself.

Proposed refactor
 func assertPlatformIngressSpec(ctx context.Context, serviceKey types.NamespacedName, serviceName string, _ *v1beta1.InferenceService) {
 	pathType := netv1.PathTypePrefix
-	actualIngress := &netv1.Ingress{}
-	Eventually(func() error {
-		return k8sClient.Get(ctx, serviceKey, actualIngress)
-	}, timeout, interval).Should(Succeed())
-
 	expectedIngress := netv1.Ingress{
 		Spec: netv1.IngressSpec{
 			Rules: []netv1.IngressRule{
@@
 			},
 		},
 	}
-	Expect(actualIngress.Spec).To(Equal(expectedIngress.Spec))
+	Eventually(func() netv1.IngressSpec {
+		actualIngress := &netv1.Ingress{}
+		Expect(k8sClient.Get(ctx, serviceKey, actualIngress)).To(Succeed())
+		return actualIngress.Spec
+	}, timeout, interval).Should(Equal(expectedIngress.Spec))
 }

As per coding guidelines, "REVIEW PRIORITIES: ... 3. Bug-prone patterns and error handling gaps".

Also applies to: 108-108

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/v1beta1/inferenceservice/rawkube_platform_default_test.go`
around lines 56 - 58, The test currently only polls for the resource to exist
then performs a single-shot spec equality which can flake; change the one-time
equality to poll until the spec converges by wrapping the Get+comparison inside
an Eventually that retries with the same timeout and interval. Specifically,
replace the single equality/assert after the initial Eventually that fetches
actualIngress with an Eventually(func() bool { if err := k8sClient.Get(ctx,
serviceKey, actualIngress); err != nil { return false }; return
reflect.DeepEqual(actualIngress.Spec, expectedIngress.Spec) }, timeout,
interval).Should(BeTrue()) (or the Gomega equivalent you use) so the test keeps
fetching actualIngress and asserts actualIngress.Spec equals
expectedIngress.Spec until it converges. Ensure you reference actualIngress,
serviceKey, k8sClient, timeout and interval in the new poll.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/controller/v1beta1/inferenceservice/rawkube_platform_default_test.go`:
- Around line 40-43: Change the Consistently checks to return errors instead of
bools so unexpected API/client errors surface to Gomega: replace the bool lambda
that calls k8sClient.Get and uses apierr.IsNotFound with a func() error that
returns nil when apierr.IsNotFound(err) (success) and returns the actual err for
any other case, then assert with .Should(Succeed(), "Ingress %s should not
exist", serviceKey.Name); apply the same change to the other Consistently block
that checks Ingress existence so k8sClient.Get, apierr.IsNotFound, and the
Consistently matcher expose real errors rather than collapsing to false.

---

Nitpick comments:
In `@pkg/controller/v1beta1/inferenceservice/rawkube_platform_default_test.go`:
- Around line 56-58: The test currently only polls for the resource to exist
then performs a single-shot spec equality which can flake; change the one-time
equality to poll until the spec converges by wrapping the Get+comparison inside
an Eventually that retries with the same timeout and interval. Specifically,
replace the single equality/assert after the initial Eventually that fetches
actualIngress with an Eventually(func() bool { if err := k8sClient.Get(ctx,
serviceKey, actualIngress); err != nil { return false }; return
reflect.DeepEqual(actualIngress.Spec, expectedIngress.Spec) }, timeout,
interval).Should(BeTrue()) (or the Gomega equivalent you use) so the test keeps
fetching actualIngress and asserts actualIngress.Spec equals
expectedIngress.Spec until it converges. Ensure you reference actualIngress,
serviceKey, k8sClient, timeout and interval in the new poll.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro Plus

Run ID: 8ab0dfa1-475c-4e96-8ff6-2ac5050175ff

📥 Commits

Reviewing files that changed from the base of the PR and between 9e2dc5b and 7331b8b.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum, !**/*.sum
📒 Files selected for processing (5)
  • pkg/controller/v1beta1/inferenceservice/controller_setup_default.go
  • pkg/controller/v1beta1/inferenceservice/platform_permissions_default.go
  • pkg/controller/v1beta1/inferenceservice/rawkube_platform_default_test.go
  • pkg/controller/v1beta1/inferenceservice/rawkube_platform_ocp_test.go
  • pkg/controller/v1beta1/inferenceservice/reconcilers/factory_default.go
✅ Files skipped from review due to trivial changes (4)
  • pkg/controller/v1beta1/inferenceservice/controller_setup_default.go
  • pkg/controller/v1beta1/inferenceservice/platform_permissions_default.go
  • pkg/controller/v1beta1/inferenceservice/reconcilers/factory_default.go
  • pkg/controller/v1beta1/inferenceservice/rawkube_platform_ocp_test.go

Comment on lines +40 to +43
Consistently(func() bool {
err := k8sClient.Get(ctx, serviceKey, &netv1.Ingress{})
return apierr.IsNotFound(err)
}, fastTimeout, interval).Should(BeTrue(), "Ingress %s should not exist", serviceKey.Name)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

find . -type f -name "rawkube_platform_default_test.go" | head -20

Repository: opendatahub-io/kserve

Length of output: 139


🏁 Script executed:

cat -n ./pkg/controller/v1beta1/inferenceservice/rawkube_platform_default_test.go | sed -n '35,55p'

Repository: opendatahub-io/kserve

Length of output: 1124


🏁 Script executed:

cat -n ./pkg/controller/v1beta1/inferenceservice/rawkube_platform_default_test.go | sed -n '1,20p'

Repository: opendatahub-io/kserve

Length of output: 817


🏁 Script executed:

cat -n ./pkg/controller/v1beta1/inferenceservice/rawkube_platform_default_test.go | sed -n '20,32p'

Repository: opendatahub-io/kserve

Length of output: 381


🏁 Script executed:

wc -l ./pkg/controller/v1beta1/inferenceservice/rawkube_platform_default_test.go

Repository: opendatahub-io/kserve

Length of output: 143


🏁 Script executed:

cat -n ./pkg/controller/v1beta1/inferenceservice/rawkube_platform_default_test.go | sed -n '53,90p'

Repository: opendatahub-io/kserve

Length of output: 1569


🏁 Script executed:

cat -n ./pkg/controller/v1beta1/inferenceservice/rawkube_platform_default_test.go

Repository: opendatahub-io/kserve

Length of output: 4400


Return errors from assertion helpers instead of collapsing to boolean

Lines 40–43 and 47–50 mask API failures (client errors, RBAC denials, network issues) as a simple false return. This makes test failures ambiguous and hides real regressions. Keep NotFound as success, but surface unexpected errors directly to the Gomega matcher.

Proposed fix
 func assertPlatformIngressDoesNotExist(ctx context.Context, serviceKey types.NamespacedName) {
-	Consistently(func() bool {
-		err := k8sClient.Get(ctx, serviceKey, &netv1.Ingress{})
-		return apierr.IsNotFound(err)
-	}, fastTimeout, interval).Should(BeTrue(), "Ingress %s should not exist", serviceKey.Name)
+	Consistently(func() error {
+		err := k8sClient.Get(ctx, serviceKey, &netv1.Ingress{})
+		if apierr.IsNotFound(err) {
+			return nil
+		}
+		if err != nil {
+			return err
+		}
+		return fmt.Errorf("ingress %s/%s unexpectedly exists", serviceKey.Namespace, serviceKey.Name)
+	}, fastTimeout, interval).Should(Succeed())
 }
 
 func assertPlatformIngressDeleted(ctx context.Context, serviceKey types.NamespacedName) {
-	Eventually(func() bool {
-		err := k8sClient.Get(ctx, serviceKey, &netv1.Ingress{})
-		return apierr.IsNotFound(err)
-	}, timeout, interval).Should(BeTrue(), "Ingress %s should be deleted", serviceKey.Name)
+	Eventually(func() error {
+		err := k8sClient.Get(ctx, serviceKey, &netv1.Ingress{})
+		if apierr.IsNotFound(err) {
+			return nil
+		}
+		if err != nil {
+			return err
+		}
+		return fmt.Errorf("ingress %s/%s still exists", serviceKey.Namespace, serviceKey.Name)
+	}, timeout, interval).Should(Succeed())
 }

Also applies to lines 47–50.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/v1beta1/inferenceservice/rawkube_platform_default_test.go`
around lines 40 - 43, Change the Consistently checks to return errors instead of
bools so unexpected API/client errors surface to Gomega: replace the bool lambda
that calls k8sClient.Get and uses apierr.IsNotFound with a func() error that
returns nil when apierr.IsNotFound(err) (success) and returns the actual err for
any other case, then assert with .Should(Succeed(), "Ingress %s should not
exist", serviceKey.Name); apply the same change to the other Consistently block
that checks Ingress existence so k8sClient.Get, apierr.IsNotFound, and the
Consistently matcher expose real errors rather than collapsing to false.

@VedantMahabaleshwarkar
Copy link
Copy Markdown
Author

/retest

…ro build tags

Moves OpenShift Route creation and auth-delegation ClusterRoleBinding
management for raw InferenceServices from odh-model-controller into
kserve behind `//go:build distro` tags using three hook pairs
(_default.go / _ocp.go):

1. Factory hook — selects RawOCPRouteReconciler instead of
   RawIngressReconciler in distro builds (when Gateway API is disabled)
2. Controller setup hook — registers Route API scheme and Owns watch
3. Platform permissions hook — manages system:auth-delegator CRBs
   for auth-enabled ISVCs

Reverts shared files (kube_ingress_reconciler.go, utils.go) to match
upstream kserve/master, eliminating ODH-specific modifications that
caused rebase friction.

Adapts rawkube_controller_test.go for the distro pattern by splitting
platform-specific ingress assertions into build-tagged test helper
files. Moves ODH-only auth tests to the distro test file. All shared
tests use higher-level helpers that are no-ops in distro mode for
cluster-local ISVCs (which don't create Routes).

Includes 18 unit tests (10 Route reconciler, 8 CRB) and fixes
pre-existing unparam lint warnings in distro test files.

Ref: RHOAIENG-56270
- Fix annotation reconciliation loop: compare only managed annotations
  in semanticRouteEquals and merge (not replace) annotations on update,
  preventing infinite loop with OpenShift router controller
- Remove unconstrained clusterrolebindings RBAC from base role by fixing
  Makefile controller-gen path to not recurse into distro/ subpackage
- Clear isvc.Status.URL and Address on force-stop to avoid stale data
- Add status update on reconcilePlatformPermissions failure for consistency
- Add warning log when setRouteTargetPort falls back to first port
…otations

- Change route timeout from max(predictor, transformer) to
  sum(predictor, transformer, explainer) to account for full
  request chain latency
- Clear isvc.Status.URL when Route is not yet admitted to
  prevent stale URLs from being visible to clients
- Reconcile managed Route annotations bi-directionally so
  removed annotations (e.g. haproxy timeout) are pruned from
  existing Routes instead of persisting forever
Wire the inferenceservice distro RBAC into overlays/odh so kustomize output includes the missing clusterrolebinding permissions required by the controller; include precommit-generated go.sum updates.

Signed-off-by: Vedant Mahabaleshwarkar <vmahabal@redhat.com>
@VedantMahabaleshwarkar
Copy link
Copy Markdown
Author

/retest

@rhods-ci-bot
Copy link
Copy Markdown

@VedantMahabaleshwarkar: The following test has Failed:

OCI Artifact Browser URL

View in Artifact Browser

Inspecting Test Artifacts Manually

To inspect your test artifacts manually, follow these steps:

  1. Install ORAS (see the ORAS installation guide).
  2. Download artifacts with the following commands:
mkdir -p oras-artifacts
cd oras-artifacts
oras pull quay.io/opendatahub/odh-ci-artifacts:kserve-group-test-78f86

@openshift-ci
Copy link
Copy Markdown

openshift-ci Bot commented Apr 28, 2026

@VedantMahabaleshwarkar: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-llm-inference-service b08f02d link true /test e2e-llm-inference-service

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

Status: New/Backlog

Development

Successfully merging this pull request may close these issues.

4 participants