Skip to content

fix(cypress): wait for RBAC propagation before asserting project access#7381

Open
kanishka-commits wants to merge 2 commits intoopendatahub-io:mainfrom
kanishka-commits:fix/rbac-prop
Open

fix(cypress): wait for RBAC propagation before asserting project access#7381
kanishka-commits wants to merge 2 commits intoopendatahub-io:mainfrom
kanishka-commits:fix/rbac-prop

Conversation

@kanishka-commits
Copy link
Copy Markdown
Contributor

@kanishka-commits kanishka-commits commented Apr 24, 2026

Description

TrainJob and RayJob access permission tests fail intermittently on first attempt because OpenShift RBAC cache hasn't propagated the new role binding yet. Added waitForUserProjectAccess util that polls until the permission is effective before re-logging in.

How Has This Been Tested?

Both specs pass with CY_RETRY=0 on a live RHOAI cluster. Lint and type-check pass.

Screenshot 2026-04-23 at 6 44 02 PM Screenshot 2026-04-23 at 6 47 06 PM

Test Impact

Fixes flakiness in existing e2e tests. No new tests needed.

Request review criteria:

Self checklist (all need to be checked):

  • The developer has manually tested the changes and verified that the changes work
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has added tests or explained why testing cannot be added (unit or cypress tests for related changes)
  • The code follows our Best Practices (React coding standards, PatternFly usage, performance considerations)

If you have UI changes:

  • Included any necessary screenshots or gifs if it was a UI change.
  • Included tags to the UX team if it was a UI/UX change.

After the PR is posted & before it merges:

  • The developer has tested their solution on a cluster by using the image produced by the PR to main

Summary by CodeRabbit

  • Tests

    • Improved reliability of project access permission tests by waiting for RBAC changes to propagate before re-login and assertions.
    • Added a utility to verify when a user can see a project, improving synchronization in permission-related flows.
  • Chores

    • Adjusted deletion behavior for certain test-managed resources so cleanup proceeds without waiting for termination.

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Apr 24, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign emilys314 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 24, 2026

📝 Walkthrough

Walkthrough

Four Cypress files changed. Two E2E test specs now wait for RBAC propagation by invoking a new helper before re-login and UI assertions. A new exported utility waitForUserProjectAccess(project, user, attempts?, interval?) was added; it polls oc get project <project> --as=<user> until success with defaults of 15 attempts and 2000ms interval. One utility for deleting Kueue resources was modified to add --wait=false to certain oc delete invocations.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Actionable Issues

  • Security — Impersonation / privilege escalation risk (CWE-264 / CWE-269): waitForUserProjectAccess depends on the test runner having impersonation rights (--as=<user>). Ensure the runner account only has scoped impersonation rights for the specific test identities and projects; remove any cluster-wide impersonation grants. If impersonation cannot be tightly scoped, avoid --as and use a dedicated short-lived test identity instead.

  • Security — Command injection via unsanitized CLI args (CWE-78): project and user are interpolated into shell commands. Validate and strictly whitelist allowed project and username patterns or use structured APIs (k8s client libraries) to avoid shell interpolation. Escape or pass arguments as parameters to the exec API rather than concatenating into a shell string.

  • Reliability — Ambiguous failure causes / poor error differentiation: Polling treats all failures identically, hiding whether the user/project is nonexistent vs. RBAC propagation delay. Make polling errors explicit: detect 404/nonexistent responses vs. permission-denied vs. transient network errors and surface distinct messages to aid debugging.

  • Robustness — Hardcoded retry defaults and environment variance: 15×2s (30s) may be insufficient on loaded CI. Make attempts/interval configurable via environment variables or test config; consider exponential backoff and a higher default ceiling for CI pipelines.

  • Operational — Use of --wait=false on resource deletes: Changing deletes to not wait may leave resources terminating asynchronously and cause race conditions in subsequent tests. Ensure callers account for eventual consistency (e.g., poll for absence when needed) or only use --wait=false where immediate continuation is safe.

  • Code quality — Lack of input validation and minimal logging: Add input validation, richer logging on each poll attempt (timestamp, attempt number, raw output), and surface the last exec output on final failure to accelerate triage.

CVE references are not directly applicable to these changes; referenced CWEs above indicate classes of issues to remediate.

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed Title clearly summarizes the main change: adding RBAC propagation wait logic before project access assertions in Cypress tests.
Description check ✅ Passed Description covers the problem (intermittent test failures due to RBAC cache propagation), solution (waitForUserProjectAccess utility), testing evidence (screenshots, CY_RETRY=0 success), and completes the provided template checklist.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
packages/cypress/cypress/utils/oc_commands/project.ts (1)

111-124: Unquoted interpolation into shell command — harden against CWE-78.

project and user are spliced straight into the cy.exec string. Today the callers pass UUID-generated project names and env-sourced usernames, so it's not exploitable, but nothing on the function signature prevents a future caller from passing a value containing shell metacharacters (;, `, $(...), spaces). Since this helper sits alongside other project helpers that will likely be reused, a bit of defensive hardening is cheap.

🛡️ Suggested defensive hardening
 export const waitForUserProjectAccess = (
   project: string,
   user: string,
   attempts = 15,
   interval = 2000,
-): Cypress.Chainable<Cypress.Exec> =>
-  pollUntilSuccess(
-    `oc get project ${project} --as=${user} -o name`,
-    `${user} access to ${project}`,
-    {
-      maxAttempts: attempts,
-      pollIntervalMs: interval,
-    },
-  );
+): Cypress.Chainable<Cypress.Exec> => {
+  const safe = /^[A-Za-z0-9._@-]+$/;
+  if (!safe.test(project) || !safe.test(user)) {
+    throw new Error(`Invalid project or user identifier passed to waitForUserProjectAccess`);
+  }
+  return pollUntilSuccess(
+    `oc get project '${project}' --as='${user}' -o name`,
+    `${user} access to ${project}`,
+    { maxAttempts: attempts, pollIntervalMs: interval },
+  );
+};
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/cypress/cypress/utils/oc_commands/project.ts` around lines 111 -
124, The command in waitForUserProjectAccess injects unescaped project and user
into a shell string (CWE-78); fix by escaping shell metacharacters and wrapping
each interpolation in single quotes before passing to pollUntilSuccess (e.g.,
implement a small helper like escapeShellArg that safely escapes any single
quotes in its input and returns the value wrapped in single quotes), then call
pollUntilSuccess with the sanitized `'${project}'` and `'${user}'` values in the
`oc get project ... --as=...` string so no raw user-controlled characters reach
the shell.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@packages/cypress/cypress/utils/oc_commands/project.ts`:
- Around line 111-124: The command in waitForUserProjectAccess injects unescaped
project and user into a shell string (CWE-78); fix by escaping shell
metacharacters and wrapping each interpolation in single quotes before passing
to pollUntilSuccess (e.g., implement a small helper like escapeShellArg that
safely escapes any single quotes in its input and returns the value wrapped in
single quotes), then call pollUntilSuccess with the sanitized `'${project}'` and
`'${user}'` values in the `oc get project ... --as=...` string so no raw
user-controlled characters reach the shell.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited), Organization UI (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 1df5c5b4-9279-468d-9a2f-5e7c84ddf953

📥 Commits

Reviewing files that changed from the base of the PR and between e66d0a2 and b6586ad.

📒 Files selected for processing (3)
  • packages/cypress/cypress/tests/e2e/modelTraining/rayJobs/testRayJobProjectAccessPermissions.cy.ts
  • packages/cypress/cypress/tests/e2e/modelTraining/trainJobs/testProjectAccessPermissions.cy.ts
  • packages/cypress/cypress/utils/oc_commands/project.ts

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 24, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 63.99%. Comparing base (427b9b4) to head (c8e870c).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #7381      +/-   ##
==========================================
- Coverage   65.04%   63.99%   -1.06%     
==========================================
  Files        2458     2519      +61     
  Lines       76443    78049    +1606     
  Branches    19289    19853     +564     
==========================================
+ Hits        49726    49948     +222     
- Misses      26717    28101    +1384     

see 100 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 427b9b4...c8e870c. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/cypress/cypress/utils/oc_commands/distributedWorkloads.ts`:
- Line 71: The command construction used by cy.exec is vulnerable to shell
injection because localQueueName, clusterQueueName, resourceFlavor, and
projectName are interpolated directly; update the code that builds commands (the
cy.exec calls constructing strings like `oc delete LocalQueue ${localQueueName}
-n ${projectName} --wait=false${ignoreFlag}`) to either validate each identifier
against an RFC1123 label regex (e.g., /^[a-z0-9-]+$/) and reject or throw on
mismatch, or stop using direct string interpolation and instead delete by safe
means (use oc delete with --selector or pass a JSON/YAML manifest via stdin or a
file) so untrusted chars never reach a shell; ensure the validation/alternative
logic is applied consistently to localQueueName, clusterQueueName,
resourceFlavor, and projectName before executing cy.exec.
- Around line 70-74: The deleteKueueResources helper builds ocCommand with
--wait=false for all resources, which causes flakiness for cluster-scoped
ClusterQueue and ResourceFlavor when static names are reused; update the
ocCommand construction in distributedWorkloads.ts (the deleteKueueResources call
site building ocCommand) so that ClusterQueue and ResourceFlavor use the default
wait behavior (remove the --wait=false flag or explicitly use --wait=true) or
implement a short poll after deletion to confirm removal before returning;
alternatively, if you prefer fire-and-forget, make callers that use static names
(e.g., testWorkloadMetricsDefaultPageContents.cy.ts using
testData.clusterQueue/testData.resourceFlavour) append a UUID suffix to those
names to guarantee uniqueness.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited), Organization UI (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: ee723374-f04e-4f98-b172-43322d530ad7

📥 Commits

Reviewing files that changed from the base of the PR and between b6586ad and c8e870c.

📒 Files selected for processing (1)
  • packages/cypress/cypress/utils/oc_commands/distributedWorkloads.ts

Comment thread packages/cypress/cypress/utils/oc_commands/distributedWorkloads.ts
Comment thread packages/cypress/cypress/utils/oc_commands/distributedWorkloads.ts
@PR3MM
Copy link
Copy Markdown
Contributor

PR3MM commented Apr 24, 2026

As you have updated the PR can you also update the description?

@PR3MM
Copy link
Copy Markdown
Contributor

PR3MM commented Apr 24, 2026

Ran trainJobs/testProjectAccessPermissions.cy.ts and rayJobs/testRayJobProjectAccessPermissions.cy.ts on RHOAI E2E cluster, all tests green

image

@ConorOM1
Copy link
Copy Markdown
Contributor

@PR3MM please share id of the jenkins job. TY !

@PR3MM
Copy link
Copy Markdown
Contributor

PR3MM commented Apr 24, 2026

I tested it locally @ConorOM1 , @kanishka-commits can you please run on jenkins

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants