more fixes for orchestrator script #81

Merged
subhashkhileri merged 1 commit into redhat-developer:main from rostalan:install-orch-fix on Apr 14, 2026
@rostalan rostalan commented Apr 9, 2026

Contributor
  • replaced check_operator_status with wait_for_operator using OLM label selectors for deterministic operator selection
  • handle 409 race in createNamespaceIfNotExists
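
As a rough illustration of the first bullet, the TypeScript sketch below selects an operator's ClusterServiceVersion by the OLM-managed `operators.coreos.com/<package>.<namespace>` label rather than by display name, then polls until it reports `Succeeded`. The PR's own implementation is not shown in this conversation, so `CsvSummary`, `olmOperatorLabel`, `findCsvByLabel`, `waitForOperator`, and the injected `listCsvs` callback are hypothetical names used only for this example. The sketch also ignores any truncation OLM may apply to long label keys.

```typescript
// Hypothetical, trimmed-down view of a ClusterServiceVersion.
interface CsvSummary {
  name: string;
  labels: Record<string, string>;
  phase: string;
}

// Build the label key OLM attaches to resources belonging to a package.
function olmOperatorLabel(pkg: string, namespace: string): string {
  return `operators.coreos.com/${pkg}.${namespace}`;
}

// Pick the CSV that carries the label key; the label value is not inspected.
function findCsvByLabel(csvs: CsvSummary[], label: string): CsvSummary | undefined {
  return csvs.find((csv) => label in csv.labels);
}

// Poll until the labelled CSV reaches the Succeeded phase or attempts run out.
// `listCsvs` is injected so the sketch does not depend on a real cluster client.
async function waitForOperator(
  listCsvs: () => Promise<CsvSummary[]>,
  pkg: string,
  namespace: string,
  options: { attempts?: number; delayMs?: number } = {},
): Promise<CsvSummary> {
  const attempts = options.attempts ?? 30;
  const delayMs = options.delayMs ?? 1000;
  const label = olmOperatorLabel(pkg, namespace);

  for (let i = 0; i < attempts; i++) {
    const csv = findCsvByLabel(await listCsvs(), label);
    if (csv !== undefined && csv.phase === "Succeeded") {
      return csv;
    }
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
  }
  throw new Error(`Operator ${pkg} in ${namespace} not ready after ${attempts} attempts`);
}
```

Because the label key depends only on the package and namespace names, the lookup does not change when `spec.displayName` differs between channels or versions.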

@rostalan rostalan force-pushed the install-orch-fix branch 3 times, most recently from a6c7ea9 to 1b762a2 on April 9, 2026 14:30
Comment thread src/utils/kubernetes-client.ts Outdated
      createError.message.includes("409")
    ) {
      console.log(`✓ Namespace ${namespace} already exists`);
      return await this._k8sApi.readNamespace({ name: namespace });

Member

The 409 handling itself looks fine as a defensive measure, but I'm not sure the race condition scenario described actually happens in practice. createNamespaceIfNotExists already does a read-before-create, and the callers within this package don't seem to race: configure() and deploy() run sequentially. Different Playwright workers get different project names, so they'd be creating different namespaces too.

Can you share what failure you actually hit that led to this? If it was a flaky CI run, it might have been something else (retry after timeout, namespace stuck in Terminating, etc.) rather than a true parallel create race.

Contributor Author

The original logs expired, but it was a real CI failure on PR #2052 (2026-04-08T12:34:12Z). The orchestrator workspace has one project with two spec files, but Playwright runs them on 2 separate workers. Both workers initialize concurrently and each calls createNamespaceIfNotExists("orchestrator"); they race, and the slower worker fails with a 409:

Running 52 tests using 2 workers
✓ Created namespace orchestrator
✗ Failed to create namespace orchestrator: HTTP-Code: 409
Error: HTTP-Code: 409
Body: {"status":"Failure","message":"namespaces \"orchestrator\" already exists",
       "reason":"AlreadyExists","code":409}
    at CoreV1ApiResponseProcessor.createNamespaceWithHttpInfo
    at KubernetesClientHelper.createNamespaceIfNotExists
    at RHDHDeployment.configure

Will try to find the full log in cursor convo...
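
A minimal sketch of the race described above, assuming both workers pass the read-before-create check before either one has created the namespace. `FakeNamespaceApi`, `tick`, `runTwoWorkers`, and the `tolerateConflict` flag are hypothetical stand-ins for illustration; this is not the code from `kubernetes-client.ts`, and the real client's error shape may differ from the plain `Error` messages used here.

```typescript
// Yield to the event loop so two callers can interleave, like two workers would.
const tick = (): Promise<void> => new Promise((resolve) => setTimeout(resolve, 0));

// In-memory stand-in for the namespaces API (hypothetical, for illustration only).
class FakeNamespaceApi {
  private readonly names = new Set<string>();

  async readNamespace(name: string): Promise<{ name: string }> {
    await tick();
    if (!this.names.has(name)) throw new Error("HTTP-Code: 404");
    return { name };
  }

  async createNamespace(name: string): Promise<{ name: string }> {
    await tick();
    if (this.names.has(name)) throw new Error("HTTP-Code: 409");
    this.names.add(name);
    return { name };
  }
}

async function createNamespaceIfNotExists(
  api: FakeNamespaceApi,
  name: string,
  tolerateConflict: boolean,
): Promise<string> {
  try {
    await api.readNamespace(name);
    return "existed";
  } catch (readError) {
    // Not found: fall through and try to create it.
  }
  try {
    await api.createNamespace(name);
    return "created";
  } catch (createError) {
    // Someone else created it between our read and our create.
    if (tolerateConflict && createError instanceof Error && createError.message.includes("409")) {
      await api.readNamespace(name);
      return "existed";
    }
    throw createError;
  }
}

// Two "workers" start at the same time against the same namespace.
async function runTwoWorkers(tolerateConflict: boolean): Promise<string[]> {
  const api = new FakeNamespaceApi();
  const worker = (): Promise<string> =>
    createNamespaceIfNotExists(api, "orchestrator", tolerateConflict).catch(
      (e: Error) => `failed: ${e.message}`,
    );
  return Promise.all([worker(), worker()]);
}
```

In this simulation, `runTwoWorkers(false)` leaves the second worker failing with the 409, while `runTwoWorkers(true)` lets it fall back to reading the namespace the first worker created.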

@rostalan rostalan Apr 10, 2026

Contributor Author

Relevant section:

Global setup completed successfully
Running 52 tests using 2 workers
✓ Created namespace orchestrator
✗ Failed to create namespace orchestrator: HTTP-Code: 409
Message: Unknown API Status Code!
Body: "{\"kind\":\"Status\",\"apiVersion\":\"v1\",\"metadata\":{},\"status\":\"Failure\",\"message\":\"namespaces \\\"orchestrator\\\" already exists\",\"reason\":\"AlreadyExists\",\"details\":{\"name\":\"orchestrator\",\"kind\":\"namespaces\"},\"code\":409}\n"
Headers: {"audit-id":"4844d4df-0a1d-41cd-9240-872f6fac29f1","cache-control":"no-cache, private","connection":"close","content-length":"214","content-type":"application/json","date":"Wed, 08 Apr 2026 13:05:50 GMT","strict-transport-security":"max-age=31536000; includeSubDomains; preload","x-kubernetes-pf-flowschema-uid":"c2e2b597-a15c-475b-b2e8-882bbf41352d","x-kubernetes-pf-prioritylevel-uid":"1edd6b33-5208-4fd1-8533-25509e5145b4"}

@subhashkhileri subhashkhileri Apr 10, 2026

Member

> The orchestrator workspace has one project with two spec files

That's the issue. It's expected that if you use the rhdh fixture, a project should have a single spec file.

Member

Contributor Author

As I understand it, this would mean importing the two current specs into a parent one. Would they then run in parallel? That would not work, as the RBAC roles and permissions would likely conflict. We would need to disable parallel execution on the project.

Member

No, in the end it would behave the same as having everything in a single spec.

Contributor Author

ok, trying it now.

Member

Is this PR ready to merge?

Contributor Author

Yes, the other fixes are still required, AFAIK.

- Replace check_operator_status with OLM label-based wait_for_operator:
  spec.displayName varies across channels/versions, causing empty CSV
  matches and operator timeouts. Uses deterministic
  operators.coreos.com/<package>.<namespace> label selectors instead.
- Add startingCSV pinning to logic-operator.v1.37.2: Ensures the
  subscription installs the exact OSL version instead of whatever
  stable channel resolves to.
- Add prepack lifecycle script: Ensures dist/ is built when the package
  is installed as a git dependency.
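
For the startingCSV item above, a Subscription manifest with a pinned CSV could look roughly like the object built below. Only `logic-operator.v1.37.2` and the `stable` channel come from the commit message; the helper name `pinnedSubscription`, the package name `logic-operator`, the namespace, and the catalog source values are assumptions for illustration.

```typescript
// Hypothetical input for building an OLM Subscription manifest.
interface SubscriptionParams {
  pkg: string;
  namespace: string;
  channel: string;
  source: string;
  sourceNamespace: string;
  startingCSV: string;
}

// Build a Subscription object that asks OLM to start from a specific CSV
// instead of whatever the channel head currently resolves to.
function pinnedSubscription(p: SubscriptionParams) {
  return {
    apiVersion: "operators.coreos.com/v1alpha1",
    kind: "Subscription",
    metadata: { name: p.pkg, namespace: p.namespace },
    spec: {
      name: p.pkg,
      channel: p.channel,
      source: p.source,
      sourceNamespace: p.sourceNamespace,
      startingCSV: p.startingCSV,
    },
  };
}

const subscription = pinnedSubscription({
  pkg: "logic-operator",                    // assumed package name
  namespace: "openshift-operators",         // assumed install namespace
  channel: "stable",
  source: "redhat-operators",               // assumed catalog source
  sourceNamespace: "openshift-marketplace", // assumed catalog namespace
  startingCSV: "logic-operator.v1.37.2",
});
```

Note that `startingCSV` only controls the first version installed; with automatic install plan approval, OLM may still upgrade along the channel afterwards.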

Made-with: Cursor
@subhashkhileri subhashkhileri merged commit 6da74af into redhat-developer:main Apr 14, 2026
3 checks passed
@rostalan rostalan deleted the install-orch-fix branch April 14, 2026 11:37