SM-8.10 nightly (qadocupg upgrade-minor): demo user denied on orchestration cluster after upgrade — invalid token issuer 500 + Tasklist 'no access' #6351

@esraagamal6

Summary

SM-8.10 nightly Document Store + upgrade-minor scenario (qadocupg) fails because the default demo user is denied authorization across all orchestration-cluster components after the 8.9→8.10 minor upgrade. This is a deploy/product-layer OIDC issuer + authorization breakage, not an E2E test bug — no test-side change can fix it without masking the failure.

Originating nightly run: https://github.com/camunda/c8-cross-component-e2e-tests/actions/runs/27117857288

Failing tests (both in tests/SM-8.10/document-handling-user-flows.spec.ts)

  1. Document Handling HTO User Flow - AWS — at /orchestration/tasklist, Tasklist renders "You don't have access to this component", so taskPanelPage.openTask(...) times out after 10 reload attempts.
  2. Document Handling Connectors User Flow - AWS @tasklistV2: /orchestration/admin redirects to /admin/forbidden and the backend returns:
    {"type":"about:blank","title":"Internal Server Error","status":500,
     "detail":"[invalid_grant] Invalid token issuer. Expected 'http://keycloak.qa-id-qadocupg-46ee-8-9-qadocupg.svc.cluster.local/auth/realms/camunda-platform'",
     "instance":"/orchestration/admin"}
    The Roles tab never renders even with the existing 300s re-navigation retry (added in c8-cross-component-e2e-tests commit ca5c83a2).
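
As a diagnostic aid only (not a fix, and not something the test repo currently does as far as this report shows), the expected issuer can be pulled out of the 500 body above so a failing run names the misconfiguration directly. This is a minimal sketch: the function name `expectedIssuerFromProblem` and the `ProblemDetail` interface are hypothetical, and the regular expression assumes the `detail` wording stays exactly as quoted in the response above.

```typescript
// Shape of the problem-details body quoted in the failing response.
interface ProblemDetail {
  type: string;
  title: string;
  status: number;
  detail: string;
  instance: string;
}

// Returns the issuer the backend says it expected, or null when the body
// is not the "Invalid token issuer" 500 described in this issue.
function expectedIssuerFromProblem(body: string): string | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(body);
  } catch {
    return null; // not JSON, e.g. an HTML error page
  }
  if (typeof parsed !== "object" || parsed === null) return null;

  const { status, detail } = parsed as Partial<ProblemDetail>;
  if (status !== 500 || typeof detail !== "string") return null;

  // Assumption: the message format matches the one observed in the nightly trace.
  const match = /Invalid token issuer\. Expected '([^']+)'/.exec(detail);
  return match?.[1] ?? null;
}
```

For the response quoted above, this would return the internal Keycloak realm URL, which can then be logged next to the URL the browser actually used to log in.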

Root cause (from nightly traces)

The orchestration cluster's resource server expects the token issuer to be the internal keycloak service URL (http://keycloak.<ns>.svc.cluster.local/auth/realms/camunda-platform), but browser-flow tokens minted via the external ingress carry a different iss. The mismatch yields a hard invalid_grant 500 on every orchestration-cluster request, which surfaces as "no access" (Tasklist) and /admin/forbidden (Admin).
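
The mismatch described above can be confirmed by reading the `iss` claim from a browser-flow access token and comparing it with the issuer the resource server expects. The sketch below is illustrative only: `issuerOfToken` and `issuerMatches` are hypothetical helper names, the token is decoded without signature verification (suitable for diagnosis, not for authorization decisions), and it assumes a Node.js runtime where `Buffer` supports the `base64url` encoding.

```typescript
// Reads the `iss` claim from a JWT without verifying its signature.
// Returns null when the token is malformed or carries no string `iss`.
function issuerOfToken(jwt: string): string | null {
  const parts = jwt.split(".");
  const payloadSegment = parts[1];
  if (parts.length !== 3 || payloadSegment === undefined) return null;
  try {
    const json = Buffer.from(payloadSegment, "base64url").toString("utf8");
    const payload: unknown = JSON.parse(json);
    if (typeof payload !== "object" || payload === null) return null;
    const iss = (payload as { iss?: unknown }).iss;
    return typeof iss === "string" ? iss : null;
  } catch {
    return null; // payload segment was not valid JSON
  }
}

// True only when the token's issuer is exactly the expected issuer string.
// Issuer comparison is an exact string match, so scheme, host, and path
// all have to agree.
function issuerMatches(jwt: string, expectedIssuer: string): boolean {
  return issuerOfToken(jwt) === expectedIssuer;
}
```

In the failing scenario, a token minted through the external ingress would be expected to make `issuerMatches` return `false` against the internal `http://keycloak.<ns>.svc.cluster.local/auth/realms/camunda-platform` issuer, which is consistent with the `invalid_grant` error in the trace.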

Trace evidence:

  • 10 fresh Keycloak logins over the full 300s retry window all land on /admin/forbidden (96 redirects observed) — confirming this is a persistent config state, not a propagation delay.
  • Web Modeler (separate auth path) works; all orchestration-cluster components (Tasklist, Operate, Admin) are denied.
  • The demo user is the default admin, so the E2E test setup is correct and requires no special role assignment.

Why this is not fixable in the E2E test repo

  • Post-upgrade role re-assignment (the usual Step 3a remediation) requires Admin access, which is itself blocked by the same 500.
  • Re-login / cookie-clear cannot change a server-side token-issuer expectation (10 fresh logins already failed).
  • demo is the highest-privileged default user; there is no alternative credential.

Likely fix area: orchestration-cluster OIDC issuer configuration for the qadocupg (Document Store) upgrade-minor flow in the Helm deploy — ensure the expected issuerBackendUrl / keycloak frontend URL matches the issuer of browser-flow tokens after a minor upgrade.

Scope

tests/SM-8.10/document-handling-user-flows.spec.ts is gated on IS_DS=true; only this qadocupg upgrade-minor scenario is affected. The SM-8.9 sibling spec does not carry the admin-retry workaround, indicating this regression is 8.10-upgrade specific.

Metadata

    Labels

      • kind/bug: Something isn't working as intended
      • likelihood/mid: Observed occasionally
      • severity/high: Marks a bug as having a noticeable impact on the user with no known workaround
      • triage:completed

    Urgency

    next