[BUG] helm uninstall/install without deleting PVCs fails with "join validation on cluster state" error #965

@maxlepikhin

What is the bug?

Recovery after helm uninstall followed by helm install, without deleting the OpenSearch PVCs, does not work. With or without manager.parallelRecoveryEnabled = false, the nodes keep logging the error below and never recover:

Caused by: org.opensearch.transport.RemoteTransportException: [opensearch-cluster-nodes-2][10.244.2.152:9300][internal:cluster/coordination/join/validate_compressed]
Caused by: org.opensearch.cluster.coordination.CoordinationStateRejectedException: join validation on cluster state with a different cluster uuid prI8Q7O2R4arkddL88Pyig than local cluster uuid JQ7q5TNkQSWPyeWbXw4ZOQ, rejecting
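
The rejection message carries two cluster UUIDs: the one in the cluster state being validated and the one the local node already holds. A minimal sketch for pulling both out of a log line, for example to compare them across nodes, could look like the following. The function name, the regular expression, and the "incoming"/"local" labels are illustrative assumptions based only on the wording of the message above; they are not part of OpenSearch or the operator.

```python
import re

# Matches the wording of the rejection message quoted above.
# The pattern is an assumption derived from that single log line.
UUID_MISMATCH = re.compile(
    r"different cluster uuid (?P<incoming>\S+) "
    r"than local cluster uuid (?P<local>[^\s,]+)"
)


def parse_uuid_mismatch(line):
    """Return (incoming_uuid, local_uuid) if the line reports a
    cluster UUID mismatch, otherwise None."""
    match = UUID_MISMATCH.search(line)
    if match is None:
        return None
    return match.group("incoming"), match.group("local")


log_line = (
    "Caused by: org.opensearch.cluster.coordination."
    "CoordinationStateRejectedException: join validation on cluster state "
    "with a different cluster uuid prI8Q7O2R4arkddL88Pyig than local "
    "cluster uuid JQ7q5TNkQSWPyeWbXw4ZOQ, rejecting"
)

result = parse_uuid_mismatch(log_line)
if result is not None:
    incoming, local = result
    print(f"incoming={incoming} local={local}")
```

Running this over the logs of each pod would show whether every node reports the same pair of UUIDs or whether the nodes disagree with each other.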

My guess is that there is a mismatch between when the security config updater creates or updates the security config and when the nodes pick it up from storage.

How can one reproduce the bug?

  1. Install an OpenSearchCluster with 3 replicas and wait for the cluster to reach green status.
  2. Run helm uninstall. Keep the PVCs; do not delete them.
  3. Run helm install and observe the error.

What is the expected behavior?

The cluster recovers successfully from the existing PVCs.

What is your host/environment?

Ubuntu 24.04

Do you have any screenshots?

N/A

Do you have any additional context?

N/A

Metadata

Assignees: No one assigned
Labels: bug (Something isn't working)
Status: New