Skip to content

Conversation

@Michal-Leszczynski
Copy link
Collaborator

Cluster.KnownHosts are not part of REST API definitions, so we don't set them when running parseCluster.
It resulted in passing cluster with uninitialized
known hosts to ClusterService.PutCluster and in some cases to overwriting previously correctly discovered known hosts. This in turn could lead to connectivity issues as described in #4733.

To fix that, this commit fills known hosts of the cluster parsed from REST API before passing it to ClusterService.PutCluster. The only other place where parseCluster is used is on cluster creation, but we don't have any discovered known hosts at this point, so there is no need to fill them.

In terms of testing, this commit extended test checking connectivity check on cluster update. The problem is that since the fix targeted just the API handler, it's difficult to test it reliably without mocks. In order to bypass, additional safety net which fills missing known hosts in ClusterService.PutCluster was added. With that, we can validate that passing nil Cluster.KnowHosts to
ClusterService.PutCluster results in not losing the already discovered known hosts. So even if the initial API handler fix does not work, this safety net should correct it.

Fixes #4733

Cluster.KnownHosts are not part of REST API definitions,
so we don't set them when running parseCluster.
It resulted in passing cluster with uninitialized
known hosts to ClusterService.PutCluster and in some
cases to overwriting previously correctly discovered known
hosts. This in turn could lead to connectivity issues
as described in #4733.

To fix that, this commit fills known hosts of the cluster
parsed from REST API before passing it to ClusterService.PutCluster.
The only other place where parseCluster is used is on cluster
creation, but we don't have any discovered known hosts at this point,
so there is no need to fill them.

In terms of testing, this commit extended test checking
connectivity check on cluster update. The problem is that
since the fix targeted just the API handler, it's difficult
to test it reliably without mocks. In order to bypass,
additional safety net which fills missing known hosts in
ClusterService.PutCluster was added. With that, we can
validate that passing nil Cluster.KnowHosts to
ClusterService.PutCluster results in not losing the already
discovered known hosts. So even if the initial API handler
fix does not work, this safety net should correct it.

Fixes #4733
@Michal-Leszczynski Michal-Leszczynski force-pushed the ml/4733-dont-nuke-known-hosts branch from 6c7507c to e5f1ea2 Compare January 15, 2026 15:30
@Michal-Leszczynski Michal-Leszczynski marked this pull request as ready for review January 15, 2026 16:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SM clears cluser known hosts on cluster update

2 participants