Replies: 1 comment
See https://etcd.io/blog/2025/zombie_members_upgrade/ which supersedes the page you linked. You need to be on a newer version of etcd 3.5 before upgrading to 3.6. That said, I'm not sure what you mean by losing traffic or failing tests. Neither of those sounds like it's related to etcd.
-
Hi all,
Recently we started upgrading our internal test system from 1.33.2 to 1.33.10.
We selected 1.33.10 because of the Zombie Cluster Members issue.
After a week we have identified two problems:
The extensive automated system failures are a consequence of our use case.
We install small 3-node k8s clusters at our clients in air-gapped environments.
We mainly need to test that our application auto-recovers in all scenarios.
In production at our clients, we never scale the system up or down.
We get 3 VMs and these are static; this made me wonder whether the Zombie Cluster Members bug is really relevant for us.
Before the final blog was posted there was also:
https://etcd.io/blog/2025/upgrade_from_3.5_to_3.6_issue_followup/
rke2 1.33.2 ships etcd v3.5.21-k3s1, which according to the above post is good enough to go to etcd 3.6.
Our test systems are subjected to a lot of failover testing, around 100 times a week.
Maybe there is some old bad data that is causing these problems?
We did notice that when we run a lot of these failover tests, quite a few old kube-system container logs are left in /var/lib.
Did anybody face similar issues?
Best Uros