Commit b779d0e

Merge branch 'bugfix/fix-flaky-on-apiserver-restart' into tmp/octopus/w/131.0/bugfix/fix-flaky-on-apiserver-restart
2 parents d6b6868 + b78356d commit b779d0e

7 files changed: +24 -7 lines changed
.github/actions/sosreport-logs/action.yaml

Lines changed: 2 additions & 2 deletions
@@ -24,11 +24,11 @@ runs:
       run: |
         for host in ${HOSTS_LIST}; do
           ssh -F ssh_config ${host} \
-            "sudo sosreport --all-logs \
+            "sudo sos report --all-logs \
             -o metalk8s -kmetalk8s.k8s-resources -kmetalk8s.pod-logs -k metalk8s.describe -k metalk8s.metrics \
             -o metalk8s_containerd -kmetalk8s_containerd.all -kmetalk8s_containerd.logs \
             --batch --tmp-dir /var/tmp && \
-            sudo chown -R 1000:1000 /var/tmp/sosreport*"
+            sudo chown 1000:1000 /var/tmp/sosreport*"
         done
     - name: collect sosreport on each node
       shell: bash

.github/scripts/stabilize_snapshot.py

Lines changed: 1 addition & 1 deletion
@@ -205,7 +205,7 @@ def check_pods_running():
             "--all",
             "--all-namespaces",
             "--for=condition=Ready",
-            "--timeout=10s",
+            "--timeout=60s",
             "--selector=!job-name",  # We filter out Jobs (they can't be Ready)
             capture_output=True,
             check=True,
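
Review note: the hunk above only raises the wait timeout from 10s to 60s. As a minimal sketch of how such a check might be structured, the snippet below rebuilds the call with the timeout as a parameter. The leading `kubectl wait pods` arguments, the helper name `build_wait_command`, and the injectable `runner` argument are assumptions for illustration; only the flags shown in the diff come from the source.

```python
import subprocess


def build_wait_command(timeout_seconds=60):
    """Build an argv for `kubectl wait` (leading arguments are assumed)."""
    return [
        "kubectl",  # assumption: the real script may invoke kubectl differently
        "wait",
        "pods",
        "--all",
        "--all-namespaces",
        "--for=condition=Ready",
        f"--timeout={timeout_seconds}s",
        "--selector=!job-name",  # Jobs can't be Ready, so they are filtered out
    ]


def check_pods_running(timeout_seconds=60, runner=subprocess.run):
    """Return True if the wait command exits successfully, False otherwise."""
    try:
        runner(
            build_wait_command(timeout_seconds),
            capture_output=True,
            check=True,
        )
    except subprocess.CalledProcessError:
        return False
    return True
```

Keeping the timeout as a parameter makes it possible to tune the value per environment without touching the argument list, and the injectable runner lets the function be exercised without a cluster.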

CHANGELOG.md

Lines changed: 5 additions & 0 deletions
@@ -9,6 +9,11 @@
 - Add 1 second request interval on every salt call using http.wait_for_successful_query
   (PR[#4609](https://github.com/scality/metalk8s/pull/4609))
 
+### Bug fixes
+
+- Make sure the apiserver is running after reconfiguring the pod
+  (PR[#4611](https://github.com/scality/metalk8s/pull/4611))
+
 ## Release 130.0.0
 
 ### Enhancements

docs/operation/cluster_monitoring/prometheus.rst

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@ To generate a snapshot, use the
 
 .. code-block:: console
 
-    root@host # sosreport --batch --build -o metalk8s -kmetalk8s.prometheus-snapshot=True
+    root@host # sos report --batch --build -o metalk8s -kmetalk8s.prometheus-snapshot=True
 
 The name of the generated archive is printed on the console output and
 the Prometheus snapshot can be found under ``prometheus_snapshot`` directory.

docs/operation/sosreport.rst

Lines changed: 2 additions & 2 deletions
@@ -19,7 +19,7 @@ To include logs and configuration for containerd and MetalK8s components, run:
 
 .. code-block:: console
 
-    root@your-machine # sosreport --batch --all-logs \
+    root@your-machine # sos report --batch --all-logs \
       -o metalk8s -kmetalk8s.k8s-resources -kmetalk8s.pod-logs -kmetalk8s.describe -kmetalk8s.metrics \
       -o metalk8s_containerd -kmetalk8s_containerd.all -kmetalk8s_containerd.logs
 
@@ -37,4 +37,4 @@ To display the full list of available plugins and their options, run:
 
 .. code-block:: shell
 
-    sosreport --list-plugins
+    sos report --list-plugins
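
Review note: the documented command follows a regular pattern, where each `-o` enables a plugin and each `-k` sets one `plugin.option`. As a rough sketch of that pattern, the helper below assembles the `sos report` argv from a mapping of plugins to options. The function name `build_sos_report_command` and its parameters are hypothetical and not part of MetalK8s; the flags themselves are the ones shown in the diff.

```python
def build_sos_report_command(plugins, all_logs=True):
    """Assemble a `sos report` argv from a {plugin: [options]} mapping.

    Hypothetical helper for illustration; it only mirrors the flags that
    appear in the documented command (--batch, --all-logs, -o, -k).
    """
    cmd = ["sos", "report", "--batch"]
    if all_logs:
        cmd.append("--all-logs")
    for plugin, options in plugins.items():
        cmd += ["-o", plugin]  # enable the plugin
        for option in options:
            cmd += ["-k", f"{plugin}.{option}"]  # set one plugin option
    return cmd


if __name__ == "__main__":
    argv = build_sos_report_command(
        {
            "metalk8s": ["k8s-resources", "pod-logs", "describe", "metrics"],
            "metalk8s_containerd": ["all", "logs"],
        }
    )
    print(" ".join(argv))
```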

salt/metalk8s/kubernetes/apiserver/installed.sls

Lines changed: 12 additions & 1 deletion
@@ -163,16 +163,27 @@ Restart kubelet to make it pick up the manifest changes:
         timeout: 120
         raise_on_timeout: false
     - require_in:
-      - module: Make sure kube-apiserver container is up and ready
+      - module: Delay after kube-apiserver Pod deployment
 
 {%- endif %}
 
+# This is a bit ugly but currently we have no easy way to find if the
+# pod has been updated already or not, so we need to wait a bit
+Delay after kube-apiserver Pod deployment:
+  module.wait:
+    - test.sleep:
+      - length: 10
+    - watch:
+      - metalk8s: Create kube-apiserver Pod manifest
+
 Make sure kube-apiserver container is up and ready:
   module.run:
     - cri.wait_container:
       - name: kube-apiserver
       - state: running
       - timeout: 120
+    - require:
+      - module: Delay after kube-apiserver Pod deployment
     - onchanges:
       - metalk8s: Create kube-apiserver Pod manifest
   http.wait_for_successful_query:
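
Review note: the ordering introduced above (a fixed 10-second delay when the manifest changes, then a wait of up to 120 seconds for the container to be running) can be sketched in plain Python as follows. This is an illustration of the sequencing only, not the Salt or `cri.wait_container` implementation; the function name `wait_for_container`, the `is_running` callback, and the 1-second polling interval are assumptions.

```python
import time


def wait_for_container(is_running, delay=10, timeout=120, interval=1,
                       sleep=time.sleep, clock=time.monotonic):
    """Sleep for `delay` seconds, then poll `is_running` until it returns
    True or `timeout` seconds have elapsed since polling started.

    Hypothetical sketch: `is_running` stands in for whatever check reports
    that the kube-apiserver container is in the running state.
    """
    # Fixed delay first: gives the old container time to be replaced, so the
    # poll below does not succeed against the not-yet-restarted instance.
    sleep(delay)
    deadline = clock() + timeout
    while True:
        if is_running():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

The fixed delay is a blunt instrument, as the comment in the diff admits: it trades a few seconds on every manifest change for not having to detect whether the Pod has already been updated.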

salt/metalk8s/orchestrate/apiserver.sls

Lines changed: 1 addition & 0 deletions
@@ -38,6 +38,7 @@ Deploy apiserver {{ node }} to {{ dest_version }}:
     - tgt: {{ node }}
     - sls:
       - metalk8s.kubernetes.apiserver
+      - metalk8s.kubernetes.apiserver-proxy
     - saltenv: metalk8s-{{ dest_version }}
     - require:
       - salt: Check pillar on {{ node }}
