Commit 2392326

ci(openshift): pin router-default to control plane during E2E (cloudnative-pg#10734)
The drain_node E2E cordons one of three labelled worker nodes and drains another, leaving the router-default Deployment's PDB unable to evict its replicas onto the single surviving worker; the drain hangs and the test fails intermittently on OpenShift.

Patch IngressController.spec.nodePlacement before the suite runs so router-default lives on the control plane for the cluster's lifetime, keeping all workers free for cordon and drain. The placement is non-production, but the OCP cluster is destroyed at end-of-job. Waits use observedGeneration plus rollout status, no fixed sleep.

Verified on an OCP 4.21 run of the equivalent change in the downstream pg4k repo: the drain_node "3 pods on 2 nodes" spec (which had been failing on the OCP nightly) is now green.

Signed-off-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
(cherry picked from commit d0e5078)
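The message notes that the placement is acceptable only because the cluster is destroyed at end-of-job. For anyone reusing the patch on a cluster that outlives the job, the following is a minimal sketch of how the placement might be undone. It is not part of this commit: `build_reset_patch` and `reset_router_placement` are hypothetical names, and the sketch assumes the default IngressController had no `nodePlacement` set before the patch was applied.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of the commit.
# Assumption: spec.nodePlacement was unset before the E2E patch was applied.

# Prints a JSON merge patch. In a merge patch (RFC 7386) a null value
# removes the key, so this drops spec.nodePlacement entirely.
build_reset_patch() {
  printf '%s' '{"spec":{"nodePlacement":null}}'
}

# Applies the reset patch and waits for the router Deployment to roll out.
reset_router_placement() {
  oc patch ingresscontroller.operator.openshift.io/default \
    -n openshift-ingress-operator --type merge -p "$(build_reset_patch)"
  oc rollout status -n openshift-ingress deployment/router-default --timeout=5m
}

# Usage (requires a logged-in oc session):
#   reset_router_placement
```

If the IngressController did carry a custom placement beforehand, the original value would need to be saved before patching and restored instead.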
1 parent 999926e commit 2392326

1 file changed

Lines changed: 34 additions & 0 deletions

hack/e2e/run-e2e-ocp.sh

@@ -171,5 +171,39 @@ while true; do
   break
 done
 
+# Move OCP ingress routers off worker nodes for the duration of this test
+# job, so the drain_node e2e does not strand router-default's PDB. The
+# placement is non-production but the OCP cluster is destroyed at end-of-job.
+# OCP through 4.21 still applies the legacy control-plane taint; the key
+# is kept in a shell variable so the JSON literal does not contain a woke
+# trigger word.
+echo "Pinning router-default to control plane nodes"
+LEGACY_TAINT_KEY="node-role.kubernetes.io/master" # wokeignore:rule=master
+CTRL_PLANE_KEY="node-role.kubernetes.io/control-plane"
+PATCH=$(cat <<EOF
+{
+  "spec": {
+    "nodePlacement": {
+      "nodeSelector": { "matchLabels": { "${CTRL_PLANE_KEY}": "" } },
+      "tolerations": [
+        { "key": "${LEGACY_TAINT_KEY}", "operator": "Exists", "effect": "NoSchedule" },
+        { "key": "${CTRL_PLANE_KEY}", "operator": "Exists", "effect": "NoSchedule" }
+      ]
+    }
+  }
+}
+EOF
+)
+NEW_GEN=$(oc patch ingresscontroller.operator.openshift.io/default \
+  -n openshift-ingress-operator --type merge -p "$PATCH" \
+  -o jsonpath='{.metadata.generation}')
+oc wait ingresscontroller.operator.openshift.io/default \
+  -n openshift-ingress-operator \
+  --for=jsonpath="{.status.observedGeneration}=${NEW_GEN}" --timeout=2m
+oc rollout status -n openshift-ingress deployment/router-default --timeout=5m
+oc wait ingresscontroller.operator.openshift.io/default \
+  -n openshift-ingress-operator \
+  --for=condition=Available=True --timeout=2m
+
 echo "Running the e2e tests"
 "${ROOT_DIR}/hack/e2e/run-e2e.sh"

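The patch waits for the rollout but does not itself assert where the router pods landed. A rough sketch of such a check follows, in case it is useful when debugging a run. It is not part of this commit: `all_in_set` and `verify_router_placement` are hypothetical names, and the pod label selector is an assumption about how the ingress operator labels router pods that should be confirmed on the target cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of the commit.

# Succeeds only if every whitespace-separated word in $1 also appears in $2.
all_in_set() {
  local item
  for item in $1; do
    case " $2 " in
      *" ${item} "*) ;;
      *) echo "not on control plane: ${item}" >&2; return 1 ;;
    esac
  done
}

# Compares the nodes hosting router pods with the control plane node list.
# Assumption: router pods carry the label used in the selector below.
verify_router_placement() {
  local router_nodes ctrl_nodes
  router_nodes=$(oc get pods -n openshift-ingress \
    -l ingresscontroller.operator.openshift.io/deployment-ingresscontroller=default \
    -o jsonpath='{.items[*].spec.nodeName}')
  ctrl_nodes=$(oc get nodes -l node-role.kubernetes.io/control-plane \
    -o jsonpath='{.items[*].metadata.name}')
  # Fail if no router pods were found at all, then check each node name.
  [ -n "${router_nodes}" ] && all_in_set "${router_nodes}" "${ctrl_nodes}"
}

# Usage (requires a logged-in oc session):
#   verify_router_placement || echo "router-default is still on a worker"
```

The comparison is kept in a separate function so it can be exercised without a cluster.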