When the operator initiates a failover, the old primary's pod keeps its
`cnpg.io/instanceRole=primary` label until `ReconcileMetadata` runs. But
`ReconcileMetadata` is skipped during the entire failover window (the
`CurrentPrimary != TargetPrimary` guard returns early), so the `-rw`
service keeps routing to the old primary. If the old primary comes back
(e.g. after a temporary network partition), replicas reconnect through
the `-rw` service, satisfy the sync quorum, and writes committed on the
stale primary are lost to `pg_rewind`.
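The guard that creates this window can be sketched as a tiny predicate. This is a simplified illustration, not the operator's actual code; `ClusterStatus` and `shouldReconcileMetadata` are hypothetical names standing in for the real reconciler fields:

```go
package main

import "fmt"

// ClusterStatus is a simplified stand-in for the cluster status fields
// involved in the early-return guard described above.
type ClusterStatus struct {
	CurrentPrimary string
	TargetPrimary  string
}

// shouldReconcileMetadata mirrors the guard: while a failover is in
// progress (CurrentPrimary != TargetPrimary) metadata reconciliation is
// skipped, so the old primary keeps its stale role label.
func shouldReconcileMetadata(s ClusterStatus) bool {
	return s.CurrentPrimary == s.TargetPrimary
}

func main() {
	// During the failover window the labels are left untouched...
	fmt.Println(shouldReconcileMetadata(ClusterStatus{"pg-1", "pg-2"})) // false
	// ...and only reconciled again once the promotion has converged.
	fmt.Println(shouldReconcileMetadata(ClusterStatus{"pg-2", "pg-2"})) // true
}
```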
Introduce a third value for the instance role label, `unhealthy`, and
apply it to the old primary as soon as failover starts. Since neither
the `-rw` nor the `-ro` service selector matches `unhealthy`, the pod is
immediately isolated from all service traffic for the duration of the
failover window. `ReconcileMetadata` restores the `replica` label once
`CurrentPrimary == TargetPrimary`.
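Why the third value isolates the pod follows directly from Kubernetes selector semantics: a service's Endpoints only include pods whose labels are a superset of the selector. A minimal sketch of that subset match, with illustrative selector maps (the real selectors carry additional cluster labels):

```go
package main

import "fmt"

const instanceRoleLabel = "cnpg.io/instanceRole"

// matchesSelector reports whether a pod's labels satisfy a service
// selector, using the same subset-match semantics Kubernetes applies
// when building a service's Endpoints.
func matchesSelector(selector, podLabels map[string]string) bool {
	for k, v := range selector {
		if podLabels[k] != v {
			return false
		}
	}
	return true
}

func main() {
	rwSelector := map[string]string{instanceRoleLabel: "primary"}
	roSelector := map[string]string{instanceRoleLabel: "replica"}

	// A pod relabeled "unhealthy" matches neither selector, so it is
	// dropped from both the -rw and -ro Endpoints.
	demoted := map[string]string{instanceRoleLabel: "unhealthy"}
	fmt.Println(matchesSelector(rwSelector, demoted)) // false
	fmt.Println(matchesSelector(roSelector, demoted)) // false
}
```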
The label is applied best-effort at the point where failover is
initiated, and re-applied on every pass of the reconcile loop while the
failover is in progress so transient API errors are retried
automatically.
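The best-effort-plus-retry pattern can be sketched as follows. `flakyAPI` and `markUnhealthy` are hypothetical stand-ins (the operator uses a client-go patch against the API server); the point is that a failed label patch is logged rather than aborting the failover, because the next reconcile pass re-applies it:

```go
package main

import (
	"errors"
	"fmt"
)

// flakyAPI simulates transient API-server errors: the first
// failuresLeft patch attempts fail, then attempts succeed.
type flakyAPI struct{ failuresLeft int }

func (a *flakyAPI) patchPodLabel(pod, value string) error {
	if a.failuresLeft > 0 {
		a.failuresLeft--
		return errors.New("transient API error")
	}
	return nil
}

// markUnhealthy is best-effort: an error is reported, not returned up
// the stack, because the next reconcile pass retries the label anyway.
func markUnhealthy(api *flakyAPI, pod string) bool {
	if err := api.patchPodLabel(pod, "unhealthy"); err != nil {
		fmt.Println("failover: could not label pod, will retry:", err)
		return false
	}
	return true
}

func main() {
	api := &flakyAPI{failuresLeft: 2}
	// Each pass of the reconcile loop during the failover window
	// retries the labeling until it sticks.
	for pass := 1; ; pass++ {
		if markUnhealthy(api, "cluster-1") {
			fmt.Printf("labeled on pass %d\n", pass)
			break
		}
	}
}
```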
Note: stripping the label removes the pod from the service Endpoints,
but does not drop TCP connections already established by a replica's
walreceiver. This fix closes the reconnection window; established
connections must still be terminated by the Postgres-level promotion on
the new primary.
Closes cloudnative-pg#10403
Signed-off-by: Armando Ruocco <armando.ruocco@enterprisedb.com>
Signed-off-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
Signed-off-by: Leonardo Cecchi <leonardo.cecchi@enterprisedb.com>
Co-authored-by: Leonardo Cecchi <leonardo.cecchi@enterprisedb.com>
Co-authored-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>