[v1] Inefficient operation order when scaleOut and rolling update happen simultaneously: new pods are created with old spec #6769

@Takashi-kun

Description

Summary

In v1, when a user simultaneously increases replica count (scaleOut) and changes
configuration/resources (scaleUp/Down), the operator:

  1. Creates all new pods with the old spec (e.g., 8vCPU)
  2. Only after scaling completes, rolls all pods (old + new) to the new spec

This means that in the combined scaleOut+scaleUp scenario, more pods undergo
rolling updates than necessary.

scaleIn combined with a rolling update is not affected — scaling in first is the correct and expected behavior there.

Affected Component

pkg/manager/member/tidb_member_manager.go
pkg/manager/member/tidb_upgrader.go

Root Cause

In syncTiDBStatefulSetForTidbCluster, Scale runs before Upgrade (by explicit
design — a comment in the code states "Scaling takes precedence over upgrading").

While scaling is in progress (tc.TiDBScaling()), the upgrader resets the
StatefulSet pod template back to the old spec (via GetLastAppliedConfig).
The StatefulSet is then applied with increased replicas + old pod template,
so Kubernetes creates new pods using the old spec (e.g., 8vCPU).
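A minimal Go sketch of this ordering, using simplified hypothetical types (the real logic lives in `syncTiDBStatefulSetForTidbCluster` and the upgrader; this is only an illustration of the control flow, not the actual code):

```go
package main

import "fmt"

// Toy models of the StatefulSet and cluster state (hypothetical, simplified).
type statefulSet struct {
	replicas int
	podSpec  string // e.g. "8vCPU" or "16vCPU"
}

type cluster struct {
	desiredReplicas int
	desiredSpec     string
	lastApplied     string // spec recorded by the last-applied-config annotation
	scaling         bool
}

// sync mirrors the "Scaling takes precedence over upgrading" ordering:
// replicas are raised first, and while scaling is in progress the upgrader
// reverts the pod template to the last-applied (old) spec.
func sync(tc *cluster, sts *statefulSet) {
	// Step 1: Scale runs before Upgrade.
	if sts.replicas != tc.desiredReplicas {
		sts.replicas = tc.desiredReplicas
		tc.scaling = true
	}
	// Step 2: Upgrade is blocked while scaling, so the template stays on
	// the old spec and the new pods are created with it.
	if tc.scaling {
		sts.podSpec = tc.lastApplied
	} else {
		sts.podSpec = tc.desiredSpec
	}
}

func main() {
	tc := &cluster{desiredReplicas: 30, desiredSpec: "16vCPU", lastApplied: "8vCPU"}
	sts := &statefulSet{replicas: 10, podSpec: "8vCPU"}
	sync(tc, sts)
	fmt.Printf("replicas=%d newPodSpec=%s\n", sts.replicas, sts.podSpec)
}
```

With the inputs above this prints `replicas=30 newPodSpec=8vCPU`: the 20 new pods come up on the old spec.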

Concrete Example (10 → 30 nodes, 8vCPU → 16vCPU)

| Step | Cluster State | Action |
| --- | --- | --- |
| ScaleOut (parallelism=1 default) | 10 → 30 pods, all 8vCPU | Upgrader blocked; old spec used for new pods |
| Rolling update × 30 | 8vCPU → 16vCPU, 1 per cycle | All 30 pods (incl. 20 newly created) need updating |

Total rolling updates: 30 — all pods including the 20 newly created ones.

Expected Behavior

ScaleOut should create new pods with the new spec, and only the original pods
should need rolling update afterward:

| Step | Cluster State | Action |
| --- | --- | --- |
| ScaleOut | 10 old (8vCPU) + 20 new (16vCPU) | New pods created with new spec |
| Rolling update × 10 | 8vCPU → 16vCPU, 1 per cycle | Only original pods need updating |

Total rolling updates: 10 — only the original pods.

Using a simpler example (3 → 10 nodes, 8vCPU → 16vCPU):

  1. ScaleOut: create 7 new servers with the new spec (16vCPU)
  2. Rolling update: roll the 3 old servers from 8vCPU → 16vCPU

This contrasts with the current behavior, where all 10 servers are rolled.
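The rolling-update counts in both examples follow directly from the ordering; a tiny Go sketch of the arithmetic (function names are illustrative, not from the codebase):

```go
package main

import "fmt"

// Under the current ordering, every pod (original + newly added) is created
// or left on the old spec and must be rolled.
func rollsOldSpecScaleOut(original, added int) int { return original + added }

// Under the expected behavior, new pods start on the new spec, so only the
// original pods must be rolled.
func rollsNewSpecScaleOut(original, added int) int { return original }

func main() {
	// 10 -> 30 example: 30 rolls today vs. 10 rolls expected.
	fmt.Println(rollsOldSpecScaleOut(10, 20), rollsNewSpecScaleOut(10, 20))
	// 3 -> 10 example: 10 rolls today vs. 3 rolls expected.
	fmt.Println(rollsOldSpecScaleOut(3, 7), rollsNewSpecScaleOut(3, 7))
}
```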

Note: This is a backfill from v2

v2 already implements the expected behavior — ScaleOut always creates new pods
with the current desired spec and revision. This issue requests bringing the
same behavior to v1.
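One way to picture the v2 behavior in v1 terms is via StatefulSet partitioned rolling updates: keep the new pod template in place and set `RollingUpdate.Partition` to the original replica count, so newly created ordinals come up on the new revision while the old ordinals are rolled afterward. The following is a hypothetical sketch of that idea, not the actual v2 implementation:

```go
package main

import "fmt"

// specForOrdinal models partitioned rollout: pods with ordinal >= partition
// (i.e., the newly created ones during scale-out) start directly on the new
// spec; pods below the partition keep the old spec until the upgrader
// reaches them.
func specForOrdinal(ordinal, partition int, oldSpec, newSpec string) string {
	if ordinal >= partition {
		return newSpec // new pods start directly on the new spec
	}
	return oldSpec // old pods are rolled one by one afterward
}

func main() {
	partition := 10 // original replica count in the 10 -> 30 example
	for _, ord := range []int{0, 9, 10, 29} {
		fmt.Printf("pod-%d: %s\n", ord,
			specForOrdinal(ord, partition, "8vCPU", "16vCPU"))
	}
}
```

With `partition = 10`, pods 0–9 stay on 8vCPU pending their rolling update, while pods 10–29 are created on 16vCPU.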

Environment

  • Version: v1 (pkg/manager/member)
  • Applies to: TiDB (same pattern exists for TiKV, TiFlash, PD)
