@AndreasHolt
Contributor

What changed?

  • Switch assignEphemeralShard to call a new pickLeastLoadedExecutor helper, which sums the smoothed shard load per executor (falling back to shard count), and log the selected target.
  • Cover the helper and the new load-based behavior in handler_test.go, so that we verify the least-loaded executor is picked and that empty states are handled.
  • Add AggregateLoad and AssignedCount tags so that handler logs show the load totals when placing ephemeral shards.
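
The selection rule described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: the `ShardStats` type, the map-based inputs, the return values, and the final tie-break on executor ID (added only to make the result deterministic) are all assumptions made for the example. Only the names pickLeastLoadedExecutor, the smoothed-load sum, and the shard-count fallback come from the description.

```go
package main

import (
	"fmt"
	"sort"
)

// ShardStats is a hypothetical stand-in for the per-shard statistics
// that carry the smoothed load value.
type ShardStats struct {
	SmoothedLoad float64
}

// pickLeastLoadedExecutor returns the executor with the lowest summed
// smoothed load. Ties are broken by the fewest assigned shards, and then
// (an assumption of this sketch) by the lexicographically smallest ID.
// ok is false when there are no executors to choose from.
func pickLeastLoadedExecutor(
	assignments map[string][]string, // executor ID -> assigned shard IDs
	stats map[string]ShardStats, // shard ID -> stats (may be incomplete)
) (id string, load float64, count int, ok bool) {
	if len(assignments) == 0 {
		return "", 0, 0, false
	}

	// Iterate in sorted order so equal candidates resolve deterministically.
	ids := make([]string, 0, len(assignments))
	for execID := range assignments {
		ids = append(ids, execID)
	}
	sort.Strings(ids)

	for i, execID := range ids {
		shards := assignments[execID]
		sum := 0.0
		for _, shardID := range shards {
			// A shard without stats contributes the zero value, so an
			// executor with no stats at all is compared by shard count only.
			sum += stats[shardID].SmoothedLoad
		}
		if i == 0 || sum < load || (sum == load && len(shards) < count) {
			id, load, count = execID, sum, len(shards)
		}
	}
	return id, load, count, true
}

func main() {
	assignments := map[string][]string{
		"exec-a": {"s1", "s2"},
		"exec-b": {"s3"},
	}
	stats := map[string]ShardStats{
		"s1": {SmoothedLoad: 1.0},
		"s2": {SmoothedLoad: 1.5},
		"s3": {SmoothedLoad: 4.0},
	}

	// With stats present, exec-a wins despite holding more shards.
	id, load, count, _ := pickLeastLoadedExecutor(assignments, stats)
	fmt.Println(id, load, count)

	// With no stats, every load is zero and shard count decides.
	id, load, count, _ = pickLeastLoadedExecutor(assignments, nil)
	fmt.Println(id, load, count)
}
```

In this sketch the returned load and count correspond to the values that the AggregateLoad and AssignedCount log tags are meant to expose; how those tags are actually constructed is not shown here.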

Why?

  • Initial placement previously picked the executor with the fewest assigned shards. Using the smoothed per-shard load lets us balance based on actual work.
  • Logging aggregated load and assignment count for every placement call gives us observability when verifying decisions.

How did you test it?

  • Added the unit tests TestPickLeastLoadedExecutor and ShardNotFound_Ephemeral_LoadBased in handler_test.go to verify the selection logic.
  • Verified that when loads are equal (or zero), the logic correctly falls back to the fewest assigned shards.

Potential risks
If shard stats are missing or stale, the aggregated load evaluates to zero. In that case the logic degrades to the previous behavior (selecting by shard count), which minimizes the risk of bad placement.

Release notes
Ephemeral shard placement now favors the executor with the lowest smoothed load (using shard count as the tie-breaker) and logs the inputs for each decision.

Documentation Changes

AndreasHolt and others added 30 commits October 20, 2025 14:05
… is being reassigned in AssignShard

Signed-off-by: Andreas Holt <[email protected]>
…to not overload etcd's 128 max ops per txn

Signed-off-by: Andreas Holt <[email protected]>
…s txn and retry monotonically

Signed-off-by: Andreas Holt <[email protected]>
…shard metrics, move out to staging to separate function

Signed-off-by: Andreas Holt <[email protected]>
… And more idiomatic naming of collection vs singular type

Signed-off-by: Andreas Holt <[email protected]>
…ook more like executor key tests

Signed-off-by: Andreas Holt <[email protected]>
…ey in BuildShardKey, as we don't use it

Signed-off-by: Andreas Holt <[email protected]>
…e with new load based selection

Signed-off-by: Andreas Holt <[email protected]>
AndreasHolt and others added 30 commits December 4, 2025 11:43
Signed-off-by: Andreas Holt <[email protected]>
…istics' of github.com:AndreasHolt/cadence into heartbeat-shard-statistics
Signed-off-by: Andreas Holt <[email protected]>