Implement fast predicate index for cluster-autoscaler simulator by x13n · Pull Request #9461 · kubernetes/autoscaler

x13n · 2026-04-08T08:57:40Z

This change introduces a fast predicate index and specialized fast predicates in the cluster snapshot simulator. This significantly optimizes pod scheduling simulations by avoiding redundant predicate evaluations and utilizing efficient indexing for node filtering, particularly for pod affinity/anti-affinity and topology spread constraints.

Key improvements:

- Introduced FastPredicateIndex to track pod counts by labels and topology domains.
- Implemented FastPredicates to perform preliminary, optimized checks before falling back to the full scheduler plugin runner.
- Integrated the index with Basic and Delta snapshot stores.
- Added the 'fast-predicates-enabled' flag to control the feature.
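To illustrate the idea of tracking pod counts by labels and topology domains, here is a minimal sketch. It is not the PR's FastPredicateIndex: the type `podCountIndex`, its methods, and the map layout are all hypothetical names chosen for this example. It only shows how a per-domain, per-label counter can answer "how many pods with label X run in domain Y" with a map lookup instead of a scan over all pods.

```go
package main

import "fmt"

// labelPair is a single key=value pod label.
type labelPair struct{ key, value string }

// domain identifies one topology domain, e.g. zone=zone-a.
type domain struct{ topologyKey, value string }

// podCountIndex is a hypothetical stand-in for the PR's FastPredicateIndex.
// It counts pods per (topology domain, pod label) combination.
type podCountIndex struct {
	counts map[domain]map[labelPair]int
}

func newPodCountIndex() *podCountIndex {
	return &podCountIndex{counts: map[domain]map[labelPair]int{}}
}

// update applies delta (+1 when a pod is added, -1 when it is removed) to
// every combination of the node's labels and the pod's labels.
func (i *podCountIndex) update(nodeLabels, podLabels map[string]string, delta int) {
	for tk, tv := range nodeLabels {
		d := domain{tk, tv}
		byLabel := i.counts[d]
		if byLabel == nil {
			byLabel = map[labelPair]int{}
			i.counts[d] = byLabel
		}
		for k, v := range podLabels {
			lp := labelPair{k, v}
			byLabel[lp] += delta
			if byLabel[lp] == 0 {
				delete(byLabel, lp) // keep the index sparse
			}
		}
		if len(byLabel) == 0 {
			delete(i.counts, d)
		}
	}
}

// count returns the number of indexed pods carrying the given label in the
// given topology domain. Missing entries read as zero.
func (i *podCountIndex) count(topologyKey, topologyValue, labelKey, labelValue string) int {
	return i.counts[domain{topologyKey, topologyValue}][labelPair{labelKey, labelValue}]
}

func main() {
	idx := newPodCountIndex()
	node := map[string]string{"topology.kubernetes.io/zone": "zone-a"}
	idx.update(node, map[string]string{"app": "web"}, 1)
	idx.update(node, map[string]string{"app": "web"}, 1)
	fmt.Println(idx.count("topology.kubernetes.io/zone", "zone-a", "app", "web"))
}
```

A real index would likely restrict itself to the topology keys and selectors that actually appear in affinity terms or spread constraints, since indexing every node label against every pod label is wasteful.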

Performance Impact (BenchmarkRunFiltersUntilPassingNode): The benchmarks show a 6x to 11x speedup across parallelism levels. At parallelism 1, bytes allocated per op drop by about 64% and allocations per op by about 53%.

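| Parallelism | Before (ns/op) | After (ns/op) | Improvement |
|-------------|----------------|---------------|-------------|
| 1           | 3,910,850      | 630,607       | 6.2x        |
| 2           | 3,324,178      | 399,312       | 8.3x        |
| 4           | 2,834,906      | 285,971       | 9.9x        |
| 8           | 2,856,542      | 256,432       | 11.1x       |
| 16          | 3,026,452      | 278,924       | 10.8x       |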

Memory Statistics (Parallelism 1):

Before: 1,508,666 B/op, 7045 allocs/op
After: 539,304 B/op, 3312 allocs/op

What type of PR is this?

/kind feature

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

A major part of this PR is AI-generated and needs careful review.

Does this PR introduce a user-facing change?

[Perf] A new fast-predicates-enabled flag can be used to replace slow scheduler predicate checking of anti-affinity and topology spreading with a faster CA-specific alternative.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

/hold for testing


k8s-ci-robot · 2026-04-08T08:57:44Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

k8s-ci-robot · 2026-04-08T08:57:52Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: x13n

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~cluster-autoscaler/OWNERS~~ [x13n]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

tetianakh · 2026-04-08T11:20:28Z

cluster-autoscaler/simulator/clustersnapshot/predicate/fast_predicates.go

+		}
+	}
+
+	if affinity.PodAntiAffinity != nil {


At which point do we check whether existing pods have anti-affinity against the incoming pod? The incoming pod may have no anti-affinity in its spec, but we still need to check whether it violates the constraints of existing pods.
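For reference, the symmetric check being asked about can be sketched as follows. This is an illustration only, not code from the PR: `antiAffinityTerm`, `scheduledPod`, and `violatesExistingAntiAffinity` are hypothetical names, selectors are reduced to equality-based matchLabels, and namespaces are ignored. The rule it shows is that a candidate node is rejected when an existing pod has a required anti-affinity term that selects the incoming pod and the candidate node shares that term's topology domain with the existing pod's node.

```go
package main

import "fmt"

// antiAffinityTerm is a simplified required anti-affinity term: an
// equality-only label selector plus a topology key. Namespaces and
// matchExpressions are deliberately left out of this sketch.
type antiAffinityTerm struct {
	topologyKey string
	matchLabels map[string]string
}

// scheduledPod is a hypothetical view of a pod already in the snapshot.
type scheduledPod struct {
	nodeLabels map[string]string // labels of the node the pod runs on
	terms      []antiAffinityTerm
}

// selectorMatches reports whether every selector pair is present in labels.
func selectorMatches(selector, labels map[string]string) bool {
	for k, v := range selector {
		if got, ok := labels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// violatesExistingAntiAffinity reports whether placing a pod with
// incomingLabels on the candidate node would break an anti-affinity term of
// any existing pod, even if the incoming pod declares no anti-affinity itself.
func violatesExistingAntiAffinity(incomingLabels, candidateNodeLabels map[string]string, existing []scheduledPod) bool {
	for _, p := range existing {
		for _, t := range p.terms {
			if !selectorMatches(t.matchLabels, incomingLabels) {
				continue
			}
			existingDomain, ok1 := p.nodeLabels[t.topologyKey]
			candidateDomain, ok2 := candidateNodeLabels[t.topologyKey]
			if ok1 && ok2 && existingDomain == candidateDomain {
				return true
			}
		}
	}
	return false
}

func main() {
	existing := []scheduledPod{{
		nodeLabels: map[string]string{"zone": "a"},
		terms: []antiAffinityTerm{{
			topologyKey: "zone",
			matchLabels: map[string]string{"app": "web"},
		}},
	}}
	incoming := map[string]string{"app": "web"} // no anti-affinity of its own
	fmt.Println(violatesExistingAntiAffinity(incoming, map[string]string{"zone": "a"}, existing))
	fmt.Println(violatesExistingAntiAffinity(incoming, map[string]string{"zone": "b"}, existing))
}
```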

tetianakh · 2026-04-08T11:24:07Z

cluster-autoscaler/simulator/clustersnapshot/predicate/plugin_runner.go

+		if p.fastPredicatesEnabled {
+			if err := p.fastCheckPredicates(pod, nodeInfo, fastState); err != nil {
+				// Fast check failed, so this Node won't work.
+				return


Do I understand correctly that an error here means "cannot be scheduled on this node"? That is quite confusing. I'd prefer it to return a boolean and reserve errors for actual errors.
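A minimal sketch of the suggested shape, assuming a hypothetical helper `fastCheckFits` (not a function in the PR) whose only input is a pod count looked up from the index. The boolean carries the scheduling verdict, and the error is kept for conditions that indicate a bug or corrupted state:

```go
package main

import (
	"errors"
	"fmt"
)

// fastCheckFits is a hypothetical signature illustrating the suggestion.
// It treats the node as a fit when no conflicting pods exist in its domain.
// A negative count cannot happen in a healthy index, so it is a real error.
func fastCheckFits(matchingPodsInDomain int) (bool, error) {
	if matchingPodsInDomain < 0 {
		return false, errors.New("index returned a negative pod count")
	}
	return matchingPodsInDomain == 0, nil
}

func main() {
	for _, count := range []int{0, 3, -1} {
		fits, err := fastCheckFits(count)
		switch {
		case err != nil:
			fmt.Println("error:", err) // unexpected failure, surface it
		case !fits:
			fmt.Println("node rejected") // normal outcome, try the next node
		default:
			fmt.Println("node passes fast check")
		}
	}
}
```

With this split, the caller can skip a node on `fits == false` without logging, and still propagate genuine failures.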

tetianakh · 2026-04-08T11:26:01Z

cluster-autoscaler/simulator/clustersnapshot/predicate/plugin_runner.go


-	workqueue.ParallelizeUntil(ctx, p.parallelism, len(nodeInfosList), checkNode)
+	chunkSize := chunkSizeFor(len(nodeInfosList), p.parallelism)
+	workqueue.ParallelizeUntil(ctx, p.parallelism, len(nodeInfosList), checkNode, workqueue.WithChunkSize(chunkSize))


I guess we should also disable the inter-pod affinity plugin?

tetianakh · 2026-04-08T11:42:54Z

cluster-autoscaler/simulator/clustersnapshot/store/fast_predicate_index.go

Please add a comment that documents how this works

k8s-ci-robot requested review from aleksandra-malinowska and elmiko April 8, 2026 08:57

k8s-ci-robot removed the do-not-merge/needs-area label Apr 8, 2026

k8s-ci-robot added the approved and size/XL labels Apr 8, 2026

x13n added the kind/feature label Apr 8, 2026

tetianakh reviewed Apr 8, 2026


x13n mentioned this pull request Apr 10, 2026

[Scheduling] Add a Caching Mechanism for InterPodAffinity and PodTopologySpread Plugins kubernetes/kubernetes#137654

Open

x13n wants to merge 1 commit into kubernetes:master from x13n:master
