[NodeResourceTopology] Nodes with != single-numa-node topology-policy are automatically filtered-in #885

Open
@Tal-or

Description

Area

  • Scheduler
  • Controller
  • Helm Chart
  • Documents

Other components

No response

What happened?

When a node reaches the Filter stage in the NodeResourceTopology scheduler,
the scheduler checks the topology policy of the node.

If the topology policy is != single-numa-node, the scheduler considers the node a good candidate for scheduling (filtered-in)
see:

This means the node passes without any checks (the scheduler logic is skipped), which increases the chance that a pod scheduled on such a node hits a TopologyAffinityError (TAE).
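
To make the reported behavior concrete, here is a minimal sketch in Go. It is a simplified model, not the plugin's actual code: `currentFilter` and the `fitsOneNUMA` flag (standing in for the per-NUMA resource check) are hypothetical names introduced for illustration. Only the policy strings are the kubelet's real Topology Manager policy names.

```go
package main

import "fmt"

// Kubelet Topology Manager policy names.
const (
	policyNone           = "none"
	policyBestEffort     = "best-effort"
	policyRestricted     = "restricted"
	policySingleNUMANode = "single-numa-node"
)

// currentFilter is a hypothetical, simplified model of the behavior described
// in this issue. It returns true when the node is filtered in.
// fitsOneNUMA is an assumed stand-in for the per-NUMA-zone resource check.
func currentFilter(policy string, fitsOneNUMA bool) bool {
	if policy != policySingleNUMANode {
		// No NUMA check is run: the node is accepted as-is.
		return true
	}
	return fitsOneNUMA
}

func main() {
	// In both calls the pod does NOT fit into a single NUMA zone.
	fmt.Println(currentFilter(policyRestricted, false))     // true: filtered in without checks
	fmt.Println(currentFilter(policySingleNUMANode, false)) // false: filtered out
}
```

In this model, the result for any policy other than single-numa-node does not depend on `fitsOneNUMA` at all, which is the gap this issue describes.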

What did you expect to happen?

The logic should be reversed: any node that is not configured with the single-numa-node policy should be filtered out (NOT in).

This makes more sense because a user who requests that a pod be scheduled by the NodeResourceTopology scheduler expects the pod to avoid TAE.

Reversing the logic shifts the balance toward pods staying Pending instead of hitting TAE.
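
The reversed logic could look roughly like the following Go sketch. As before, this is an illustration under stated assumptions rather than a proposed patch: `proposedFilter` and `fitsOneNUMA` are hypothetical names, and the real plugin's types and return values are not modeled here.

```go
package main

import "fmt"

// singleNUMANode is the kubelet Topology Manager policy name.
const singleNUMANode = "single-numa-node"

// proposedFilter is a hypothetical, simplified model of the reversed logic.
// It returns true when the node is filtered in.
// fitsOneNUMA is an assumed stand-in for the per-NUMA-zone resource check.
func proposedFilter(policy string, fitsOneNUMA bool) bool {
	if policy != singleNUMANode {
		// Reversed: nodes with any other policy are rejected outright,
		// so the pod stays Pending rather than landing on such a node.
		return false
	}
	return fitsOneNUMA
}

func main() {
	fmt.Println(proposedFilter("restricted", true))    // false: filtered out regardless of fit
	fmt.Println(proposedFilter(singleNUMANode, true))  // true: policy matches and the pod fits
	fmt.Println(proposedFilter(singleNUMANode, false)) // false: policy matches but no fit
}
```

With this shape, the only way for a node to pass is to have the single-numa-node policy and to pass the NUMA fit check.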

How can we reproduce it (as minimally and precisely as possible)?

  1. Configure a node with a topology policy != single-numa-node.
  2. Schedule a pod on this node (using a node selector) and set the pod's schedulerName so that the pod is scheduled by the NodeResourceTopology scheduler.
  3. The pod gets scheduled, and if there are not enough resources in a single NUMA node, it hits TAE.

Anything else we need to know?

No response

Kubernetes version

Any version

Scheduler Plugins version

Any version

Metadata

Labels

kind/bug: Categorizes issue or PR as related to a bug.
