[RFC] Advanced Core Type Selection #1917

Open

dnmokhov wants to merge 1 commit into master from dev/dnmokhov/rfc-core-types

+285 −0

Contributor

dnmokhov commented Nov 24, 2025 (edited)

Add an RFC describing setting multiple core types in task arena constraints.

Reference implementation: dev/dnmokhov/core-types


          [RFC] Advanced Core Type Selection

d9af5f4

dnmokhov added this to the 2023.0.0 milestone

dnmokhov requested review from akukanov, aleksei-fedotov, isaevil, kboyarinov and vossmjp

November 24, 2025 15:48

dnmokhov added the RFC label

Contributor Author

dnmokhov commented Nov 24, 2025

@wangleis @sunxiaoxia2022, this is an RFC for adding multiple core type selection to the master branch. Feel free to provide feedback. Thanks!

wangleis reviewed

rfcs/proposed/core_types/README.md

+              | **P-cores only** | Maximum single-threaded performance | Leaves E-cores idle; limited parallelism; higher power |
+              | **E-cores only** | Good for parallel workloads | Doesn't utilize P-core performance; excludes LP E-cores |
+              | **LP E-cores only** | Minimal power consumption | Severe performance impact for most workloads |
+              | **No constraint** | Maximum flexibility | May schedule on inappropriate cores (e.g., LP E-cores for compute) |

wangleis Nov 25, 2025

P-cores + E-cores are required, but for latency mode with shared L3 cache, LPE cores should be avoided.

Contributor Author

dnmokhov Nov 25, 2025

Thanks for confirming. As it says in the following paragraph,

oneTBB/rfcs/proposed/core_types/README.md

Line 57 in d9af5f4

    
           None of these options provide the desired behavior: **"Use P-cores or E-cores, but avoid LP E-cores"** or **"Use any

aleksei-fedotov reviewed

rfcs/proposed/core_types/README.md

+                 - Pros: Simpler logic, easier to extend
+                 - Cons: Increases struct size, breaks ABI compatibility
+6. **Info API**: Should `info::core_types()` be extended to return a count instead of/in addition to a vector?

Contributor

aleksei-fedotov Dec 5, 2025

info::core_types() returns std::vector<core_type_id>. So, if I understand the sixth question correctly, the count can be retrieved via ct = info::core_types(); ct.size().

Contributor Author

dnmokhov Dec 5, 2025

I guess I only meant "instead", the main idea being that

info::core_types() → {0, 1, ..., n-1}

is no more useful than something like

info::num_core_types() → n

rfcs/proposed/core_types/README.md

Comment on lines +66 to +68

		### New API

		Add the following methods to `tbb::task_arena::constraints`:

Contributor

aleksei-fedotov Dec 5, 2025 (edited)

Another possibility is to leave the constraints struct unchanged but introduce, for example, a new constructor for the task_arena class that accepts a vector/array of constraints instances, each bound to a certain NUMA node, core type, and threads-per-core constraint. The task arena constructor would take the union of the masks resulting from each constraints instance and use that union as its constraint.
At first glance, this design looks more flexible to me because it scales better: users can specify not only more than one core type but also more than one NUMA node. Essentially, users can specify multiple portions of the platform, and the union of those portions becomes the constraint for a single task_arena instance.
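The union-of-masks idea described above can be sketched in a few lines. This is only an illustration under stated assumptions, not oneTBB code: `cpu_info`, `toy_constraints`, `mask_for`, `union_mask`, the 8-CPU platform size, and the core type numbering are all hypothetical, and the threads-per-core dimension is left out for brevity.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins, not oneTBB types: a toy platform of at most 8 logical CPUs.
constexpr std::size_t kCpus = 8;
using cpu_mask = std::bitset<kCpus>;

constexpr int any = -1;  // "no restriction" for a field (assumption of this sketch)

struct cpu_info {
    int numa_node;  // NUMA node the CPU belongs to
    int core_type;  // illustrative numbering: 0 = LP E-core, 1 = E-core, 2 = P-core
};

struct toy_constraints {
    int numa_node;  // a NUMA node index, or `any`
    int core_type;  // a core type index, or `any`
};

// Mask of the CPUs that satisfy one constraints instance.
cpu_mask mask_for(const std::vector<cpu_info>& platform, const toy_constraints& c) {
    assert(platform.size() <= kCpus);
    cpu_mask m;
    for (std::size_t i = 0; i < platform.size(); ++i) {
        const bool numa_ok = c.numa_node == any || platform[i].numa_node == c.numa_node;
        const bool type_ok = c.core_type == any || platform[i].core_type == c.core_type;
        if (numa_ok && type_ok) m.set(i);
    }
    return m;
}

// Union of the masks of all constraints instances in the list.
cpu_mask union_mask(const std::vector<cpu_info>& platform,
                    const std::vector<toy_constraints>& list) {
    cpu_mask result;
    for (const toy_constraints& c : list) result |= mask_for(platform, c);
    return result;
}
```

With such a list, "P-cores or E-cores on any NUMA node" would be two entries, and a mixed selection such as "P-cores on node 0 plus E-cores on node 1" would also be two entries. In this sketch an empty list produces an empty mask; a real design would have to decide whether an empty list means "unconstrained" instead.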

Contributor Author

dnmokhov Dec 5, 2025

Sounds like TCM. 😉 I will try to add this as an alternative.

Contributor

vossmjp Dec 5, 2025

Along this same line of thinking, if there is a set_core_types there should likely, eventually, be a set_num_ids function. We should consider which is easier to reason about, a combination created from a vector of constraints, or these specific functions.
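One way to reason about the difference is that per-dimension setter functions would naturally select the cross product of the listed values, whereas a vector of constraints selects a union of individually chosen portions. The sketch below models only the setter style. Everything in it is hypothetical (`multi_constraints`, `set_numa_nodes`, `cpu_desc`, `mask_of`, the 8-CPU limit, the core type numbering), and the assumption that an empty list means "no restriction" is a choice of this sketch, not something the RFC states.

```cpp
#include <algorithm>
#include <bitset>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-ins, not oneTBB types: a toy platform of at most 8 logical CPUs.
constexpr std::size_t kMaxCpus = 8;
using cpu_set = std::bitset<kMaxCpus>;

struct cpu_desc {
    int numa_node;  // NUMA node the CPU belongs to
    int core_type;  // illustrative numbering: 0 = LP E-core, 1 = E-core, 2 = P-core
};

// Setter-style constraints: one list per dimension; an empty list means "any".
struct multi_constraints {
    std::vector<int> numa_nodes;
    std::vector<int> core_types;

    multi_constraints& set_numa_nodes(std::vector<int> ids) {
        numa_nodes = std::move(ids);
        return *this;
    }
    multi_constraints& set_core_types(std::vector<int> ids) {
        core_types = std::move(ids);
        return *this;
    }
};

// True if the list places no restriction or explicitly contains the value.
static bool allowed(const std::vector<int>& list, int value) {
    return list.empty() || std::find(list.begin(), list.end(), value) != list.end();
}

// A CPU is selected only if it passes every dimension (cross-product semantics).
cpu_set mask_of(const std::vector<cpu_desc>& platform, const multi_constraints& c) {
    assert(platform.size() <= kMaxCpus);
    cpu_set m;
    for (std::size_t i = 0; i < platform.size(); ++i) {
        if (allowed(c.numa_nodes, platform[i].numa_node) &&
            allowed(c.core_types, platform[i].core_type)) {
            m.set(i);
        }
    }
    return m;
}
```

Under these semantics, listing two core types and two NUMA nodes selects every matching core type on both nodes. A selection such as "P-cores on node 0 plus E-cores on node 1" cannot be expressed with one such object, which is the kind of trade-off the comparison above would need to weigh.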

vossmjp reviewed

rfcs/proposed/core_types/README.md


		### Motivation

		The current oneTBB API allows users to constrain task execution to a single core type using

Contributor

vossmjp Dec 5, 2025 (edited)

Suggested change

-  The current oneTBB API allows users to constrain task execution to a single core type using
+  By default, oneTBB includes all available core types in a task arena unless explicitly constrained.
+  The current oneTBB API allows users to constrain task execution to a single core type using

vossmjp reviewed

rfcs/proposed/core_types/README.md


		#### 1. Flexibility and Resource Utilization

		Many parallel workloads can execute efficiently on multiple core types. For example:

Contributor

vossmjp Dec 5, 2025 (edited)

Suggested change

-  Many parallel workloads can execute efficiently on multiple core types. For example:
+  While it is often best to allow the OS to use all core types and flexibly schedule threads, some advanced users may find it necessary to constrain scheduling.
+  When there are more than two core types, it may be desired to constrain execution to not just a single core type.
+  Many parallel workloads can execute efficiently on multiple core types that make up a subset of the available core types. For example:

vossmjp reviewed

rfcs/proposed/core_types/README.md


		#### 3. Avoiding Inappropriate Core Selection

		Without the ability to specify "P-cores OR E-cores (but not LP E-cores)", applications face a dilemma:

Contributor

vossmjp Dec 5, 2025

Suggested change

-  Without the ability to specify "P-cores OR E-cores (but not LP E-cores)", applications face a dilemma:
+  Without the ability to specify "P-cores OR E-cores (but not LP E-cores)" or
+  "LP E-cores and E-cores but not P-cores" applications face dilemmas.
+  For example, without being able to specify "P-cores OR E-cores (but not LP E-cores)":

vossmjp reviewed

rfcs/proposed/core_types/README.md

+              |----------|------|------|
+              | **P-cores only** | Maximum single-threaded performance | Leaves E-cores idle; limited parallelism; higher power |
+              | **E-cores only** | Good for parallel workloads | Doesn't utilize P-core performance; excludes LP E-cores |
+              | **LP E-cores only** | Minimal power consumption | Severe performance impact for most workloads |

Contributor

vossmjp Dec 5, 2025 (edited)

Suggested change

-  | **LP E-cores only** | Minimal power consumption | Severe performance impact for most workloads |
+  | **LP E-cores only** | Minimal power consumption | Severe performance impact for some workloads that require large, shared caches. |

vossmjp reviewed

Contributor

vossmjp left a comment

In general, this RFC should not read like guidance on how to select which core type(s) to use based on application characteristics. The relative capabilities of core types may differ based on the HW platform, and benefits will be highly application dependent. Instead, it should describe use cases with appropriate caveats that it is generally better to let the OS decide and that constraints should be applied carefully.

Reviewers

aleksei-fedotov aleksei-fedotov left review comments

vossmjp vossmjp left review comments

akukanov Awaiting requested review from akukanov

kboyarinov Awaiting requested review from kboyarinov

isaevil Awaiting requested review from isaevil

+1 more reviewer

wangleis wangleis left review comments

At least 1 approving review is required to merge this pull request.

Labels

RFC