Description
Assume the following scenario:
- A compute node with a NUMA domain
- The NUMA domain has n cores and two devices (GPUs)
- There are two tasks that want to split the hardware (user scope) on a GPU basis:
```c
qv_scope_split_at(ctx, user_scope, QV_HW_OBJ_GPU, rank % ngpus, &gpu_scope);
```
Currently, the split operation results in two sub-scopes:
- GPU 0 and all n cores
- GPU 1 and all n cores
Note that the n cores are shared in both scopes.
Splitting each sub-scope further to give the tasks exclusive cores is not possible, because a collective split operation cannot be applied across different sub-scopes.
This issue can be addressed by maintaining a list of exclusive resources associated with each device (e.g., cpuset). In this case, GPU 0 would have half of the cores in its resource list and GPU 1 would have the other half of the cores in its list. With such internal distribution of resources, the call to qv_scope_split_at above would result in the following two sub-scopes:
- GPU 0 and n/2 cores
- GPU 1 and the other n/2 cores
At this point, no additional split operation is needed to obtain the exclusive cores associated with a GPU scope.