DRA: Derived Attributes#6081
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: gauravkghildiyal The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
217ed13 to
7d68c53
Compare
7d68c53 to
df713b1
Compare
| - kube-apiserver | ||
| - kube-controller-manager | ||
| - kube-scheduler | ||
| - kubelet |
There was a problem hiding this comment.
I doubt kubelet needs to know about these
There was a problem hiding this comment.
Makes sense removed.
(I'm not too sure about kube-controller-manager either, but maybe its needed there because it clones the ResourceClaimTemplate to ResourceClaims)
| ## Proposal | ||
|
|
||
| We propose extending `.devices.requests` with `derivedAttributes` and | ||
| `.devices.constraints` with `matchDerivedAttribute`. A derived attribute defines |
There was a problem hiding this comment.
We should not change the constraints. These should just look like any other attribute to the entire constraints section. The only API addition needed should be the derivedAttributes field.
There was a problem hiding this comment.
Great idea! Even better.
Updated.
|
@thameem-abbas I think this might be useful for your use cases. |
|
cc @pohly |
|
/cc |
pohly
left a comment
There was a problem hiding this comment.
Looks okay to me.
Shadow API review: we need to sort out some details, see comments.
| - **Risk**: A new CEL expression needs to be evaluated for each candidate device | ||
| (in addition to any CEL expressions evaluated for device selectors). | ||
| - **Mitigation**: This evaluation adds a constant time increment to each device | ||
| evaluation. The remainder of the scheduling cycle—including constraint |
There was a problem hiding this comment.
It's constant per device, but each device might be considered more than once. Caching derived attributes mitigates that.
| type DerivedAttribute struct { | ||
| // Name is the identifier for this derived attribute, used in constraints. | ||
| // | ||
| // +required |
There was a problem hiding this comment.
Document the valid string format here.
| // for each candidate device. | ||
| // +featureGate=DRADerivedAttributes | ||
| // +optional | ||
| DerivedAttributes []DerivedAttribute `json:"derivedAttributes,omitempty"` |
|
|
||
| // Expression is a CEL expression evaluated against each candidate device. | ||
| // The expression must evaluate to a primitive scalar (string, integer, | ||
| // boolean) or a list of scalars ([]string, []int64, []bool) to act as a |
There was a problem hiding this comment.
We also have semver. Are those valid results?
| - **Environment**: The CEL environment for `Expression` is exactly the same | ||
| as that for `CELDeviceSelector`, containing a single variable `device`. | ||
| - **Return Type**: The CEL expression must evaluate to a scalar (string, | ||
| integer, boolean) or a list of scalars (`[]string`, `[]int64`, `[]bool`). |
There was a problem hiding this comment.
This "must evaluate to" cannot be validated during admission time (depends on unknown input types).
How are evaluation errors at runtime handled?
| // of the attributes which were prefixed by "dra.example.com"). | ||
| // - capacity (map[string]object): the device's capacities, grouped by prefix. | ||
| // | ||
| // +required |
| - "@gauravkghildiyal" | ||
| owning-sig: sig-scheduling | ||
| participating-sigs: | ||
| - sig-network |
| object. The resulting value acts as a virtual attribute that can be referenced | ||
| directly by existing `.devices.constraints[].matchAttribute` fields. | ||
|
|
||
| ### Core Manifest Example: GPU & NIC NUMA Alignment |
| syntactic boundary between static (FQDN) and derived (bare) attributes, | ||
| eliminating shadowing risks but preventing direct attribute overrides. | ||
|
|
||
| Recommended: **Option 1 (Allowing FQDNs)**. Most manifest authors will naturally |
There was a problem hiding this comment.
I also think that Option 2 will break existing users of the well defined pcieRoot standard attribute, if a new driver that does not implement the standard attribute want to use it I think it should have to override it with the FQDN
yongruilin
left a comment
There was a problem hiding this comment.
Declarative Validation plans to support validation for new API fields without handwritten validation counterpart directly.
But for complicated validation(e.g. CEL compilable), might still need to reply on handwritten.
/cc @jpbetz
| // +optional | ||
| DerivedAttributes []DerivedAttribute `json:"derivedAttributes,omitempty"` |
There was a problem hiding this comment.
| // +optional | |
| DerivedAttributes []DerivedAttribute `json:"derivedAttributes,omitempty"` | |
| // +optional | |
| // +k8s:optional | |
| // +k8s:maxItems=<int> | |
| DerivedAttributes []DerivedAttribute `json:"derivedAttributes,omitempty"` |
|
/cc |
derivedAttributes) toDeviceRequestto synthesize virtual grouping keys on the fly./sig scheduling
/wg device-management