StrictFIFO blocks workloads requesting different ResourceFlavors in same ClusterQueue #8309

@sfc-gh-raravena

Description

What happened:

We have a ClusterQueue configured with queueingStrategy: StrictFIFO and multiple ResourceFlavors (B200 and B300 GPU pools) with independent quotas. A workload requesting B300 GPUs (position 0 in queue) cannot be admitted because B300 is fully utilized. However, this single stuck workload is blocking all subsequent workloads requesting B200 GPUs, even though B200 has 40 available GPUs.

Observed queue state:

  • Position 0: job-1 (B300-raid0, 8 GPUs) - Cannot admit (B300: 16/16 used)
  • Position 1: job-2 (B200-raid0, 8 GPUs) - BLOCKED, never evaluated
  • Position 2: job-3 (B200-raid0) - BLOCKED, never evaluated
  • Position 3: job-4 (B200-raid0, 2 GPUs) - BLOCKED, never evaluated

Evidence:

  • Only the head workload (position 0) appears in Kueue controller logs
  • Zero log entries exist for positions 1-3 workloads
  • B200 has 0/40 GPUs used, B300 has 16/16 GPUs used

Controller log shows only the head workload:

{"msg":"couldn't assign flavors to pod set main: insufficient unused quota for nvidia.com/gpu in flavor pool-b300, 8 more needed", "object":{"name":"job-1"}}

What you expected to happen:

  • Option A (per-flavor FIFO): Workloads requesting B200 should be evaluated and admitted even when the B300 workload at the head cannot be admitted, since they're using completely independent resource pools.
  • Option B (current global FIFO): If cross-flavor blocking is intentional, this should be clearly documented as it has significant operational implications.

How to reproduce it (as minimally and precisely as possible):

  1. Create a ClusterQueue with StrictFIFO and multiple flavors:
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: test-cq
spec:
  namespaceSelector: {}   # match all namespaces
  queueingStrategy: StrictFIFO
  resourceGroups:
  - coveredResources:
    - nvidia.com/gpu
    flavors:
    - name: flavor-a
      resources:
      - name: nvidia.com/gpu
        nominalQuota: "16"
    - name: flavor-b
      resources:
      - name: nvidia.com/gpu
        nominalQuota: "40"
  2. Submit a workload requesting flavor-a (will consume all 16 GPUs); example manifests for the flavors, LocalQueue, and Jobs are sketched after this list
  3. Submit workload-1 requesting flavor-a (8 GPUs) - will be blocked due to no capacity
  4. Submit workload-2 requesting flavor-b (8 GPUs) - should fit but will be blocked
  5. Observe that workload-2 never gets evaluated despite flavor-b having 40 available GPUs
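
For completeness, here is a minimal sketch of the supporting objects the repro assumes: the two ResourceFlavors, a LocalQueue pointing at test-cq, and the Job shape used for the workloads in steps 2-5. Names such as team-queue, the gpu-pool node label, and the default namespace are illustrative, not taken from our cluster.

apiVersion: kueue.x-k8s.io/v1beta1
kind: ResourceFlavor
metadata:
  name: flavor-a
spec:
  nodeLabels:
    gpu-pool: pool-a            # illustrative label for the 16-GPU pool
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: ResourceFlavor
metadata:
  name: flavor-b
spec:
  nodeLabels:
    gpu-pool: pool-b            # illustrative label for the 40-GPU pool
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  name: team-queue              # illustrative name
  namespace: default
spec:
  clusterQueue: test-cq
---
# Step 2: 2 pods x 8 GPUs fills flavor-a's 16-GPU quota; vary parallelism,
# the GPU count, and the nodeSelector for steps 3-5.
apiVersion: batch/v1
kind: Job
metadata:
  generateName: gpu-job-
  namespace: default
  labels:
    kueue.x-k8s.io/queue-name: team-queue
spec:
  parallelism: 2
  completions: 2
  suspend: true                 # Kueue unsuspends the Job once quota is reserved
  template:
    spec:
      restartPolicy: Never
      nodeSelector:
        gpu-pool: pool-a        # steers flavor assignment; use pool-b for flavor-b jobs
      containers:
      - name: main
        image: nvidia/cuda:12.4.1-base-ubuntu22.04
        command: ["sleep", "3600"]
        resources:
          limits:
            nvidia.com/gpu: "8"

The nodeSelector is what steers flavor assignment here: a ResourceFlavor whose nodeLabels conflict with the pod's nodeSelector is skipped, so jobs selecting gpu-pool: pool-b can only be counted against flavor-b.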

Code reference:
From pkg/cache/queue/manager.go:688-710 (simplified below), the scheduler pops only one workload per ClusterQueue per scheduling cycle:

// Simplified: at most one head workload is taken per ClusterQueue per cycle.
func (m *Manager) heads() []workload.Info {
    var workloads []workload.Info
    for _, cq := range m.hm.ClusterQueues() {
        if wl := cq.Pop(); wl != nil { // pops only the head workload of this CQ
            workloads = append(workloads, *wl)
        }
    }
    return workloads
}

With StrictFIFO, Pop() returns workloads strictly in priority and creation-timestamp order, regardless of which ResourceFlavor they request.

Anything else we need to know?:

Impact:

  • Head-of-line blocking across independent resource pools
  • Poor resource utilization (40 idle B200 GPUs while jobs wait)
  • Defeats the purpose of multiple flavors with separate quotas

Documentation gap:

  • The current docs state: "Older workloads that can't be admitted will block newer workloads, even if the newer workloads fit in the available quota"

This is ambiguous regarding whether blocking applies:

  • Within the same flavor only, or
  • Across all flavors in the ClusterQueue

Questions:

  1. Is cross-flavor blocking the intended behavior of StrictFIFO?
  2. If yes, what's the recommended architecture for multi-flavor setups requiring FIFO ordering?
  3. Should separate ClusterQueues be created per flavor family to avoid this?
  4. Would a per-flavor FIFO mode be considered as a feature enhancement?

Workarounds considered:

  • BestEffortFIFO: Loses strict ordering guarantees
  • Separate ClusterQueues per flavor: Management overhead, prevents inter-flavor borrowing (a minimal sketch of this layout follows after this list)
  • Delete blocking workload: Not operationally sustainable
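
For reference, a minimal sketch of the separate-ClusterQueues layout mentioned above (queue names cq-flavor-a/cq-flavor-b are illustrative, reusing the flavors from the repro). Each queue keeps StrictFIFO ordering, but only among workloads of its own flavor, so a stuck flavor-a workload no longer blocks flavor-b:

apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: cq-flavor-a             # one ClusterQueue per flavor family
spec:
  namespaceSelector: {}
  queueingStrategy: StrictFIFO
  resourceGroups:
  - coveredResources:
    - nvidia.com/gpu
    flavors:
    - name: flavor-a
      resources:
      - name: nvidia.com/gpu
        nominalQuota: "16"
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: cq-flavor-b
spec:
  namespaceSelector: {}
  queueingStrategy: StrictFIFO
  resourceGroups:
  - coveredResources:
    - nvidia.com/gpu
    flavors:
    - name: flavor-b
      resources:
      - name: nvidia.com/gpu
        nominalQuota: "40"

The cost is the one noted above: each ClusterQueue needs its own LocalQueue, users have to target the right queue per flavor, and a workload admitted through cq-flavor-a can never fall back to flavor-b capacity.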

Environment:

  • Kubernetes version: v1.31.13-eks-ecaa3a6
  • Kueue version: v1beta1 (ClusterQueue API version)
  • Cloud provider: AWS EKS
  • Hardware: p6-b200.48xlarge (B200 GPUs), p6-b300.48xlarge (B300 GPUs)
  • OS: Amazon Linux 2023
  • Install tools: Deployed via ArgoCD

Labels: kind/bug, priority/important-longterm
