proposal: Multi-Cluster Queue Management#7485
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a design proposal for implementing per-tenant queue sharding within the Karmada scheduler. By shifting from global queues to namespace-scoped queues, the system aims to improve multi-tenant isolation, prevent burst monopolization, and allow for configurable ordering strategies like StrictFIFO. The proposal outlines the API design, scheduling logic, and a phased implementation plan to ensure stability and backward compatibility. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
|
Welcome @shellfish007! It looks like this is your first PR to karmada-io/karmada 🎉 |
There was a problem hiding this comment.
Code Review
This pull request introduces per-tenant queue sharding to Karmada's scheduler through a new namespace-scoped TenantQueue API, supporting both BestEffortFIFO and StrictFIFO strategies to improve multi-tenant isolation. Feedback includes requests to document exported types and constants per the repository style guide. Additionally, suggestions were made to improve isolation for default tenants through automatic sharding and to simplify resource management by using a singleton name for TenantQueue objects instead of relying on validation webhooks.
| QueueingStrategy QueueingStrategy `json:"queueingStrategy,omitempty"` | ||
| } | ||
|
|
||
| type QueueingStrategy string |
There was a problem hiding this comment.
According to the repository style guide (line 7) and standard Go best practices, all exported types should be documented. Please add a concise comment describing the purpose of QueueingStrategy.
| type QueueingStrategy string | |
| // QueueingStrategy defines the strategy for ordering and blocking bindings in the active queue.\ntype QueueingStrategy string |
References
- All exported functions, methods, structs, and interfaces must be documented with clear and concise comments describing their purpose and behavior. (link)
| BestEffortFIFO QueueingStrategy = "BestEffortFIFO" | ||
| StrictFIFO QueueingStrategy = "StrictFIFO" | ||
| ) |
There was a problem hiding this comment.
Exported constants should be documented to adhere to the repository style guide (line 7) and standard Go best practices.
| BestEffortFIFO QueueingStrategy = "BestEffortFIFO" | |
| StrictFIFO QueueingStrategy = "StrictFIFO" | |
| ) | |
| // BestEffortFIFO indicates that if the head binding fails, the next one is tried.\n BestEffortFIFO QueueingStrategy = "BestEffortFIFO"\n // StrictFIFO indicates that if the head binding fails, the entire queue is blocked.\n StrictFIFO QueueingStrategy = "StrictFIFO"\n) |
References
- All exported functions, methods, structs, and interfaces must be documented with clear and concise comments describing their purpose and behavior. (link)
| # Another namespace uses the default (BestEffortFIFO), no TenantQueue needed | ||
| ``` | ||
|
|
||
| Namespaces without a `TenantQueue` — as well as all `ClusterResourceBinding` objects (which have no namespace) — are routed to a built-in `__default__` queue that always uses `BestEffortFIFO`. The default queue participates in the same round-robin as named tenant queues, getting one scheduling turn per cycle. |
There was a problem hiding this comment.
The proposal states that all namespaces without a TenantQueue are routed to a single __default__ queue. This design does not provide isolation between these 'default' tenants, which may conflict with the goal of 'per-tenant isolation'. \n\nConsider automatically sharding by namespace name by default for all ResourceBinding objects. The TenantQueue resource would then serve as an optional configuration for these per-namespace shards, rather than a prerequisite for isolation.
| ### Phase 3: Stabilization (Beta) | ||
|
|
||
| 1. Promote `TenantQueue` API to `v1beta1`. | ||
| 2. Add validation webhooks (reject multiple `TenantQueue` objects per namespace). |
There was a problem hiding this comment.
c3677e5 to
ca1ecca
Compare
There was a problem hiding this comment.
Pull request overview
Adds a new scheduling proposal describing “Multi-Cluster Queue Management” via per-namespace (tenant) sharded scheduler queues, aiming to improve multi-tenant isolation and fairness without introducing heavier queue abstractions.
Changes:
- Introduces a new proposal document for per-tenant queue sharding in the scheduler (active/backoff/unschedulable queues).
- Specifies a new namespaced
TenantQueueAPI concept withBestEffortFIFOandStrictFIFOmodes. - Describes a Kueue-inspired “heads” collection pattern for cross-tenant fairness.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| - **`unschedulableBindings`** — bindings that could not be scheduled and are awaiting a cluster state change | ||
|
|
||
| Today these three queues are global singletons. This proposal makes them **per-tenant**, and introduces a namespace-scoped `TenantQueue` API object that configures queue settings for a namespace. Since tenant = namespace = `FederatedResourceQuota` scope, no separate namespace selector is needed — one `TenantQueue` per namespace governs the queue behavior for all `ResourceBinding` objects in that namespace. | ||
|
|
||
| --- | ||
|
|
||
| ## Motivation | ||
|
|
| ```go | ||
| // +genclient | ||
| // +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object | ||
| // +kubebuilder:resource:path=tenantqueues,scope=Namespaced,shortName=tq,categories={karmada-io} |
|
|
||
| HOL blocking is tracked via a `blocked bool` flag on the tenant entry. The flag is cleared by an `onActiveQPush` callback on the inner queue, which fires whenever a binding is moved back to `activeQ` (backoff expiry, unschedulable flush, cluster state change). | ||
|
|
||
| --- |
| | Throughput | Higher | Lower (head-of-line blocking) | | ||
| | Ordering guarantee | Best effort | Deterministic within tenant | | ||
| | Typical use case | Interactive / heterogeneous batch | Sequential pipelines, strict ordering | | ||
|
|
||
| --- | ||
|
|
||
| ## Design Notes | ||
|
|
- Add doc comments to QueueingStrategy type and constants - Clarify queue isolation is opt-in (namespaces without TenantQueue share default) - Enforce singleton name 'queue' instead of validation webhook - Fix "creation timestamp" to "enqueue timestamp" for ordering semantics
|
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #7485 +/- ##
==========================================
- Coverage 42.16% 41.93% -0.24%
==========================================
Files 876 879 +3
Lines 64968 54328 -10640
==========================================
- Hits 27395 22780 -4615
+ Misses 35874 29826 -6048
- Partials 1699 1722 +23
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
/assign |
|
/assign |
Summary
This PR proposes opt-in per-tenant queue sharding for Karmada's existing scheduler queue system, enabling multi-tenant isolation without introducing new heavyweight abstractions.
Karmada's scheduler maintains three internal queues (
activeQ,backoffQ,unschedulableBindings) as global singletons. Namespaces that create aTenantQueueget their own isolated set of queues; namespaces without one continue to share a global default queue (backward compatible).API
TenantQueueis namespace-scoped with a singleton namequeue. A validating webhook rejects objects with any other name.Scheduler Changes
The scheduler maintains a
TenantSchedulingQueuewrapping multipleprioritySchedulingQueueinstances, one per namespace:Pop()uses round-robin across tenant queues for fair scheduling. Bindings are ordered by priority descending, then enqueue timestamp ascending.Key Points
BestEffortFIFO(skip unschedulable head, try next) andStrictFIFO(head-of-line blocking per tenant)TenantQueueManagement(alpha, disabled by default)queueenforced by validating webhookNon-Goals
backoffQorunschedulableBindingsdata structures themselvesRelated