Add pool-level gates#276

Draft
jtechapps wants to merge 1 commit into
llm-d-incubation:mainfrom
jtechapps:poolgates

Conversation

@jtechapps

Contributor

What does this PR do?

Introduces a pool-level admission control gating mechanism. It lets a worker pool hold (park) incoming requests in memory once the pool reaches capacity, instead of triggering expensive broker nack-and-retry cycles.

Why is this change needed?

Fixes: #208

How was this tested?

  • Unit tests added/updated
  • Integration/e2e tests added/updated
  • Manual testing performed

Checklist

  • Commits are signed off (git commit -s) per DCO
  • Code follows project contributing guidelines
  • Tests pass locally (make test)
  • Linters pass (make lint)
  • Documentation updated (if applicable)

Related Issues

#208

Introduces a pool-level admission control gating mechanism.
This allows worker pools to block/park incoming requests in-memory when
capacity is reached, preventing expensive broker nack-and-retry cycles.

Signed-off-by: Jacob Murry <jacobmurry@google.com>

Development

Successfully merging this pull request may close these issues.

[Feature]: Pool-Level Admission Control
