Using featureGates to enable EPP flowControl feature #973

Merged
dumb0002 merged 1 commit into llm-d:main from dumb0002:epp-v0.7.0
Apr 6, 2026
Conversation

@dumb0002
Collaborator

@dumb0002 dumb0002 commented Apr 3, 2026

This PR improves the E2E test with the following updates:

  • Update the inference scheduler to the latest release, v0.7.0
  • Patch the inference scheduler image only if it differs from the value set by the env var LLM_D_INFERENCE_SCHEDULER_IMG
  • Use featureGates to enable the EPP flowControl feature (removes configuration via the env var ENABLE_EXPERIMENTAL_FLOW_CONTROL_LAYER)
  • Update the docs to remove references to the obsolete env var ENABLE_EXPERIMENTAL_FLOW_CONTROL_LAYER

This PR addresses the issues reported in PR #968.
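
The conditional image patch described above could be sketched roughly as follows. This is not the code from deploy/lib/infra_llmd.sh; the function names, the argument order, and the assumption that the EPP container is the first container in the pod spec are all hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: function names and arguments are hypothetical, and the
# EPP container is ASSUMED to be the first container in the pod template.

# Succeeds (exit 0) when the running image differs from the desired one.
image_differs() {
  local current="$1" desired="$2"
  [ "$current" != "$desired" ]
}

# Patch the deployment image only when it does not already match.
patch_epp_image_if_needed() {
  local namespace="$1" deployment="$2" container="$3" desired="$4"
  local current

  # Read the image currently set on the deployment.
  current="$(kubectl get deployment "$deployment" -n "$namespace" \
    -o jsonpath='{.spec.template.spec.containers[0].image}')"

  if image_differs "$current" "$desired"; then
    echo "patching ${deployment}: ${current} -> ${desired}"
    kubectl set image "deployment/${deployment}" \
      "${container}=${desired}" -n "$namespace"
  else
    echo "image already ${desired}; skipping patch"
  fi
}
```

A caller would pass the value of LLM_D_INFERENCE_SCHEDULER_IMG as the last argument. Skipping the patch when the image already matches avoids triggering a needless rollout of the EPP pods.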

Copilot AI review requested due to automatic review settings April 3, 2026 14:34
Contributor

Copilot AI left a comment

Pull request overview

This PR switches EPP “flow control” enablement from an environment variable toggle to a flowControl feature gate configured via the EPP ConfigMap, and updates docs + deployment defaults accordingly.

Changes:

  • Update docs to describe enabling EPP flow control via the flowControl feature gate.
  • Update deploy logic to patch the EPP image (when needed) and enable flowControl in the EPP ConfigMap for scale-to-zero/e2e setups.
  • Bump the default LLM_D_INFERENCE_SCHEDULER_IMG version in deploy/install.sh.
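
Enabling a feature gate in the EPP configuration might look like the sketch below. It assumes the EPP config is YAML with a top-level `featureGates:` block list; that layout, the function name, and the file-based workflow are assumptions for illustration and are not taken from the PR.

```shell
#!/usr/bin/env bash
# Sketch only: ASSUMES a YAML config with a top-level `featureGates:`
# block list and a trailing newline. The function name is hypothetical.

# Add a gate to the featureGates list in a config file, idempotently.
enable_feature_gate() {
  local file="$1" gate="$2"

  # Crude line match: if the gate is already listed, do nothing.
  if grep -Eq "^[[:space:]]*-[[:space:]]*${gate}[[:space:]]*$" "$file"; then
    return 0
  fi

  if grep -q '^featureGates:' "$file"; then
    # Insert the gate right after the existing featureGates key.
    awk -v gate="$gate" '
      { print }
      /^featureGates:/ { print "- " gate }
    ' "$file" > "${file}.tmp" && mv "${file}.tmp" "$file"
  else
    # No featureGates key yet: append one at the end of the file.
    printf 'featureGates:\n- %s\n' "$gate" >> "$file"
  fi
}
```

For example, `enable_feature_gate epp-config.yaml flowControl` (with a hypothetical file name) would leave the file unchanged on a second run. The line match is deliberately simple and does not handle an inline list such as `featureGates: []`; a YAML-aware tool would be more robust for real ConfigMap edits.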

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| docs/user-guide/scale-from-zero.md | Updates prerequisites to reference flowControl feature gate instead of env var. |
| docs/developer-guide/testing.md | Updates testing guidance to match new flowControl enablement mechanism. |
| deploy/lib/infra_llmd.sh | Implements ConfigMap-based flowControl enablement + conditional image patching for EPP. |
| deploy/install.sh | Updates default EPP image tag used for flowControl-capable scheduler. |

Signed-off-by: Braulio Dumba <Braulio.Dumba@ibm.com>
@dumb0002
Collaborator Author

dumb0002 commented Apr 3, 2026

/ok-to-test

@github-actions
Contributor

github-actions bot commented Apr 3, 2026

🚀 Kind E2E (full) triggered by /ok-to-test

View the Kind E2E workflow run

@github-actions
Contributor

github-actions bot commented Apr 3, 2026

🚀 OpenShift E2E — approve and run (/ok-to-test)

View the OpenShift E2E workflow run

@github-actions
Contributor

github-actions bot commented Apr 3, 2026

GPU Pre-flight Check ✅

GPUs are available for e2e-openshift tests. Proceeding with deployment.

| Resource | Total | Allocated | Available |
| --- | --- | --- | --- |
| GPUs | 50 | 36 | 14 |

| Cluster | Value |
| --- | --- |
| Nodes | 16 (7 with GPUs) |
| Total CPU | 993 cores |
| Total Memory | 10383 Gi |
| GPUs required | 4 (min) / 6 (recommended) |
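
The arithmetic behind a pre-flight result like this one can be sketched as follows. The bot's actual logic is not shown in this conversation, so the function name, the pass/warn/fail thresholds, and the messages are assumptions.

```shell
#!/usr/bin/env bash
# Sketch only: the real pre-flight check is not shown in the PR; the
# function name, thresholds, and messages here are hypothetical.

# Compare free GPUs (total - allocated) against min and recommended counts.
gpu_preflight() {
  local total="$1" allocated="$2" min="$3" recommended="$4"
  local available=$((total - allocated))

  if [ "$available" -lt "$min" ]; then
    echo "fail: ${available} available, ${min} required"
    return 1
  elif [ "$available" -lt "$recommended" ]; then
    echo "warn: ${available} available, ${recommended} recommended"
  else
    echo "ok: ${available} available"
  fi
}
```

With the figures reported above, `gpu_preflight 50 36 4 6` computes 50 - 36 = 14 available GPUs, which clears both the minimum of 4 and the recommended 6.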

@dumb0002
Collaborator Author

dumb0002 commented Apr 3, 2026

/ok-to-test

@github-actions
Contributor

github-actions bot commented Apr 3, 2026

🚀 Kind E2E (full) triggered by /ok-to-test

View the Kind E2E workflow run

@github-actions
Contributor

github-actions bot commented Apr 3, 2026

🚀 OpenShift E2E — approve and run (/ok-to-test)

View the OpenShift E2E workflow run

@github-actions
Contributor

github-actions bot commented Apr 3, 2026

GPU Pre-flight Check ✅

GPUs are available for e2e-openshift tests. Proceeding with deployment.

| Resource | Total | Allocated | Available |
| --- | --- | --- | --- |
| GPUs | 50 | 32 | 18 |

| Cluster | Value |
| --- | --- |
| Nodes | 16 (7 with GPUs) |
| Total CPU | 993 cores |
| Total Memory | 10383 Gi |
| GPUs required | 4 (min) / 6 (recommended) |

@dumb0002
Collaborator Author

dumb0002 commented Apr 6, 2026

/ok-to-test

@github-actions
Contributor

github-actions bot commented Apr 6, 2026

🚀 Kind E2E (full) triggered by /ok-to-test

View the Kind E2E workflow run

@github-actions
Contributor

github-actions bot commented Apr 6, 2026

🚀 OpenShift E2E — approve and run (/ok-to-test)

View the OpenShift E2E workflow run

@github-actions
Contributor

github-actions bot commented Apr 6, 2026

GPU Pre-flight Check ✅

GPUs are available for e2e-openshift tests. Proceeding with deployment.

| Resource | Total | Allocated | Available |
| --- | --- | --- | --- |
| GPUs | 50 | 36 | 14 |

| Cluster | Value |
| --- | --- |
| Nodes | 16 (7 with GPUs) |
| Total CPU | 993 cores |
| Total Memory | 10383 Gi |
| GPUs required | 4 (min) / 6 (recommended) |

@dumb0002 dumb0002 merged commit 79e72a8 into llm-d:main Apr 6, 2026
17 checks passed
3 participants