Release Milestone 3 - Launcher-Based Inference with Sleep/Wake (Test Release) · llm-d-incubation/llm-d-fast-model-actuation

This is a test release for Milestone 3. It introduces launcher-based inference server management with sleep/wake capabilities, aimed at efficient GPU resource utilization and fast start-up.

Run TAG="v0.5.1-alpha"
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Release v0.5.1-alpha completed successfully!
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Container Images:
• ghcr.io/llm-d-incubation/llm-d-fast-model-actuation/dual-pods-controller:v0.5.1-alpha
• ghcr.io/llm-d-incubation/llm-d-fast-model-actuation/launcher-populator:v0.5.1-alpha
• ghcr.io/llm-d-incubation/llm-d-fast-model-actuation/launcher:v0.5.1-alpha
• ghcr.io/llm-d-incubation/llm-d-fast-model-actuation/requester:v0.5.1-alpha

Helm Charts (version 0.5.1-alpha):
• oci://ghcr.io/llm-d-incubation/llm-d-fast-model-actuation/charts/dual-pods-controller
• oci://ghcr.io/llm-d-incubation/llm-d-fast-model-actuation/charts/launcher-populator

Install with:
helm install dpctlr oci://ghcr.io/llm-d-incubation/llm-d-fast-model-actuation/charts/dual-pods-controller --version 0.5.1-alpha
helm install launcher-populator oci://ghcr.io/llm-d-incubation/llm-d-fast-model-actuation/charts/launcher-populator --version 0.5.1-alpha
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
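
The image references listed above can also be generated rather than typed by hand, for example to pre-pull them onto a node. The following is a minimal sketch, not part of the release tooling: the function name `image_refs` and the `PULL` switch are hypothetical, and it assumes `docker` is the local container CLI. The registry path, image names, and tag are copied from the release output.

```shell
#!/bin/sh
# Registry prefix and tag copied from the release output above.
REPO="ghcr.io/llm-d-incubation/llm-d-fast-model-actuation"
TAG="v0.5.1-alpha"

# Print the full reference of each image published in this release.
# (image_refs is an illustrative helper name, not part of the project.)
image_refs() {
  for name in dual-pods-controller launcher-populator launcher requester; do
    printf '%s/%s:%s\n' "$REPO" "$name" "$TAG"
  done
}

image_refs

# Assumption: docker is installed. Nothing is pulled unless PULL=1 is set.
if [ "${PULL:-0}" = "1" ]; then
  image_refs | while read -r ref; do
    docker pull "$ref"
  done
fi
```

Run as is, the script only prints the four references; running it with `PULL=1` set in the environment would additionally attempt to pull each image.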
