Skip to content

Milestone 3 - Launcher-Based Inference with Sleep/Wake (Test Release)

Choose a tag to compare

@aavarghese aavarghese released this 16 Feb 17:38
· 99 commits to main since this release
4e41cfe

This is a test release for Milestone 3, introducing launcher-based inference server management with sleep/wake capabilities for efficient GPU resource utilization and quick start-up.

Run TAG="v0.5.1-alpha"
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Release v0.5.1-alpha completed successfully!
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Container Images:
• ghcr.io/llm-d-incubation/llm-d-fast-model-actuation/dual-pods-controller:v0.5.1-alpha
• ghcr.io/llm-d-incubation/llm-d-fast-model-actuation/launcher-populator:v0.5.1-alpha
• ghcr.io/llm-d-incubation/llm-d-fast-model-actuation/launcher:v0.5.1-alpha
• ghcr.io/llm-d-incubation/llm-d-fast-model-actuation/requester:v0.5.1-alpha

Helm Charts (version 0.5.1-alpha):
• oci://ghcr.io/llm-d-incubation/llm-d-fast-model-actuation/charts/dual-pods-controller
• oci://ghcr.io/llm-d-incubation/llm-d-fast-model-actuation/charts/launcher-populator

Install with:
helm install dpctlr oci://ghcr.io/llm-d-incubation/llm-d-fast-model-actuation/charts/dual-pods-controller --version 0.5.1-alpha
helm install launcher-populator oci://ghcr.io/llm-d-incubation/llm-d-fast-model-actuation/charts/launcher-populator --version 0.5.1-alpha
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━