Releases · llm-d-incubation/llm-d-fast-model-actuation

03 Apr 00:04

aavarghese

v0.5.1-alpha.6

fe5deda

Milestone 3 - Launcher-Based Inference with Sleep/Wake (Test Release #6) Latest

What's Changed

Reword GPU-ful to GPU-bearing to pass typo checker by @MikeSpreitzer in #395
Consider port when selecting launcher by @waltforme in #396
Use EnvVars map instead of copying it by @MikeSpreitzer in #401
Add annotations to instances in launcher by @MikeSpreitzer in #399
Control the GPU assignment for e2e test on OpenShift by @waltforme in #403
Begin to use annotations in VllmConfig by @waltforme in #404

Full Changelog: v0.5.1-alpha.5...v0.5.1-alpha.6

Contributors

waltforme and MikeSpreitzer

31 Mar 14:01

aavarghese

v0.5.1-alpha.5

162829c

Milestone 3 - Launcher-Based Inference with Sleep/Wake (Test Release #5)

What's Changed

Null out serverDat.Sleeping when no vLLM instances associated yet by @waltforme in #359
deps(actions): bump docker/login-action from 3.7.0 to 4.0.0 by @dependabot[bot] in #323
deps(actions): bump actions/checkout from 4.2.2 to 6.0.2 by @dependabot[bot] in #324
deps(actions): bump docker/setup-buildx-action from 3.12.0 to 4.0.0 by @dependabot[bot] in #325
ci: fix actions/checkout version comments and pin by SHA by @MikeSpreitzer in #361
deps(actions): bump docker/build-push-action from 6.18.0 to 7.0.0 by @dependabot[bot] in #326
Add deploy_fma.sh and debug workflow for OCP E2E by @diegocastanibm in #357
Discontinue the usage of LauncherGeneratedBy label by @waltforme in #365
🌱 Unify launcher unit testing by @MikeSpreitzer in #368
Improve launcher logging - Part 2 by @diegocastanibm in #367
Fix: Add enable-sleep-mode flag to enable sleep mode for vllm server by @aavarghese in #376
deps(actions): bump actions/setup-go from 6.2.0 to 6.3.0 by @dependabot[bot] in #360
Include creation parameters inline in launcher instance state replies by @MikeSpreitzer in #369
🌱 Hot fix to e2e test on OpenShift by @MikeSpreitzer in #382
Sync unbound launcher-based server-providing pods by @waltforme in #362
Preserve the final state at the end of the e2e test in kind by @waltforme in #390
Extract launcher E2E test scenarios into reusable script by @MikeSpreitzer in #386
Pin ko base image to chainguard/static digest by @MikeSpreitzer in #392
deps(actions): bump docker/setup-qemu-action from 3.2.0 to 4.0.0 by @dependabot[bot] in #370
deps(actions): bump actions/cache from 5.0.3 to 5.0.4 by @dependabot[bot] in #371
deps(actions): bump docker/metadata-action from 5.10.0 to 6.0.0 by @dependabot[bot] in #372
deps(go): bump the kubernetes group across 1 directory with 3 updates by @dependabot[bot] in #373
deps: bump code-generator from v0.34.2 to v0.34.6 by @MikeSpreitzer in #393
Dump logs for every container in e2e test by @waltforme in #394
Self-annotation on launcher pods to signal hosted instance changes by @waltforme in #391

Full Changelog: v0.5.1-alpha.4...v0.5.1-alpha.5

Contributors

aavarghese, waltforme, and 3 other contributors

13 Mar 16:26

aavarghese

v0.5.1-alpha.4

ecf0d21

Milestone 3 - Launcher-Based Inference with Sleep/Wake (Test Release #4)

What's Changed

Adjust Node viewing ClusterRole in E2E-on-OCP workflow by @MikeSpreitzer in #313
fix: wait for CRDs to be Established in OpenShift E2E workflow by @MikeSpreitzer in #314
Test cases for multiple instances sharing one launcher pod by @waltforme in #264
deps(actions): bump actions/upload-artifact from 6.0.0 to 7.0.0 by @dependabot[bot] in #301
✨ Add workflow summary step showing gate decision by @MikeSpreitzer in #316
Improvements to the launcher-based tests by @waltforme in #319
GPU assignment for launcher-based server-providing Pods by @waltforme in #317
[DOCS] More files to align FMA with LLM-d by @diegocastanibm in #283
[DOCS] Adding governance documents by @diegocastanibm in #278
🌱 Remove per-repo gh-aw typo/link/upstream workflows by @clubanderson in #321
Launcher log improvements by @diegocastanibm in #286
Install kubernetes python library on launcher dockerfile by @manoelmarques in #328
🌱 Bump vllm-openai image to v0.15.1 by @MikeSpreitzer in #329
Rework GetInferenceServerPort by @waltforme in #330
Improve launcher's GPU mock by @waltforme in #322
Fix misplaced envar names in launcher.md by @waltforme in #344
Configure launcher for ConfigMap-based GPU UUID-to-index translation by @MikeSpreitzer in #341
Revise type VllmConfig by @waltforme in #346
Remove the dependency on the gpu-map ConfigMap for M3 code in production by @waltforme in #349
🌱 Pin E2E-on-OpenShift test to vllm-d cluster by @MikeSpreitzer in #350
Fix typos and add typos config by @MikeSpreitzer in #351
deps(go): bump the kubernetes group with 3 updates by @dependabot[bot] in #300
ci: Use real requester and launcher in OpenShift e2e by @rubambiza in #343
ci: Address review feedback on OpenShift e2e by @rubambiza in #354
✨ Add cleanup of launcher image by @MikeSpreitzer in #356
ci: Enable launcher-populator in OpenShift E2E and local tests by @aavarghese in #348

Full Changelog: v0.5.1-alpha.3...v0.5.1-alpha.4

Contributors

clubanderson, aavarghese, and 6 other contributors

02 Mar 16:50

aavarghese

v0.5.1-alpha.3

7eb291a

Milestone 3 - Launcher-Based Inference with Sleep/Wake (Test Release #3)

What's Changed

ci: remove dual-pods finalizers before namespace deletion by @MikeSpreitzer in #288
ci: use literal Go build version instead of go.mod value by @MikeSpreitzer in #290
🐛 Stop the E2E on OpenShift workflow from deleting the CRDs by @MikeSpreitzer in #293
fix: upgrade vllm CPU build from v0.15.0 to v0.15.1 by @MikeSpreitzer in #294
✨ Reorg E2E on OCP workflow to always dump state by @MikeSpreitzer in #295
deps(actions): bump actions/download-artifact from 6.0.0 to 8.0.0 by @dependabot[bot] in #303
deps(actions): bump actions/github-script from 7.0.1 to 8.0.0 by @dependabot[bot] in #304
ci: upgrade docker/login-action to v3.7.0 by @MikeSpreitzer in #305
✨ New management for ValidatingAdmissionPolicy[Binding] objects by @MikeSpreitzer in #297
ci: upgrade actions/cache to v5.0.3 by @MikeSpreitzer in #306
fix: replace ClusterRoleBinding to view with namespace-scoped pods permission by @MikeSpreitzer in #310

Full Changelog: v0.5.1-alpha.2...v0.5.1-alpha.3

Contributors

MikeSpreitzer and dependabot

24 Feb 21:04

aavarghese

v0.5.1-alpha.2

37f7891

Milestone 3 - Launcher-Based Inference with Sleep/Wake (Test Release #2)

What's Changed

✨ Add launcher and test workload E2E tests to OpenShift CI by @clubanderson in #262
✨ Add GitHub Agentic Workflows for typo, link, and upstream checks by @clubanderson in #255
[FEATURE] - Fetch stdout in launcher by @diegocastanibm in #242
docs: Update FMA docs with ecosystem context, milestone status, and dependencies by @rubambiza in #266
Refining ValidatingAdmissionPolicy tests to match current implementation of DP controller by @aavarghese in #234
deps(actions): bump docker/setup-buildx-action from 3.7.1 to 3.12.0 by @dependabot[bot] in #249
deps(go): bump the kubernetes group with 5 updates by @dependabot[bot] in #253
Bump docker/metadata-action, actions/setup-go, and ko-build/setup-ko by @MikeSpreitzer in #281
Bump actions/setup-python from 4 to 6.2.0 by @MikeSpreitzer in #280
deps(go): bump github.com/spf13/pflag from 1.0.6 to 1.0.10 in the go-dependencies group by @dependabot[bot] in #254
Combine helm charts by @rubambiza in #263

Full Changelog: v0.5.1-alpha...v0.5.1-alpha.2

Contributors

clubanderson, aavarghese, and 4 other contributors

16 Feb 17:38

aavarghese

v0.5.1-alpha

4e41cfe

Milestone 3 - Launcher-Based Inference with Sleep/Wake (Test Release)

This is a test release for Milestone 3, introducing launcher-based inference server management with sleep/wake capabilities for efficient GPU resource utilization and quick start-up.

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Release v0.5.1-alpha completed successfully!
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Container Images:
• ghcr.io/llm-d-incubation/llm-d-fast-model-actuation/dual-pods-controller:v0.5.1-alpha
• ghcr.io/llm-d-incubation/llm-d-fast-model-actuation/launcher-populator:v0.5.1-alpha
• ghcr.io/llm-d-incubation/llm-d-fast-model-actuation/launcher:v0.5.1-alpha
• ghcr.io/llm-d-incubation/llm-d-fast-model-actuation/requester:v0.5.1-alpha

Helm Charts (version 0.5.1-alpha):
• oci://ghcr.io/llm-d-incubation/llm-d-fast-model-actuation/charts/dual-pods-controller
• oci://ghcr.io/llm-d-incubation/llm-d-fast-model-actuation/charts/launcher-populator

Install with:
helm install dpctlr oci://ghcr.io/llm-d-incubation/llm-d-fast-model-actuation/charts/dual-pods-controller --version 0.5.1-alpha
helm install launcher-populator oci://ghcr.io/llm-d-incubation/llm-d-fast-model-actuation/charts/launcher-populator --version 0.5.1-alpha
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
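
The image and chart references above all follow one naming pattern, so they can be derived from the release tag. The following is a minimal sketch of that derivation, not part of the project's tooling: the helper names `image_refs` and `helm_install_commands` are hypothetical, and it assumes that the chart version is the tag without its leading "v", as in the v0.5.1-alpha listing. Whether other releases publish the same set of images and charts is also an assumption.

```python
# Registry path taken from the v0.5.1-alpha release notes.
REGISTRY = "ghcr.io/llm-d-incubation/llm-d-fast-model-actuation"

# Image names as listed under "Container Images".
IMAGES = ["dual-pods-controller", "launcher-populator", "launcher", "requester"]

# Helm release name -> chart name, as used in the "Install with" commands.
CHARTS = {
    "dpctlr": "dual-pods-controller",
    "launcher-populator": "launcher-populator",
}


def image_refs(tag):
    """Return the container image references for a release tag."""
    return [f"{REGISTRY}/{name}:{tag}" for name in IMAGES]


def helm_install_commands(tag):
    """Return the helm install commands for a release tag.

    Assumes the chart version is the tag without a leading "v".
    """
    version = tag[1:] if tag.startswith("v") else tag
    return [
        f"helm install {release} oci://{REGISTRY}/charts/{chart} --version {version}"
        for release, chart in CHARTS.items()
    ]


if __name__ == "__main__":
    for line in image_refs("v0.5.1-alpha") + helm_install_commands("v0.5.1-alpha"):
        print(line)
```

Called with "v0.5.1-alpha", the helpers reproduce the four image references and the two install commands listed above.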

23 Oct 16:32

aavarghese

v0.0.1

0ff27b3

Milestone 1: Dual pods without sleep/wake

Merge pull request #88 from MikeSpreitzer/source-reorg

Controller source reorg
