Releases: llm-d-incubation/llm-d-fast-model-actuation

Milestone 3 - Launcher-Based Inference with Sleep/Wake (Test Release #6)

03 Apr 00:04
fe5deda

What's Changed

Full Changelog: v0.5.1-alpha.5...v0.5.1-alpha.6

Milestone 3 - Launcher-Based Inference with Sleep/Wake (Test Release #5)

31 Mar 14:01
162829c

What's Changed

  • Null out serverDat.Sleeping when no vLLM instances associated yet by @waltforme in #359
  • deps(actions): bump docker/login-action from 3.7.0 to 4.0.0 by @dependabot[bot] in #323
  • deps(actions): bump actions/checkout from 4.2.2 to 6.0.2 by @dependabot[bot] in #324
  • deps(actions): bump docker/setup-buildx-action from 3.12.0 to 4.0.0 by @dependabot[bot] in #325
  • ci: fix actions/checkout version comments and pin by SHA by @MikeSpreitzer in #361
  • deps(actions): bump docker/build-push-action from 6.18.0 to 7.0.0 by @dependabot[bot] in #326
  • Add deploy_fma.sh and debug workflow for OCP E2E by @diegocastanibm in #357
  • Discontinue the usage of LauncherGeneratedBy label by @waltforme in #365
  • 🌱 Unify launcher unit testing by @MikeSpreitzer in #368
  • Improve launcher logging - Part 2 by @diegocastanibm in #367
  • Fix: Add enable-sleep-mode flag to enable sleep mode for vllm server by @aavarghese in #376
  • deps(actions): bump actions/setup-go from 6.2.0 to 6.3.0 by @dependabot[bot] in #360
  • Include creation parameters inline in launcher instance state replies by @MikeSpreitzer in #369
  • 🌱 Hot fix to e2e test on Openshift by @MikeSpreitzer in #382
  • Sync unbound launcher-based server-providing pods by @waltforme in #362
  • Preserve the final state at the end of the e2e test in kind by @waltforme in #390
  • Extract launcher E2E test scenarios into reusable script by @MikeSpreitzer in #386
  • Pin ko base image to chainguard/static digest by @MikeSpreitzer in #392
  • deps(actions): bump docker/setup-qemu-action from 3.2.0 to 4.0.0 by @dependabot[bot] in #370
  • deps(actions): bump actions/cache from 5.0.3 to 5.0.4 by @dependabot[bot] in #371
  • deps(actions): bump docker/metadata-action from 5.10.0 to 6.0.0 by @dependabot[bot] in #372
  • deps(go): bump the kubernetes group across 1 directory with 3 updates by @dependabot[bot] in #373
  • deps: bump code-generator from v0.34.2 to v0.34.6 by @MikeSpreitzer in #393
  • Dump logs for every container in e2e test by @waltforme in #394
  • Self-annotation on launcher pods to signal hosted instance changes by @waltforme in #391

Full Changelog: v0.5.1-alpha.4...v0.5.1-alpha.5

Milestone 3 - Launcher-Based Inference with Sleep/Wake (Test Release #4)

13 Mar 16:26
ecf0d21

What's Changed

Full Changelog: v0.5.1-alpha.3...v0.5.1-alpha.4

Milestone 3 - Launcher-Based Inference with Sleep/Wake (Test Release #3)

02 Mar 16:50
7eb291a

What's Changed

Full Changelog: v0.5.1-alpha.2...v0.5.1-alpha.3

Milestone 3 - Launcher-Based Inference with Sleep/Wake (Test Release #2)

24 Feb 21:04

What's Changed

  • ✨ Add launcher and test workload E2E tests to OpenShift CI by @clubanderson in #262
  • ✨ Add GitHub Agentic Workflows for typo, link, and upstream checks by @clubanderson in #255
  • [FEATURE] - Fetch stdout in launcher by @diegocastanibm in #242
  • docs: Update FMA docs with ecosystem context, milestone status, and dependencies by @rubambiza in #266
  • Refining ValidatingAdmissionPolicy tests to match current implementation of DP controller by @aavarghese in #234
  • deps(actions): bump docker/setup-buildx-action from 3.7.1 to 3.12.0 by @dependabot[bot] in #249
  • deps(go): bump the kubernetes group with 5 updates by @dependabot[bot] in #253
  • Bump docker/metadata-action, actions/setup-go, and ko-build/setup-ko by @MikeSpreitzer in #281
  • Bump actions/setup-python from 4 to 6.2.0 by @MikeSpreitzer in #280
  • deps(go): bump github.com/spf13/pflag from 1.0.6 to 1.0.10 in the go-dependencies group by @dependabot[bot] in #254
  • Combine helm charts by @rubambiza in #263

Full Changelog: v0.5.1-alpha...v0.5.1-alpha.2

Milestone 3 - Launcher-Based Inference with Sleep/Wake (Test Release)

16 Feb 17:38
4e41cfe

This is a test release for Milestone 3, introducing launcher-based inference server management with sleep/wake capabilities for efficient GPU resource utilization and quick start-up.

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Release v0.5.1-alpha completed successfully!
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Container Images:
• ghcr.io/llm-d-incubation/llm-d-fast-model-actuation/dual-pods-controller:v0.5.1-alpha
• ghcr.io/llm-d-incubation/llm-d-fast-model-actuation/launcher-populator:v0.5.1-alpha
• ghcr.io/llm-d-incubation/llm-d-fast-model-actuation/launcher:v0.5.1-alpha
• ghcr.io/llm-d-incubation/llm-d-fast-model-actuation/requester:v0.5.1-alpha

Helm Charts (version 0.5.1-alpha):
• oci://ghcr.io/llm-d-incubation/llm-d-fast-model-actuation/charts/dual-pods-controller
• oci://ghcr.io/llm-d-incubation/llm-d-fast-model-actuation/charts/launcher-populator

Install with:
helm install dpctlr oci://ghcr.io/llm-d-incubation/llm-d-fast-model-actuation/charts/dual-pods-controller --version 0.5.1-alpha
helm install launcher-populator oci://ghcr.io/llm-d-incubation/llm-d-fast-model-actuation/charts/launcher-populator --version 0.5.1-alpha
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
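
The image and chart references listed above follow a regular pattern: images are tagged with the Git tag, while the Helm chart version drops the leading "v". The following is a minimal sketch of that pattern, not part of the release tooling. The function names are hypothetical, and it assumes other releases publish the same four images and two charts under the same registry path, which these notes only show for v0.5.1-alpha. It only prints references and commands; nothing is installed.

```shell
#!/bin/sh
# Registry path taken from the release notes above.
REGISTRY="ghcr.io/llm-d-incubation/llm-d-fast-model-actuation"

# chart_version <git-tag>: strip a leading "v" (v0.5.1-alpha -> 0.5.1-alpha).
chart_version() {
  printf '%s\n' "${1#v}"
}

# image_ref <component> <git-tag>: full container image reference.
image_ref() {
  printf '%s/%s:%s\n' "$REGISTRY" "$1" "$2"
}

# chart_ref <chart>: OCI reference for a Helm chart.
chart_ref() {
  printf 'oci://%s/charts/%s\n' "$REGISTRY" "$1"
}

TAG="v0.5.1-alpha"

# Print the four image references for this tag.
for component in dual-pods-controller launcher-populator launcher requester; do
  image_ref "$component" "$TAG"
done

# Print (not run) the install commands, using the release names from the notes.
echo "helm install dpctlr $(chart_ref dual-pods-controller) --version $(chart_version "$TAG")"
echo "helm install launcher-populator $(chart_ref launcher-populator) --version $(chart_version "$TAG")"
```

To target a different release, change `TAG` and check that the printed references match what that release actually published before running the commands.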

Milestone 1: Dual pods without sleep/wake

23 Oct 16:32
0ff27b3

Merge pull request #88 from MikeSpreitzer/source-reorg

Controller source reorg