Releases: llm-d/llm-d-inference-scheduler
v0.5.0
Docker image is available at:
docker pull ghcr.io/llm-d/llm-d-inference-scheduler:v0.5.0
Notable
- Prefill/Decode disaggregation awareness in filters and scorers
- Support for data parallel serving validated with vLLM and inference-sim
- Various CI/CD enhancements
- IGW: Flow control scale from/to zero support
- IGW: Standalone EPP
What's Changed
- Fix for flaky sites during lychee md link checker by @pierDipi in #485
- deps(actions): bump actions/checkout from 5 to 6 by @dependabot[bot] in #487
- deps(go): bump google.golang.org/grpc from 1.76.0 to 1.77.0 in the go-dependencies group by @dependabot[bot] in #486
- Use kv-cache-manager based on Go mod version instead of hardcoded by @pierDipi in #484
- update llm-d-kv-cache version to v0.4.0 by @vMaroon in #492
- fix: github action missing Trivy scan on sidecar image by @zdtsw in #481
- [Fix] Enhance macOS Makefile to Support Non-Homebrew Python Installations by @hyeongyun0916 in #489
- feat(allowlist): support both v1 and v1alpha2 InferencePool APIs with flag by @googs1025 in #474
- fix: make 'install-dependencies' and 'build' target by @zdtsw in #493
- skip lint and test when only docs change by @setsunakute in #494
- fix: Fixes for Data Parallel support when also running with Prefix Disaggregation by @shmuelk in #498
- sync gie to v1.2.0 by @nirrozenbaum in #499
- Error if PYTHON_CONFIG is empty by @elevran in #497
- Add GH action to check for signed and verified commits in PR by @elevran in #500
- deps(actions): bump crate-ci/typos from 1.39.2 to 1.40.0 by @dependabot[bot] in #501
- build: make build should use CGO_CFLAGS, CGO_LDFLAGS by @evacchi in #503
- chore: bump gie to v1.2.1 by @nirrozenbaum in #504
- deps(go): bump sigs.k8s.io/gateway-api from 1.4.0 to 1.4.1 in the kubernetes group by @dependabot[bot] in #508
- deps(go): bump the go-dependencies group with 3 updates by @dependabot[bot] in #507
- Miscellaneous dependency updates by @shmuelk in #510
- deps(go): bump the kubernetes group with 5 updates by @dependabot[bot] in #513
- Fix running `make env-dev-kind` by @acardace in #512
- test: add precise_prefix_cache_test by @evacchi in #505
- test: reuse upstream data store and enable logr in unit tests by @MregXN in #518
- feat: allow pd_profile_handler to handle diverse plugin types by @hyeongyun0916 in #516
- deps(actions): bump crate-ci/typos from 1.40.0 to 1.40.1 by @dependabot[bot] in #526
- deps(go): bump google.golang.org/grpc from 1.77.0 to 1.78.0 in the go-dependencies group by @dependabot[bot] in #527
- feat(metrics): add model_name label to PD decision metric by @googs1025 in #528
- deps(actions): bump crate-ci/typos from 1.40.1 to 1.41.0 by @dependabot[bot] in #532
- Configure dependabot ignores Go version updates by @elevran in #533
- Updates the architecture description by @davidbreitgand in #525
- Dependabot: exert finer control over package updates by @elevran in #542
- port auto-assign action from llm-d-kv-cache by @vMaroon in #551
- refactor: set python version and pin docker image with tag by @zdtsw in #543
- chore(test): update API version for nixl test by @zdtsw in #555
- deps(go): bump the go-dependencies group with 2 updates by @dependabot[bot] in #558
- deps(actions): bump crate-ci/typos from 1.41.0 to 1.42.0 by @dependabot[bot] in #557
- deps(actions): bump actions/checkout from 4 to 6 by @dependabot[bot] in #556
- Update auto-assign logic by @elevran in #560
- Remove newline in unsigned commit message by @elevran in #561
- bump gie to v1.3.0 rc2 by @nirrozenbaum in #562
- Update OWNERS by @elevran in #559
- refactor: Makefile, update docs by @zdtsw in #463
- feat: add metrics validation in e2e test by @googs1025 in #529
- feat: make no-hit-lru P/D-aware by @evacchi in #522
- Update disaggregated Prefill/Decode inference serving documentation by @mayabar in #571
- deps(actions): bump crate-ci/typos from 1.42.0 to 1.42.1 by @dependabot[bot] in #572
- deps(go): bump github.com/onsi/ginkgo/v2 from 2.27.4 to 2.27.5 in the go-dependencies group by @dependabot[bot] in #573
- fix reviewers auto assign minor bug by @nirrozenbaum in #575
- fix(scorer): make active request pd aware by @kyanokashi in #569
- test(e2e): cleanup kind cluster by @zdtsw in #563
- refactor: add early validation in DP profile handler by @zdtsw in #554
- deps(go): bump the kubernetes group with 2 updates by @dependabot[bot] in #574
- refactor: kv cache manager repo by @sagearc in #570
- bumping IGW version to the full released version by @kfswain in #583
- Enable prefix-cache awareness in active-active multi-replica scheduler deployments by @vMaroon in #578
- Switch to pre-built vLLM wheels for CPU builds by @sagearc in #582
- update llm-d-kv-cache import to v0.5.0-RC1 by @vMaroon in #584
- Use 1.3.0 CRDs by @shmuelk in #586
Updates in Inference Gateway Extension v1.3.0
llm-d-inference-scheduler v0.5.0 has been updated to use the latest version of the Inference Gateway Extension, which is 1.3.0.
You can see those changes here
New Contributors
- @setsunakute made their first contribution in #494
- @evacchi made their first contribution in #503
- @acardace made their first contribution in #512
- @MregXN made their first contribution in #518
- @davidbreitgand made their first contribution in #525
- @kyanokashi made their first contribution in #569
- @sagearc made their first contribution in #570
Full Changelog: v0.4.0...v0.5.0
v0.5.0-rc.1
What's Changed
- Fix for flaky sites during lychee md link checker by @pierDipi in #485
- deps(actions): bump actions/checkout from 5 to 6 by @dependabot[bot] in #487
- deps(go): bump google.golang.org/grpc from 1.76.0 to 1.77.0 in the go-dependencies group by @dependabot[bot] in #486
- Use kv-cache-manager based on Go mod version instead of hardcoded by @pierDipi in #484
- update llm-d-kv-cache version to v0.4.0 by @vMaroon in #492
- fix: github action missing Trivy scan on sidecar image by @zdtsw in #481
- [Fix] Enhance macOS Makefile to Support Non-Homebrew Python Installations by @hyeongyun0916 in #489
- feat(allowlist): support both v1 and v1alpha2 InferencePool APIs with flag by @googs1025 in #474
- fix: make 'install-dependencies' and 'build' target by @zdtsw in #493
- skip lint and test when only docs change by @setsunakute in #494
- fix: Fixes for Data Parallel support when also running with Prefix Disaggregation by @shmuelk in #498
- sync gie to v1.2.0 by @nirrozenbaum in #499
- Error if PYTHON_CONFIG is empty by @elevran in #497
- Add GH action to check for signed and verified commits in PR by @elevran in #500
- deps(actions): bump crate-ci/typos from 1.39.2 to 1.40.0 by @dependabot[bot] in #501
- build: make build should use CGO_CFLAGS, CGO_LDFLAGS by @evacchi in #503
- chore: bump gie to v1.2.1 by @nirrozenbaum in #504
- deps(go): bump sigs.k8s.io/gateway-api from 1.4.0 to 1.4.1 in the kubernetes group by @dependabot[bot] in #508
- deps(go): bump the go-dependencies group with 3 updates by @dependabot[bot] in #507
- Miscellaneous dependency updates by @shmuelk in #510
- deps(go): bump the kubernetes group with 5 updates by @dependabot[bot] in #513
- Fix running `make env-dev-kind` by @acardace in #512
- test: add precise_prefix_cache_test by @evacchi in #505
- test: reuse upstream data store and enable logr in unit tests by @MregXN in #518
- feat: allow pd_profile_handler to handle diverse plugin types by @hyeongyun0916 in #516
- deps(actions): bump crate-ci/typos from 1.40.0 to 1.40.1 by @dependabot[bot] in #526
- deps(go): bump google.golang.org/grpc from 1.77.0 to 1.78.0 in the go-dependencies group by @dependabot[bot] in #527
- feat(metrics): add model_name label to PD decision metric by @googs1025 in #528
- deps(actions): bump crate-ci/typos from 1.40.1 to 1.41.0 by @dependabot[bot] in #532
- Configure dependabot ignores Go version updates by @elevran in #533
- Updates the architecture description by @davidbreitgand in #525
- Dependabot: exert finer control over package updates by @elevran in #542
- port auto-assign action from llm-d-kv-cache by @vMaroon in #551
- refactor: set python version and pin docker image with tag by @zdtsw in #543
- chore(test): update API version for nixl test by @zdtsw in #555
- deps(go): bump the go-dependencies group with 2 updates by @dependabot[bot] in #558
- deps(actions): bump crate-ci/typos from 1.41.0 to 1.42.0 by @dependabot[bot] in #557
- deps(actions): bump actions/checkout from 4 to 6 by @dependabot[bot] in #556
- Update auto-assign logic by @elevran in #560
- Remove newline in unsigned commit message by @elevran in #561
- bump gie to v1.3.0 rc2 by @nirrozenbaum in #562
- Update OWNERS by @elevran in #559
- refactor: Makefile, update docs by @zdtsw in #463
- feat: add metrics validation in e2e test by @googs1025 in #529
- feat: make no-hit-lru P/D-aware by @evacchi in #522
- Update disaggregated Prefill/Decode inference serving documentation by @mayabar in #571
- deps(actions): bump crate-ci/typos from 1.42.0 to 1.42.1 by @dependabot[bot] in #572
- deps(go): bump github.com/onsi/ginkgo/v2 from 2.27.4 to 2.27.5 in the go-dependencies group by @dependabot[bot] in #573
- fix reviewers auto assign minor bug by @nirrozenbaum in #575
- fix(scorer): make active request pd aware by @kyanokashi in #569
- test(e2e): cleanup kind cluster by @zdtsw in #563
- refactor: add early validation in DP profile handler by @zdtsw in #554
- deps(go): bump the kubernetes group with 2 updates by @dependabot[bot] in #574
- refactor: kv cache manager repo by @sagearc in #570
- bumping IGW version to the full released version by @kfswain in #583
- Enable prefix-cache awareness in active-active multi-replica scheduler deployments by @vMaroon in #578
- Switch to pre-built vLLM wheels for CPU builds by @sagearc in #582
- update llm-d-kv-cache import to v0.5.0-RC1 by @vMaroon in #584
- Use 1.3.0 CRDs by @shmuelk in #586
Updates in Inference Gateway Extension v1.3.0
llm-d-inference-scheduler v0.5.0 has been updated to use the latest version of the Inference Gateway Extension, which is 1.3.0.
You can see those changes here
New Contributors
- @setsunakute made their first contribution in #494
- @evacchi made their first contribution in #503
- @acardace made their first contribution in #512
- @MregXN made their first contribution in #518
- @davidbreitgand made their first contribution in #525
- @kyanokashi made their first contribution in #569
- @sagearc made their first contribution in #570
Full Changelog: v0.4.0-rc.1...v0.5.0-rc.1
v0.4.0
Docker image is available at:
docker pull ghcr.io/llm-d/llm-d-inference-scheduler:v0.4.0
What's Changed
- Use a production version of Istio by @shmuelk in #334
- add vMaroon as code owner by @elevran in #342
- Upgrade `github.com/llm-d/llm-d-kv-cache-manager` import to `v0.3.0` by @vMaroon in #344
- add a hold label when PRs are pushed to branch other than main by @nirrozenbaum in #345
- sync gic to latest v1.0.0 release by @nirrozenbaum in #353
- deps(actions): bump actions/stale from 9 to 10 by @dependabot[bot] in #350
- deps(actions): bump actions/setup-go from 5 to 6 by @dependabot[bot] in #351
- deps(actions): bump crate-ci/typos from 1.35.7 to 1.36.2 by @dependabot[bot] in #348
- deps(go): bump the go-dependencies group with 7 updates by @dependabot[bot] in #349
- bump llm-d-kv-cache-manager version by @vMaroon in #359
- fix: Rename config to `kv-cache-utilization-scorer` from `kv-cache-scorer` by @yankay in #358
- updating release issue-template by @kfswain in #361
- bump llm-d-kv-cache-manager version (v0.3.2) by @vMaroon in #365
- deps(go): bump github.com/onsi/ginkgo/v2 from 2.25.3 to 2.26.0 in the go-dependencies group by @dependabot[bot] in #368
- feat: Add a scoring plugin to distribute new groups evenly by @usize in #357
- implement PreRequest and PostResponse interface checks by @learner0810 in #372
- deps(go): bump the kubernetes group with 2 updates by @dependabot[bot] in #369
- deps(go): bump google.golang.org/grpc from 1.75.1 to 1.76.0 in the go-dependencies group by @dependabot[bot] in #374
- Supports the ResponseComplete plugin by @learner0810 in #378
- deps(actions): bump crate-ci/typos from 1.36.2 to 1.38.1 by @dependabot[bot] in #373
- Fix multi-architecture image issues with Kind by @shmuelk in #362
- feat: Moved the Routing Sidecar from its own repo to the inference-scheduler repo by @shmuelk in #379
- Upgrade to use Gateway Inference Extension 1.1.0 rc.1 by @shmuelk in #384
- deps(go): bump github.com/onsi/ginkgo/v2 from 2.26.0 to 2.27.1 in the go-dependencies group by @dependabot[bot] in #389
- Ensure that max_completion_tokens=1 in Prefill by @shmuelk in #403
- Add explanation of inference-scheduler relation to IGW/GIE by @elevran in #393
- Add test coverage to test-unit Makefile target by @carlory in #391
- Add regression tests for max_completion_tokens by @pierDipi in #411
- Makefile refactoring to minimize the number of targets by @shmuelk in #397
- feat: Add vLLM Data Parallel support to llm-d-inference-scheduler by @shmuelk in #392
- fix(scorer): prevent potential division by zero in ActiveRequest.Score by @googs1025 in #413
- Fixed wildcard targets by @shmuelk in #416
- deps(actions): bump crate-ci/typos from 1.38.1 to 1.39.0 by @dependabot[bot] in #419
- deps(go): bump github.com/onsi/ginkgo/v2 from 2.27.1 to 2.27.2 in the go-dependencies group by @dependabot[bot] in #417
- Missed change to the Go code coverage output file names in the Makefile refactoring by @shmuelk in #422
- Fix: Remove reference to the missing make target by @andreyod in #423
- deps(actions): bump actions/upload-artifact from 4 to 5 by @dependabot[bot] in #420
- deps(go): bump sigs.k8s.io/controller-runtime from 0.22.3 to 0.22.4 in the kubernetes group by @dependabot[bot] in #418
- Enhancement: return 503 instead of 502 when decode node is not ready by @Phil-OSophy-42 in #412
- Remove endpointslices from RBAC by @elevran in #424
- Fix Image Loading for Podman in E2E Tests by @hdefazio in #406
- readme meetings update by @nirrozenbaum in #427
- Fix references to the SideCar's tag by @shmuelk in #428
- Remove duplicate error logs by @hyeongyun0916 in #429
- Upgrade to istio-1.28 by @irar2 in #431
- Complete upgrade to Istio 1.28.0 by @shmuelk in #433
- Upgrade GIE dependency to 1.1.0 by @shmuelk in #435
- Remove dev from branch list in PR actions by @elevran in #434
- Added support for Data Parallel in a Disaggregated Prefill/Decode setup by @shmuelk in #432
- Remove code coverage from CI workflow by @carlory in #437
- test: Scale up and down the model server during an end to end test by @shmuelk in #354
- fix: add validation in ByLabelFactory to prevent invalid configurations by @googs1025 in #440
- deps(actions): bump golangci/golangci-lint-action from 8 to 9 by @dependabot[bot] in #444
- change lmcache connector to nixlv2 by @googs1025 in #446
- fix: Roll back automatic updates to Dockerfiles by @shmuelk in #447
- deps(go): bump golang.org/x/sync from 0.17.0 to 0.18.0 in the go-dependencies group by @dependabot[bot] in #443
- fix(profile): validate handler parameters to prevent invalid config by @googs1025 in #449
- Added chat completions preprocessing support by @guygir in #426
- docs: add integration guide for external prefill/decode workloads by @googs1025 in #451
- Define and manage PR lifecycle by @elevran in #450
- test: End to End test for Data Parallel support by @shmuelk in #442
- docs: add PD-aware examples for by-label and by-label-selector plugins by @googs1025 in #454
- deps(actions): bump crate-ci/typos from 1.39.0 to 1.39.2 by @dependabot[bot] in #459
- Add SGLang Connector for Prefill/Decode Disaggregation (migrated from llm-d-routing-sidecar#64) by @bongwoobak in #456
- deps(go): bump the kubernetes group with 4 updates by @dependabot[bot] in #460
- add unit test in scheduler plugin part(by-label, data-parallel-profile-handler, pd-profile-handler) by @googs1025 in #461
- test: Enable running the end to end tests on K8S clusters other than Kind by @shmuelk in #453
- Allow the sidecar to sample from a list of prefill host ports by @smarterclayton in #404
- fix: Fixed issues running locally 'make lint' and 'make test-unit' by @shmuelk in #464
- cleanup: Followup to Python paths fix by @shmuelk in #468
- Replace tab with spaces to avoid treating as make target by @elevran in #469
- minor refactoring of `precise-prefix-cache` scorer plugin by @vMaroon in #473
- feat: Add initial metrics and update dependencies by...
v0.4.0-rc.1
Docker image is available here:
docker pull ghcr.io/llm-d/llm-d-inference-scheduler:v0.4.0-rc.1
What's Changed
- Use a production version of Istio by @shmuelk in #334
- add vMaroon as code owner by @elevran in #342
- Upgrade `github.com/llm-d/llm-d-kv-cache-manager` import to `v0.3.0` by @vMaroon in #344
- add a hold label when PRs are pushed to branch other than main by @nirrozenbaum in #345
- sync gic to latest v1.0.0 release by @nirrozenbaum in #353
- deps(actions): bump actions/stale from 9 to 10 by @dependabot[bot] in #350
- deps(actions): bump actions/setup-go from 5 to 6 by @dependabot[bot] in #351
- deps(actions): bump crate-ci/typos from 1.35.7 to 1.36.2 by @dependabot[bot] in #348
- deps(go): bump the go-dependencies group with 7 updates by @dependabot[bot] in #349
- bump llm-d-kv-cache-manager version by @vMaroon in #359
- fix: Rename config to `kv-cache-utilization-scorer` from `kv-cache-scorer` by @yankay in #358
- updating release issue-template by @kfswain in #361
- bump llm-d-kv-cache-manager version (v0.3.2) by @vMaroon in #365
- deps(go): bump github.com/onsi/ginkgo/v2 from 2.25.3 to 2.26.0 in the go-dependencies group by @dependabot[bot] in #368
- feat: Add a scoring plugin to distribute new groups evenly by @usize in #357
- implement PreRequest and PostResponse interface checks by @learner0810 in #372
- deps(go): bump the kubernetes group with 2 updates by @dependabot[bot] in #369
- deps(go): bump google.golang.org/grpc from 1.75.1 to 1.76.0 in the go-dependencies group by @dependabot[bot] in #374
- Supports the ResponseComplete plugin by @learner0810 in #378
- deps(actions): bump crate-ci/typos from 1.36.2 to 1.38.1 by @dependabot[bot] in #373
- Fix multi-architecture image issues with Kind by @shmuelk in #362
- feat: Moved the Routing Sidecar from its own repo to the inference-scheduler repo by @shmuelk in #379
- Upgrade to use Gateway Inference Extension 1.1.0 rc.1 by @shmuelk in #384
- deps(go): bump github.com/onsi/ginkgo/v2 from 2.26.0 to 2.27.1 in the go-dependencies group by @dependabot[bot] in #389
- Ensure that max_completion_tokens=1 in Prefill by @shmuelk in #403
- Add explanation of inference-scheduler relation to IGW/GIE by @elevran in #393
- Add test coverage to test-unit Makefile target by @carlory in #391
- Add regression tests for max_completion_tokens by @pierDipi in #411
- Makefile refactoring to minimize the number of targets by @shmuelk in #397
- feat: Add vLLM Data Parallel support to llm-d-inference-scheduler by @shmuelk in #392
- fix(scorer): prevent potential division by zero in ActiveRequest.Score by @googs1025 in #413
- Fixed wildcard targets by @shmuelk in #416
- deps(actions): bump crate-ci/typos from 1.38.1 to 1.39.0 by @dependabot[bot] in #419
- deps(go): bump github.com/onsi/ginkgo/v2 from 2.27.1 to 2.27.2 in the go-dependencies group by @dependabot[bot] in #417
- Missed change to the Go code coverage output file names in the Makefile refactoring by @shmuelk in #422
- Fix: Remove reference to the missing make target by @andreyod in #423
- deps(actions): bump actions/upload-artifact from 4 to 5 by @dependabot[bot] in #420
- deps(go): bump sigs.k8s.io/controller-runtime from 0.22.3 to 0.22.4 in the kubernetes group by @dependabot[bot] in #418
- Enhancement: return 503 instead of 502 when decode node is not ready by @Phil-OSophy-42 in #412
- Remove endpointslices from RBAC by @elevran in #424
- Fix Image Loading for Podman in E2E Tests by @hdefazio in #406
- readme meetings update by @nirrozenbaum in #427
- Fix references to the SideCar's tag by @shmuelk in #428
- Remove duplicate error logs by @hyeongyun0916 in #429
- Upgrade to istio-1.28 by @irar2 in #431
- Complete upgrade to Istio 1.28.0 by @shmuelk in #433
- Upgrade GIE dependency to 1.1.0 by @shmuelk in #435
- Remove dev from branch list in PR actions by @elevran in #434
- Added support for Data Parallel in a Disaggregated Prefill/Decode setup by @shmuelk in #432
- Remove code coverage from CI workflow by @carlory in #437
- test: Scale up and down the model server during an end to end test by @shmuelk in #354
- fix: add validation in ByLabelFactory to prevent invalid configurations by @googs1025 in #440
- deps(actions): bump golangci/golangci-lint-action from 8 to 9 by @dependabot[bot] in #444
- change lmcache connector to nixlv2 by @googs1025 in #446
- fix: Roll back automatic updates to Dockerfiles by @shmuelk in #447
- deps(go): bump golang.org/x/sync from 0.17.0 to 0.18.0 in the go-dependencies group by @dependabot[bot] in #443
- fix(profile): validate handler parameters to prevent invalid config by @googs1025 in #449
- Added chat completions preprocessing support by @guygir in #426
- docs: add integration guide for external prefill/decode workloads by @googs1025 in #451
- Define and manage PR lifecycle by @elevran in #450
- test: End to End test for Data Parallel support by @shmuelk in #442
- docs: add PD-aware examples for by-label and by-label-selector plugins by @googs1025 in #454
- deps(actions): bump crate-ci/typos from 1.39.0 to 1.39.2 by @dependabot[bot] in #459
- Add SGLang Connector for Prefill/Decode Disaggregation (migrated from llm-d-routing-sidecar#64) by @bongwoobak in #456
- deps(go): bump the kubernetes group with 4 updates by @dependabot[bot] in #460
- add unit test in scheduler plugin part(by-label, data-parallel-profile-handler, pd-profile-handler) by @googs1025 in #461
- test: Enable running the end to end tests on K8S clusters other than Kind by @shmuelk in #453
- Allow the sidecar to sample from a list of prefill host ports by @smarterclayton in #404
- fix: Fixed issues running locally 'make lint' and 'make test-unit' by @shmuelk in #464
- cleanup: Followup to Python paths fix by @shmuelk in #468
- Replace tab with spaces to avoid treating as make target by @elevran in #469
- minor refactoring of `precise-prefix-cache` scorer plugin by @vMaroon in #473
- feat: Add initial metrics and update dependencies...
v0.3.2
In addition to the changes below, this patch release includes fixes to the kv-cache-manager dependency.
What's Changed
- bump gie version to v0.5.0 by @nirrozenbaum in #256
- Fixes for Steps to Build a Kubernetes Development Environment by @dumb0002 in #259
- bump gie version to v0.5.1 rc1 (patch release) by @nirrozenbaum in #262
- added badges to readme by @nirrozenbaum in #261
- chore: bump gie version by @nirrozenbaum in #263
- Update `prefix-cache-scorer` Configuration Doc Entry by @vMaroon in #264
- Update Tokenizer Release Version by @vMaroon in #265
- #245 Followup - Makefile Installs `libzmq` Dependency by @vMaroon in #266
- deps(go): bump the go-dependencies group with 3 updates by @dependabot[bot] in #257
- Add codespell integration for spell checking by @Jooho in #221
- Initial CODEOWNERS file by @elevran in #267
- added issues templates by @nirrozenbaum in #272
- deps(actions): bump crate-ci/typos from 1.34.0 to 1.35.1 by @dependabot[bot] in #275
- small updates to documentation by @kfswain in #277
- Change CI to only create a latest tagged image on releases by @shmuelk in #278
- fix: correct shell command substitution syntax in Makefile by @yankay in #276
- add reference to writing a new plugin by @elevran in #280
- deps(actions): bump crate-ci/typos from 1.35.1 to 1.35.3 by @dependabot[bot] in #282
- deps(actions): bump actions/checkout from 4 to 5 by @dependabot[bot] in #283
- Remove non-existent labels from dependabot configuration by @elevran in #285
- chore: ⬆️ bump components gie to v0.5.1. by @yafengio in #288
- Add workflows to automate aged issues management by @anoruxylene in #289
- update load aware scorer unit tests by @jairuigou in #291
- Add build metadata to Docker image by @carlory in #292
- Add Prow GitHub Actions by @Jooho in #290
- remove setup log by @carlory in #294
- OWNERS_ALIASES is not working with current automation by @nirrozenbaum in #296
- deps(actions): bump crate-ci/typos from 1.35.3 to 1.35.4 by @dependabot[bot] in #299
- deps(actions): bump actions/checkout from 4 to 5 by @dependabot[bot] in #300
- deps(go): bump the go-dependencies group with 2 updates by @dependabot[bot] in #301
- fix: Makefile fixes for MacOS by @shmuelk in #304
- Update comment in prow-pr-automerge.yml by @elevran in #303
- Remove github CODEOWNERS by @elevran in #306
- deps(docker): bump projectquay/golang from 1.24 to 1.25 by @dependabot[bot] in #302
- Support for installation of inference-sim with kv-cache enabled by @mayabar in #305
- add tokenizer directory to simulator deployment yaml by @mayabar in #307
- [feature] Added Active-Request-Scorer by @vMaroon in #297
- deps(actions): bump crate-ci/typos from 1.35.4 to 1.35.5 by @dependabot[bot] in #316
- deps(go): bump github.com/stretchr/testify from 1.10.0 to 1.11.0 in the go-dependencies group by @dependabot[bot] in #314
- chore: Added end to end tests by @shmuelk in #310
- ByLabelSelector filter tests by @elevran in #315
- chore: drop plugin type from types and file by @yyzxw in #308
- fixed a bug where typos check is checking go.mod and go.sum by @nirrozenbaum in #321
- minor tweaks to Makefile by @nirrozenbaum in #318
- Split the `prefix-cache-scorer` plugins by @vMaroon in #323
- Makefile fixes by @vMaroon in #322
- deps(actions): bump crate-ci/typos from 1.35.5 to 1.35.7 by @dependabot[bot] in #325
- deps(go): bump the go-dependencies group with 9 updates by @dependabot[bot] in #326
- Reduce img size by cleaning dnf cache by @rawagner in #328
- added comment to stale issues by @nirrozenbaum in #327
- fixed missing dependencies in makefile by @nirrozenbaum in #330
- sync with IGW release 1.0.0-rc by @nirrozenbaum in #320
- fix: make env-dev-kind fails after sync with IGW 1.0.0-rc-3 by @shmuelk in #331
New Contributors
- @dumb0002 made their first contribution in #259
- @yafengio made their first contribution in #288
- @anoruxylene made their first contribution in #289
- @jairuigou made their first contribution in #291
- @yyzxw made their first contribution in #308
- @rawagner made their first contribution in #328
Full Changelog: v0.2.1...v0.3.2
v0.3.2-rc.1
Small fixes to kv-cache-manager required updated dependencies
v0.3.1
Small patch updating the kv-cache-manager dependency to include support in v0.3.
See the full v0.3 changes here:
What's Changed
- bump gie version to v0.5.0 by @nirrozenbaum in #256
- Fixes for Steps to Build a Kubernetes Development Environment by @dumb0002 in #259
- bump gie version to v0.5.1 rc1 (patch release) by @nirrozenbaum in #262
- added badges to readme by @nirrozenbaum in #261
- chore: bump gie version by @nirrozenbaum in #263
- Update `prefix-cache-scorer` Configuration Doc Entry by @vMaroon in #264
- Update Tokenizer Release Version by @vMaroon in #265
- #245 Followup - Makefile Installs `libzmq` Dependency by @vMaroon in #266
- deps(go): bump the go-dependencies group with 3 updates by @dependabot[bot] in #257
- Add codespell integration for spell checking by @Jooho in #221
- Initial CODEOWNERS file by @elevran in #267
- added issues templates by @nirrozenbaum in #272
- deps(actions): bump crate-ci/typos from 1.34.0 to 1.35.1 by @dependabot[bot] in #275
- small updates to documentation by @kfswain in #277
- Change CI to only create a latest tagged image on releases by @shmuelk in #278
- fix: correct shell command substitution syntax in Makefile by @yankay in #276
- add reference to writing a new plugin by @elevran in #280
- deps(actions): bump crate-ci/typos from 1.35.1 to 1.35.3 by @dependabot[bot] in #282
- deps(actions): bump actions/checkout from 4 to 5 by @dependabot[bot] in #283
- Remove non-existent labels from dependabot configuration by @elevran in #285
- chore: ⬆️ bump components gie to v0.5.1. by @yafengio in #288
- Add workflows to automate aged issues management by @anoruxylene in #289
- update load aware scorer unit tests by @jairuigou in #291
- Add build metadata to Docker image by @carlory in #292
- Add Prow GitHub Actions by @Jooho in #290
- remove setup log by @carlory in #294
- OWNERS_ALIASES is not working with current automation by @nirrozenbaum in #296
- deps(actions): bump crate-ci/typos from 1.35.3 to 1.35.4 by @dependabot[bot] in #299
- deps(actions): bump actions/checkout from 4 to 5 by @dependabot[bot] in #300
- deps(go): bump the go-dependencies group with 2 updates by @dependabot[bot] in #301
- fix: Makefile fixes for MacOS by @shmuelk in #304
- Update comment in prow-pr-automerge.yml by @elevran in #303
- Remove github CODEOWNERS by @elevran in #306
- deps(docker): bump projectquay/golang from 1.24 to 1.25 by @dependabot[bot] in #302
- Support for installation of inference-sim with kv-cache enabled by @mayabar in #305
- add tokenizer directory to simulator deployment yaml by @mayabar in #307
- [feature] Added Active-Request-Scorer by @vMaroon in #297
- deps(actions): bump crate-ci/typos from 1.35.4 to 1.35.5 by @dependabot[bot] in #316
- deps(go): bump github.com/stretchr/testify from 1.10.0 to 1.11.0 in the go-dependencies group by @dependabot[bot] in #314
- chore: Added end to end tests by @shmuelk in #310
- ByLabelSelector filter tests by @elevran in #315
- chore: drop plugin type from types and file by @yyzxw in #308
- fixed a bug where typos check is checking go.mod and go.sum by @nirrozenbaum in #321
- minor tweaks to Makefile by @nirrozenbaum in #318
- Split the `prefix-cache-scorer` plugins by @vMaroon in #323
- Makefile fixes by @vMaroon in #322
- deps(actions): bump crate-ci/typos from 1.35.5 to 1.35.7 by @dependabot[bot] in #325
- deps(go): bump the go-dependencies group with 9 updates by @dependabot[bot] in #326
- Reduce img size by cleaning dnf cache by @rawagner in #328
- added comment to stale issues by @nirrozenbaum in #327
- fixed missing dependencies in makefile by @nirrozenbaum in #330
- sync with IGW release 1.0.0-rc by @nirrozenbaum in #320
- fix: make env-dev-kind fails after sync with IGW 1.0.0-rc-3 by @shmuelk in #331
New Contributors
- @dumb0002 made their first contribution in #259
- @kfswain made their first contribution in #277
- @yafengio made their first contribution in #288
- @anoruxylene made their first contribution in #289
- @jairuigou made their first contribution in #291
- @yyzxw made their first contribution in #308
- @rawagner made their first contribution in #328
Full Changelog: v0.2.1...v0.3.1
v0.3.1-rc.1
Full Changelog: v0.3.0...v0.3.1-rc.1
v0.3.0
Image pull example: docker pull ghcr.io/llm-d/llm-d-inference-scheduler:v0.3.0
What's Changed
- bump gie version to v0.5.0 by @nirrozenbaum in #256
- Fixes for Steps to Build a Kubernetes Development Environment by @dumb0002 in #259
- bump gie version to v0.5.1 rc1 (patch release) by @nirrozenbaum in #262
- added badges to readme by @nirrozenbaum in #261
- chore: bump gie version by @nirrozenbaum in #263
- Update `prefix-cache-scorer` Configuration Doc Entry by @vMaroon in #264
- Update Tokenizer Release Version by @vMaroon in #265
- #245 Followup - Makefile Installs `libzmq` Dependency by @vMaroon in #266
- deps(go): bump the go-dependencies group with 3 updates by @dependabot[bot] in #257
- Add codespell integration for spell checking by @Jooho in #221
- Initial CODEOWNERS file by @elevran in #267
- added issues templates by @nirrozenbaum in #272
- deps(actions): bump crate-ci/typos from 1.34.0 to 1.35.1 by @dependabot[bot] in #275
- small updates to documentation by @kfswain in #277
- Change CI to only create a latest tagged image on releases by @shmuelk in #278
- fix: correct shell command substitution syntax in Makefile by @yankay in #276
- add reference to writing a new plugin by @elevran in #280
- deps(actions): bump crate-ci/typos from 1.35.1 to 1.35.3 by @dependabot[bot] in #282
- deps(actions): bump actions/checkout from 4 to 5 by @dependabot[bot] in #283
- Remove non-existent labels from dependabot configuration by @elevran in #285
- chore: ⬆️ bump components gie to v0.5.1. by @yafengio in #288
- Add workflows to automate aged issues management by @anoruxylene in #289
- update load aware scorer unit tests by @jairuigou in #291
- Add build metadata to Docker image by @carlory in #292
- Add Prow GitHub Actions by @Jooho in #290
- remove setup log by @carlory in #294
- OWNERS_ALIASES is not working with current automation by @nirrozenbaum in #296
- deps(actions): bump crate-ci/typos from 1.35.3 to 1.35.4 by @dependabot[bot] in #299
- deps(actions): bump actions/checkout from 4 to 5 by @dependabot[bot] in #300
- deps(go): bump the go-dependencies group with 2 updates by @dependabot[bot] in #301
- fix: Makefile fixes for MacOS by @shmuelk in #304
- Update comment in prow-pr-automerge.yml by @elevran in #303
- Remove github CODEOWNERS by @elevran in #306
- deps(docker): bump projectquay/golang from 1.24 to 1.25 by @dependabot[bot] in #302
- Support for installation of inference-sim with kv-cache enabled by @mayabar in #305
- add tokenizer directory to simulator deployment yaml by @mayabar in #307
- [feature] Added Active-Request-Scorer by @vMaroon in #297
- deps(actions): bump crate-ci/typos from 1.35.4 to 1.35.5 by @dependabot[bot] in #316
- deps(go): bump github.com/stretchr/testify from 1.10.0 to 1.11.0 in the go-dependencies group by @dependabot[bot] in #314
- chore: Added end to end tests by @shmuelk in #310
- ByLabelSelector filter tests by @elevran in #315
- chore: drop plugin type from types and file by @yyzxw in #308
- fixed a bug where typos check is checking go.mod and go.sum by @nirrozenbaum in #321
- minor tweaks to Makefile by @nirrozenbaum in #318
- Split the `prefix-cache-scorer` plugins by @vMaroon in #323
- Makefile fixes by @vMaroon in #322
- deps(actions): bump crate-ci/typos from 1.35.5 to 1.35.7 by @dependabot[bot] in #325
- deps(go): bump the go-dependencies group with 9 updates by @dependabot[bot] in #326
- Reduce img size by cleaning dnf cache by @rawagner in #328
- added comment to stale issues by @nirrozenbaum in #327
- fixed missing dependencies in makefile by @nirrozenbaum in #330
- sync with IGW release 1.0.0-rc by @nirrozenbaum in #320
- fix: make env-dev-kind fails after sync with IGW 1.0.0-rc-3 by @shmuelk in #331
New Contributors
- @dumb0002 made their first contribution in #259
- @kfswain made their first contribution in #277
- @yankay made their first contribution in #276
- @yafengio made their first contribution in #288
- @anoruxylene made their first contribution in #289
- @jairuigou made their first contribution in #291
- @yyzxw made their first contribution in #308
- @rawagner made their first contribution in #328
Full Changelog: v0.2.1...v0.3.0
v0.3.0-rc.2
Image is available here: docker pull ghcr.io/llm-d/llm-d-inference-scheduler:v0.3.0-rc.2