What's Changed
- Revert "Replace EndpointSlice reconciler with pod list backed by informers" by @kfswain in #301
- Fixing small linter complaints by @kfswain in #302
- In hermetic test, add additional test cases and move k8sClient object creation so it's called once for all tests by @BenjaminBraunDev in #278
- [Metrics] Add average kv cache and waiting queue size metrics for inference pool by @JeffLuoo in #304
- Move getting started guide to docs site by @kfswain in #308
- site-source: Fix 'Bakcground' misspell in API concepts page by @timflannagan in #309
- Mkdocs fixes by @kfswain in #314
- Bump google.golang.org/protobuf from 1.36.4 to 1.36.5 by @dependabot in #315
- Remove gci linter by @ahg-g in #317
- fix: adds ErrorNotFound Handling for InferenceModel Reconciler by @danehans in #286
- site-src: Replace k8sgateway with kgateway & fix spelling in roles-and-personas.md by @timflannagan in #311
- Fix: Go Mod Imports by @danehans in #318
- Updates EPP Deployment and Release Doc/Script by @danehans in #322
- Delete InferenceModels from the datastore when deletionTimestamp is set by @ahg-g in #319
- Actually init logging using Zap by @tchap in #267
- Remove fatal log calls in executable code by @tchap in #265
- feat: Adds e2e test script by @danehans in #294
- Replacing endpointSlice Reconciler with a direct Pod Reconciler by @kfswain in #300
- Move manager from runserver to main by @tchap in #331
- feat: adds image-load and kind-load Make targets by @danehans in #288
- Use structured logging by @tchap in #330
- Add TLS support with self-signed certificate. by @ahg-g in #335
- Lora syncer docs by @coolkp in #320
- Fix cloudbuild rule for the LoRA syncer image by @ahg-g in #339
- fix: Corrects release branch naming by @danehans in #333
- Use contextual logging by @tchap in #337
- Bump the kubernetes group with 6 updates by @dependabot in #351
- Bump sigs.k8s.io/controller-runtime from 0.20.1 to 0.20.2 by @dependabot in #352
- Fixes to the adapter rollouts guide by @ahg-g in #338
- Consolidating all storage behind datastore by @ahg-g in #350
- fixed a typo - close a bash markdown by @nirrozenbaum in #364
- Added controller and datastore package by @hzxuzhonghu in #363
- Move pkg/ext-proc -> cmd/ext-proc by @tchap in #368
- added license header to all .go files by @nirrozenbaum in #370
- fix inference extension not correctly scraping pod metrics by @Kuromesi in #366
- Move pkg/manifests -> config/manifests by @tchap in #371
- [Metrics] Add request error metrics by @JeffLuoo in #269
- Rename pkg/ext-proc to pkg/epp by @tchap in #372
- Move pkg/ext-proc/metrics/README.md -> site-src/guides/metrics.md by @courageJ in #373
- Defining an outer metadata struct as part of the extproc endpoint picking protocol by @ahg-g in #377
- Draft a revised README.md by @smarterclayton in #374
- Add README.md file to the epp pkg by @ahg-g in #386
- Split the proxy and model server protocols for easy reference by @ahg-g in #387
- [Metric] Add inference pool and request error metrics to the dashboard by @JeffLuoo in #389
- Switch to gcr.io/distroless/static:nonroot base image by @ahg-g in #384
- fix context canceled recv error handling by @Kuromesi in #390
- Added endpoint picker diagram by @ahg-g in #396
- Added v1alpha2 api by @hzxuzhonghu in #398
- Adding a roadmap to README by @kfswain in #400
- Bump github.com/prometheus/client_golang from 1.20.5 to 1.21.0 by @dependabot in #402
- Bump github.com/google/go-cmp from 0.6.0 to 0.7.0 by @dependabot in #403
- updated logging in inferencepool reconciler by @nirrozenbaum in #399
- added inferencemodel predicate + minor changes in logging by @nirrozenbaum in #397
- Syncing getting started guide all to main by @kfswain in #410
- fixed typo in filepath in website guide page by @nirrozenbaum in #412
- Fix InferenceModel deletion logic by @ahg-g in #393
- Updated yamls to use v1alpha2 by @ahg-g in #420
- Rm v1alpha1 api by @hzxuzhonghu in #405
- removed the EndpointPickerNotHealthy condition from pool status by @ahg-g in #421
- [Metrics] Add metrics validation in integration test by @JeffLuoo in #413
- predicate follow up PR to remove the check from Reconcile func by @nirrozenbaum in #418
- Misc cleanup by @hzxuzhonghu in #428
- fix metric scrape port not updated when inference pool target port updated by @Kuromesi in #417
- make ModelName immutable and fix model weight by @hzxuzhonghu in #427
- Consistent validation for reference types by @robscott in #430
- create pods during integration tests by @Kuromesi in #431
- fix typos by @nirrozenbaum in #433
- Adding Accepted and ResolvedRefs conditions to InferencePool by @robscott in #446
- Add code for Envoy extension that supports body-to-header translation by @rramkumar1 in #355
- Add Makefile + cloudbuild configs for body-based routing extension by @rramkumar1 in #442
- added cpu based example by @nirrozenbaum in #436
- updated cleanup section in quickstart by @nirrozenbaum in #448
- scheduling changes for lora affinity load balancing by @kaushikmitr in #423
- Fixing default status on InferencePool by @robscott in #449
- Use server side namespace filter by @hzxuzhonghu in #429
- fixed filepath that points to gpu based model server deployment in a few places by @nirrozenbaum in #451
- Add library for generating self-signed cert by @rramkumar1 in #453
- Support full duplex streaming by @kfswain in #450
- Renaming conditions and reasons used in InferencePool status by @robscott in #454
- Move integration and e2e tests for epp into epp-specific directories by @rramkumar1 in #457
- Add initial integration test for body-based routing extension by @rramkumar1 in #458
- Each pod has independent loops to refresh metrics by @liu-cong in #460
- fixed broken link in README by @nirrozenbaum in #467
- fixed minimal requirement for envoy version by @nirrozenbaum in #466
- Bump github.com/onsi/ginkgo/v2 from 2.22.2 to 2.23.0 by @dependabot in #473
- Bump sigs.k8s.io/controller-runtime from 0.20.2 to 0.20.3 by @dependabot in #470
- Bump google.golang.org/grpc from 1.70.0 to 1.71.0 by @dependabot in #471
- Bump github.com/prometheus/client_golang from 1.21.0 to 1.21.1 by @dependabot in #474
- Bump sigs.k8s.io/structured-merge-diff/v4 from 4.5.0 to 4.6.0 by @dependabot in #472
- [BBR] Fix bug where request trailers were not being handled by @rramkumar1 in #477
- Add the base model to InferenceModel sample manifest by @liu-cong in #479
- Fix metrics debug log; change metrics client log level to reduce spam by @liu-cong in #478
- Add support for OpenAI API streaming protocol by @kfswain in #469
New Contributors
- @timflannagan made their first contribution in #309
- @nirrozenbaum made their first contribution in #364
- @hzxuzhonghu made their first contribution in #363
- @Kuromesi made their first contribution in #366
Full Changelog: v0.1.0...v0.2.0