Overview
GuideLLM v0.6.0 is a feature release that adds multi-turn benchmarking, basic Responses API support, TerraTorch geospatial model support, and an in-process vLLM Python backend, along with bug fixes.
To get started, install with:
pip install guidellm[recommended]==0.6.0
Or from source with:
pip install 'guidellm[recommended] @ git+https://github.com/vllm-project/guidellm.git'@v0.6.0
Compatibility Notes
- Python: 3.10–3.13
- OS: Linux, macOS
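The supported range in the compatibility notes can be checked before installing. The following is a minimal sketch and not part of GuideLLM: the helper name `is_supported` and the mapping of operating systems to `platform.system()` values are assumptions made for illustration.

```python
import platform
import sys

# Bounds taken from the compatibility notes (Python 3.10 through 3.13).
MIN_PYTHON = (3, 10)
MAX_PYTHON = (3, 13)
# platform.system() reports "Linux" on Linux and "Darwin" on macOS.
SUPPORTED_SYSTEMS = {"Linux", "Darwin"}


def is_supported(version: tuple[int, int], system: str) -> bool:
    """Return True if the (major, minor) version and OS name are in the stated range."""
    return MIN_PYTHON <= version <= MAX_PYTHON and system in SUPPORTED_SYSTEMS


if __name__ == "__main__":
    current = (sys.version_info.major, sys.version_info.minor)
    ok = is_supported(current, platform.system())
    status = "supported" if ok else "outside the stated range"
    print(f"Python {current[0]}.{current[1]} on {platform.system()}: {status}")
```

Running the file directly prints whether the current interpreter and operating system fall inside the stated range.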
What's New:
- Added basic Responses API support: tool calling support will be added later
- Added multi-turn support for both datasets and synthetic data
- Added vLLM Python (in-process) backend
- Added TerraTorch Geospatial model support
What's Fixed:
- Allow disabling vLLM-specific body options in HTTP backend
- Fix --sample-requests to limit sampling in output
- Fix HTML references in HTML report
- Fixed container image HOME permissions for OpenShift
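The first fix in this list concerns dropping default options from the HTTP request body. These notes do not describe how GuideLLM exposes this, so the sketch below only illustrates one common convention, assumed here: an override value of `None` removes the key from the outgoing body. The function name `apply_body_overrides` and the example keys are hypothetical and are not taken from GuideLLM.

```python
def apply_body_overrides(defaults: dict, overrides: dict) -> dict:
    """Merge overrides into defaults; an override of None removes the key."""
    body = dict(defaults)  # copy so the defaults are left untouched
    for key, value in overrides.items():
        if value is None:
            body.pop(key, None)  # drop the option entirely
        else:
            body[key] = value
    return body


# Illustrative defaults; "ignore_eos" stands in for a server-specific option.
defaults = {"stream": True, "ignore_eos": True, "max_tokens": 128}
body = apply_body_overrides(defaults, {"ignore_eos": None, "max_tokens": 256})
```

With these inputs, `body` keeps `stream`, takes the new `max_tokens` value, and no longer contains `ignore_eos`, which matters for servers that reject unknown fields.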
Change Log
Features
- Instant ttft oversaturation by @ushaket in #607
- vLLM Python Backend by @jaredoconnell in #596
- Multiturn Benchmarking by @sjmonson in #590
- Add turn and conversation trackers to the request by @sjmonson in #649
- Basic Responses API Support by @jaredoconnell in #655
- Add support for TerraTorch Geospatial models served via the vLLM /pooling endpoint by @mgazz in #610
Internal refactoring & cleanup
- Fix and improve the mock server by @jaredoconnell in #640
- Import utils from sub-submodules by @sjmonson in #644
- Move request formatting to backend by @sjmonson in #478
- Drop median from throughput metrics in console by @sjmonson in #617
- Fix HTML on main and disable by default by @sjmonson in #638
- Drop vLLM extras group by @sjmonson in #635
- Improve environment variables warning and validation cleanup by @sjmonson in #654
- Pass mp context to strategy by @sjmonson in #651
- Cleanup init by @sjmonson in #647
Fixes
- fix(cli): validate --output-path against --output-dir by @aiwantaozi in #561
- Fix the guidellm benchmark --sample-requests command line option by @natoscott in #591
- Drop various deprecated settings and remove the default OpenAI request timeout by @sjmonson in #589
- Fix /v1/chat/completions formatting by @sjmonson in #595
- Containerfile: ensure that HOME can be used by any user ID by @kpouget in #601
- Fix JSON serialization for binary request payloads via base64 bytes config by @ushaket in #612
- Move html template source location to raw github by @sjmonson in #629
- Fix file extension not being sent to output handler by @jaredoconnell in #639
- Check if deserialization path is valid safely by @sjmonson in #659
- Support removing keys from HTTP request bodies by @sjmonson in #661
- Replace line iter with bytes to lines wrapper by @sjmonson in #663
- Revert back to iterating over lines by @sjmonson in #680
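The binary-payload fix listed above (#612) addresses a general problem: Python's `json` module cannot serialize `bytes` directly. The following is a minimal sketch of the usual base64 workaround, not the code from that PR. The helper name `encode_bytes` and the payload fields are assumptions for illustration.

```python
import base64
import json


def encode_bytes(obj):
    """json.dumps fallback: represent bytes as a base64 ASCII string."""
    if isinstance(obj, (bytes, bytearray)):
        return base64.b64encode(bytes(obj)).decode("ascii")
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


# Hypothetical request payload carrying raw binary data.
payload = {"model": "example-model", "data": b"\x00\x01\xff"}
serialized = json.dumps(payload, default=encode_bytes)

# Decoding the base64 string recovers the original bytes.
restored = base64.b64decode(json.loads(serialized)["data"])
```

The `default` hook is only called for objects the encoder cannot handle, so ordinary strings and numbers in the payload are unaffected.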
CI environment
- Fix multiple main CI failures by @sjmonson in #578
- Replace existing issue templates with form versions by @sjmonson in #586
- Add merge policies by @sjmonson in #630
- Run format job in CI by @sjmonson in #625
- Drop RC jobs and nightly PyPI publish by @sjmonson in #632
- Mark requeue test as xfail due to uvloop bug by @sjmonson in #652
- Lock all GitHub Actions to SHA by @dbutenhof in #666
- Identify action versions by @dbutenhof in #679
- Remove "uv" ecosystem from yaml by @dbutenhof in #681
Documentation
- Add multimodal benchmarking usage docs by @markurtz in #568
- Add data parameter to benchmark command in README by @S1ro1 in #616
- Add detail in benchmark profile documentation by @dbutenhof in #619
- docs: add documentation for passing sampling parameters via --backend-kwargs by @cemigo114 in #626
- docs: Fixing a broken link of docs/guides/outputs.md. by @theodor2311 in #642
Dependency updates
- Fixup pylock after dependabot PRs by @sjmonson in #553
- Fix for Dependabot action by @sjmonson in #554
- Drop pylock by @sjmonson in #555
- Bump virtualenv from 20.35.4 to 20.36.1 by @dependabot[bot] in #544
- Bump urllib3 from 2.5.0 to 2.6.3 by @dependabot[bot] in #545
- Bump aiohttp from 3.13.2 to 3.13.3 by @dependabot[bot] in #547
- Bump protobuf from 6.33.1 to 6.33.5 by @dependabot[bot] in #580
- Bump pillow from 12.0.0 to 12.1.1 by @dependabot[bot] in #593
- Bump torchcodec (and torch) by @sjmonson in #614
- Bump transformers version in lock by @sjmonson in #628
- Bump orjson from 3.11.4 to 3.11.6 by @dependabot[bot] in #631
- Bump ujson from 5.11.0 to 5.12.0 by @dependabot[bot] in #643
- Require datasets 4.1.0 by @dbutenhof in #650
- Bump requests from 2.32.5 to 2.33.0 by @dependabot[bot] in #657
New Contributors
- @aiwantaozi made their first contribution in #561
- @kpouget made their first contribution in #601
- @ushaket made their first contribution in #612
- @S1ro1 made their first contribution in #616
- @dbutenhof made their first contribution in #619
- @cemigo114 made their first contribution in #626
- @theodor2311 made their first contribution in #642
- @mgazz made their first contribution in #610
Full Changelog: v0.5.3...v0.6.0