Release GuideLLM v0.6.0 · vllm-project/guidellm

Overview

GuideLLM v0.6.0 is a feature release that adds multi-turn benchmarking, basic Responses API support, TerraTorch geospatial model support, and an in-process vLLM Python backend, along with bug fixes.

To get started, install with:

pip install 'guidellm[recommended]==0.6.0'

Or from source with:

pip install 'guidellm[recommended] @ git+https://github.com/vllm-project/guidellm.git@v0.6.0'
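
After installing, one way to confirm which release is active in the current environment is to query the installed distribution metadata. The following is a minimal sketch, assuming the distribution is registered under the name `guidellm` as in the pip commands above; the helper name `installed_version` is hypothetical and not part of GuideLLM.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "guidellm") -> Optional[str]:
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("guidellm is not installed in this environment")
    else:
        print(f"guidellm {found} is installed")
```

A source install may report a version string that differs from the PyPI release, so the sketch only prints what it finds rather than comparing against a fixed value.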

Compatibility Notes

Python: 3.10–3.13
OS: Linux, macOS
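
The supported Python range can be checked before installing. This is a small sketch that only encodes the 3.10–3.13 range from the notes above; the function name `python_supported` and the two constants are hypothetical and do not come from GuideLLM.

```python
import sys

# Supported range taken from the compatibility notes: Python 3.10 through 3.13.
MIN_SUPPORTED = (3, 10)
MAX_SUPPORTED = (3, 13)


def python_supported(version_info=None) -> bool:
    """Return True if the (major, minor) pair lies within the supported range."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only major and minor; patch releases do not affect the range.
    major_minor = (version_info[0], version_info[1])
    return MIN_SUPPORTED <= major_minor <= MAX_SUPPORTED


if __name__ == "__main__":
    current = f"{sys.version_info[0]}.{sys.version_info[1]}"
    status = "supported" if python_supported() else "outside the supported range"
    print(f"Python {current} is {status}")
```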

What's New:

Added basic Responses API support; tool calling support will be added later
Added multi-turn support for both datasets and synthetic data
Added vLLM Python (in-process) backend
Added TerraTorch geospatial model support

What's Fixed:

Allow disabling vLLM-specific body options in HTTP backend
Fix --sample-requests to limit sampling in output
Fix HTML references in HTML report
Fix container image HOME permissions for OpenShift

Change Log

Features

Instant ttft oversaturation by @ushaket in #607
vLLM Python Backend by @jaredoconnell in #596
Multiturn Benchmarking by @sjmonson in #590
Add turn and conversation trackers to the request by @sjmonson in #649
Basic Responses API Support by @jaredoconnell in #655
Add support for TerraTorch Geospatial models served via the vLLM /pooling endpoint by @mgazz in #610

Internal refactoring & cleanup

Fix and improve the mock server by @jaredoconnell in #640
Import utils from sub-submodules by @sjmonson in #644
Move request formatting to backend by @sjmonson in #478
Drop median from throughput metrics in console by @sjmonson in #617
Fix HTML on main and disable by default by @sjmonson in #638
Drop vLLM extras group by @sjmonson in #635
Improve environment variables warning and validation cleanup by @sjmonson in #654
Pass mp context to strategy by @sjmonson in #651
Cleanup init by @sjmonson in #647

Fixes

fix(cli): validate --output-path against --output-dir by @aiwantaozi in #561
Fix the guidellm benchmark --sample-requests command line option by @natoscott in #591
Drop various deprecated settings and remove the default OpenAI request timeout by @sjmonson in #589
Fix /v1/chat/completions formatting by @sjmonson in #595
Containerfile: ensure that HOME can be used by any user ID by @kpouget in #601
Fix JSON serialization for binary request payloads via base64 bytes config by @ushaket in #612
Move html template source location to raw github by @sjmonson in #629
Fix file extension not being sent to output handler by @jaredoconnell in #639
Check if deserialization path is valid safely by @sjmonson in #659
Support removing keys from HTTP request bodies by @sjmonson in #661
Replace line iter with bytes to lines wrapper by @sjmonson in #663
Revert back to iterating over lines by @sjmonson in #680

CI environment

Fix multiple main CI failures by @sjmonson in #578
Replace existing issue templates with form versions by @sjmonson in #586
Add merge policies by @sjmonson in #630
Run format job in CI by @sjmonson in #625
Drop RC jobs and nightly PyPI publish by @sjmonson in #632
Mark requeue test as xfail due to uvloop bug by @sjmonson in #652
Lock all GitHub Actions to SHA by @dbutenhof in #666
Identify action versions by @dbutenhof in #679
Remove "uv" ecosystem from yaml by @dbutenhof in #681

Documentation

Add multimodal benchmarking usage docs by @markurtz in #568
Add data parameter to benchmark command in README by @S1ro1 in #616
Add detail in benchmark profile documentation by @dbutenhof in #619
docs: add documentation for passing sampling parameters via --backend-kwargs by @cemigo114 in #626
docs: Fixing a broken link of docs/guides/outputs.md. by @theodor2311 in #642

Dependency updates

Fixup pylock after dependabot PRs by @sjmonson in #553
Fix for Dependabot action by @sjmonson in #554
Drop pylock by @sjmonson in #555
Bump virtualenv from 20.35.4 to 20.36.1 by @dependabot[bot] in #544
Bump urllib3 from 2.5.0 to 2.6.3 by @dependabot[bot] in #545
Bump aiohttp from 3.13.2 to 3.13.3 by @dependabot[bot] in #547
Bump protobuf from 6.33.1 to 6.33.5 by @dependabot[bot] in #580
Bump pillow from 12.0.0 to 12.1.1 by @dependabot[bot] in #593
Bump torchcodec (and torch) by @sjmonson in #614
Bump transformers version in lock by @sjmonson in #628
Bump orjson from 3.11.4 to 3.11.6 by @dependabot[bot] in #631
Bump ujson from 5.11.0 to 5.12.0 by @dependabot[bot] in #643
Require datasets 4.1.0 by @dbutenhof in #650
Bump requests from 2.32.5 to 2.33.0 by @dependabot[bot] in #657

New Contributors

@aiwantaozi made their first contribution in #561
@kpouget made their first contribution in #601
@ushaket made their first contribution in #612
@S1ro1 made their first contribution in #616
@dbutenhof made their first contribution in #619
@cemigo114 made their first contribution in #626
@theodor2311 made their first contribution in #642
@mgazz made their first contribution in #610

Full Changelog: v0.5.3...v0.6.0
