This release of GuideLLM focuses on finalizing the CI/CD pipelines for full automation, expanding documentation for easier access, enhancing the user experience through a web-based report UI, and adding backend support for benchmarking across different hardware configurations.

## Key Features

- **CI/CD Finalization:** Finalize and expand the GitHub Actions CI/CD pipelines to enable automated builds, releases, testing, and quality assurance.
- **Documentation Expansion:** Expand the documentation and host it on a dedicated webpage, covering the CLI, examples, and architecture for easy discovery.
- **GuideLLM HTML Report UI:** Include an HTML report UI for easier visualization and consumption of benchmark results.
- **vLLM Python Backend Integration:** Integrate the vLLM backend for direct benchmarking, including system hardware reporting (see the sketch at the end of this plan).
- **Standard Dataset Profiles:** Standardized dataset profiles make it easy to run inference performance benchmarks for your expected token input/output profiles across key LLM use cases (see the sketch at the end of this plan).
- **Transformers/Compressed Tensors Backend Support:** Add benchmarking support for the Transformers and compressed-tensors backends with detailed hardware reporting.
- **vLLM OpenAI Server Expansion:** Expand OpenAI server hardware querying to surface system hardware and model specifications.
- **CLI Output Format Enhancements:** Expand and simplify CLI output format options, with support for CSV and more consistent reporting.
- **Dataset Analysis Pathways:** Enable detailed analysis of one or more datasets within the GuideLLM framework.
- **Model Analysis Pathways:** Add support for analyzing one or more models, including accuracy evaluations and detailed reports.
- **Accuracy Evaluation Enablement:** Add infrastructure for supporting common accuracy evaluation pathways, initially targeting the LM Eval Harness (see the sketch at the end of this plan).
- **Single-Loop Dataset Benchmarks:** Add support for looping through a dataset exactly once for a given benchmark.
- **Benchmark Warmup:** Add support for warmup runs that execute before performance measurements are counted (a combined warmup/single-loop sketch appears at the end of this plan).

## Expected Improvements

- **End-to-End Testing Expansion:** Enable and expand end-to-end testing for benchmarking workflows.
- **Integration Testing:** Expand integration tests to ensure seamless performance across the various backend and CLI pathways.

## Expected Bug Fixes

- https://github.com/neuralmagic/guidellm/issues/34
- https://github.com/neuralmagic/guidellm/issues/38

## Milestones & Timeline

- Development: Sept 01, 2024 - Sept 30, 2024
- QA: Sept 30, 2024 - TBD
- Feature Freeze: ~Sept 30, 2024
- Documentation Finalization: ~Sept 30, 2024
- Release: ~Sept 30, 2024

## Testing Requirements

- **Unit Tests:** All newly implemented features must have accompanying unit tests that ensure full coverage. Code coverage should remain at 85% or higher.
- **Integration Tests:** Ensure all integrations with vLLM, DeepSparse, and other backends are fully tested, covering edge cases and normal workflows.
- **End-to-End (E2E) Tests:** Run complete E2E tests for all CLI workflows, including benchmarks, report generation, dataset/model analysis, and output formats.
- **Manual Testing:** QA must conduct manual testing on all core features, including the new HTML report UI and dataset analysis workflows, ensuring usability and functionality.

## Documentation Requirements

- Docs site released.
- Supporting docs/guides for new features, including model analysis, dataset analysis, accuracy evals, output formats, the HTML report, and CI/CD flows.
- Docs expansion with a CLI guide, an examples guide, architecture documentation, and API docs.
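The sketches below are illustrative only, not committed APIs. First, the vLLM Python backend integration: a minimal sketch of the offline vLLM pathway such a backend would likely wrap. `LLM` and `SamplingParams` are vLLM's public Python API; using them as a GuideLLM backend is this sketch's assumption, and the model name is a small placeholder.

```python
# Minimal sketch of the direct vLLM pathway the Python backend would wrap.
# LLM and SamplingParams are vLLM's public offline API; the model is a
# small placeholder chosen only for illustration.
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")
params = SamplingParams(temperature=0.0, max_tokens=128)

outputs = llm.generate(["Explain KV caching in one sentence."], params)
for output in outputs:
    print(output.outputs[0].text)
```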
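For the standard dataset profiles, a plain-data sketch of what a profile could capture. Every field name and value below is hypothetical, chosen only to illustrate the kind of token input/output shapes the feature is meant to standardize across key LLM use cases.

```python
# Hypothetical profile shapes; nothing below is a confirmed GuideLLM
# schema. Each profile pins expected token input/output sizes for a key
# LLM use case so benchmark results are comparable across runs.
STANDARD_PROFILES = {
    "chat": {"prompt_tokens": 512, "generated_tokens": 256},
    "rag": {"prompt_tokens": 4096, "generated_tokens": 512},  # long retrieved context
    "summarization": {"prompt_tokens": 2048, "generated_tokens": 256},
    "code_generation": {"prompt_tokens": 1024, "generated_tokens": 512},
}
```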
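For accuracy evaluation enablement, a sketch of the LM Eval Harness entry point the new infrastructure targets. `lm_eval.simple_evaluate` is the harness's real top-level API (lm-evaluation-harness 0.4+); the model and task choices are placeholders, and any GuideLLM wiring around this call is an assumption.

```python
# Sketch of invoking the LM Eval Harness directly; simple_evaluate is the
# harness's documented entry point. Model and task are small placeholders.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                                      # HuggingFace model backend
    model_args="pretrained=EleutherAI/pythia-160m",
    tasks=["hellaswag"],
    limit=16,                                        # small slice for a smoke test
)
print(results["results"]["hellaswag"])
```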
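Finally, a combined sketch of the benchmark warmup and single-loop dataset semantics. The `send_request` callable and loop structure are hypothetical; the point is only the intended behavior: warmup requests run first and are discarded, then the dataset is traversed exactly once while timing.

```python
import time

def run_benchmark(send_request, dataset, warmup=5):
    """Hypothetical sketch: untimed warmup, then one timed pass."""
    for item in dataset[:warmup]:        # warmup runs, excluded from results
        send_request(item)
    latencies = []
    for item in dataset:                 # single loop over the dataset
        start = time.perf_counter()
        send_request(item)
        latencies.append(time.perf_counter() - start)
    return latencies
```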
