AIPerf Release Notes

Summary

AIPerf v0.2.0 introduces time-based benchmarking with configurable grace periods and request cancellation capabilities. The release adds advanced metrics including goodput measurement for SLO compliance, GPU telemetry, and inter-chunk latency tracking.

New Features

Time-Based Benchmarking

Time-based benchmarking support - Run benchmarks for a specified duration with configurable grace periods for more realistic testing scenarios
Benchmark grace period - Added grace period functionality to allow for proper warmup and cooldown phases during benchmarking
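
As a rough illustration of how a duration-bound run with a grace period can behave, the sketch below issues requests until a deadline, waits up to a grace period for in-flight requests, and cancels whatever is left. This is one possible reading of the feature, not AIPerf's implementation: `run_for_duration`, `send_request`, and `interval_s` are hypothetical names, and the exact semantics of AIPerf's grace period are not described in these notes.

```python
import asyncio
import time


async def run_for_duration(send_request, duration_s, grace_period_s, interval_s=0.01):
    """Hypothetical helper: time-bound load loop with a drain phase.

    send_request is any coroutine function that performs one request.
    """
    deadline = time.monotonic() + duration_s
    tasks = []

    # Phase 1: start a new request every interval_s until the deadline.
    while time.monotonic() < deadline:
        tasks.append(asyncio.create_task(send_request()))
        await asyncio.sleep(interval_s)

    # Phase 2: give in-flight requests up to grace_period_s to finish.
    pending = set()
    if tasks:
        _, pending = await asyncio.wait(tasks, timeout=grace_period_s)

    # Phase 3: cancel anything still running once the grace period is over.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)

    completed = sum(
        1 for t in tasks
        if t.done() and not t.cancelled() and t.exception() is None
    )
    return {"issued": len(tasks), "completed": completed, "cancelled": len(pending)}
```

Calling `asyncio.run(run_for_duration(my_request, 60, 10))` with a coroutine function of your own would run for about 60 seconds and then allow up to 10 more seconds for outstanding requests to finish.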

Request Management & Control

Request cancellation - Added ability to cancel requests during benchmarking to test timeout behavior and service resilience
Fixed-schedule for Mooncake traces - Enhanced trace replay with fixed-schedule detection and support for non-fixed-schedule trace formats
Request rate with concurrency limits - Added ability to limit HTTP connections and control request concurrency for more realistic load testing

Advanced Metrics & Monitoring

Goodput metric - Added a goodput metric that measures the throughput of requests meeting user-defined SLOs, with an accompanying tutorial
GPU Telemetry - Integrated GPU monitoring and telemetry collection for comprehensive performance analysis
Inter-chunk latency metric - Added inter-chunk latency tracking that keeps the raw list of values for detailed streaming performance analysis
Total ISL/OSL metrics - Added total input/output sequence length metrics with improved CSV/JSON export support
Per-record metrics export - Enhanced profile export with per-record metrics in profile_export.jsonl
Mixed ISL/OSL distributions - Support for mixed input/output sequence length distributions in benchmarking
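
To make the goodput and inter-chunk latency metrics above concrete, here is a minimal sketch of how such values can be derived from per-request timestamps. It is an illustration under stated assumptions, not AIPerf's code: the `StreamedRequest` record, the function names, and the two SLO thresholds are hypothetical, and AIPerf's actual SLO options and aggregation may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StreamedRequest:
    """One streamed response; field names are illustrative, not AIPerf's schema."""
    start_ms: float               # when the request was sent
    chunk_times_ms: List[float]   # arrival time of each streamed chunk


def time_to_first_chunk_ms(req: StreamedRequest) -> float:
    return req.chunk_times_ms[0] - req.start_ms


def inter_chunk_latencies_ms(req: StreamedRequest) -> List[float]:
    # Raw list of gaps between consecutive chunks, one value per gap.
    times = req.chunk_times_ms
    return [later - earlier for earlier, later in zip(times, times[1:])]


def meets_slo(req: StreamedRequest, max_ttfc_ms: float, max_mean_icl_ms: float) -> bool:
    if not req.chunk_times_ms:
        return False
    if time_to_first_chunk_ms(req) > max_ttfc_ms:
        return False
    gaps = inter_chunk_latencies_ms(req)
    if gaps and sum(gaps) / len(gaps) > max_mean_icl_ms:
        return False
    return True


def goodput_rps(requests: List[StreamedRequest], duration_s: float,
                max_ttfc_ms: float, max_mean_icl_ms: float) -> float:
    # Goodput counts only the requests that satisfy every SLO.
    good = sum(1 for r in requests if meets_slo(r, max_ttfc_ms, max_mean_icl_ms))
    return good / duration_s
```

With this definition, a run can have high raw throughput and still report low goodput if most responses miss a latency target.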

Video & Multimedia Support

Synthetic video support - Added support for video benchmarking and synthetic video generation

Enhanced Data Management

inputs.json file for dataset traceability - Added generation of an inputs.json file so the dataset used in a run can be traced
Request traceability headers - Added X-Request-Id and X-Correlation-Id headers for improved request tracking
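
For readers unfamiliar with these headers, the sketch below shows one common way to populate them. The header names come from the note above; everything else is an assumption: `build_trace_headers` is a hypothetical helper, and the use of UUIDs and of a correlation ID shared across related requests is a general convention rather than a documented AIPerf behavior.

```python
import uuid
from typing import Dict, Optional


def build_trace_headers(correlation_id: Optional[str] = None) -> Dict[str, str]:
    """Hypothetical helper: a fresh request ID plus a reusable correlation ID."""
    return {
        # Unique per HTTP request (assumed to be a UUID here).
        "X-Request-Id": str(uuid.uuid4()),
        # Reused across related requests when the caller passes one in.
        "X-Correlation-Id": correlation_id or str(uuid.uuid4()),
    }
```

Passing the same `correlation_id` to several calls yields headers that share a correlation ID while each request keeps its own request ID, which makes it easier to match client-side records with server logs.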

Bug Fixes

Core Functionality

ZMQ graceful termination - Fixed graceful termination of ZMQ context to prevent hanging processes
Worker count limits - Capped default maximum workers to 32 to prevent resource exhaustion
Race conditions in credit issuing - Fixed race conditions in credit issuing strategy for more stable performance
Startup error handling - Improved error handling during startup with clear error messages and proper process exit

Request Processing

Empty choices array handling - Fixed IndexError when OpenAI choices array is empty
Request metadata validation - Fixed bug with request metadata validation for failed requests
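
The empty choices fix above addresses a common failure pattern: indexing `choices[0]` on a response whose `choices` list is empty (OpenAI-compatible streams can, for example, emit a usage-only chunk with no choices). The sketch below shows a defensive way to read such a payload. It is not the AIPerf patch; `extract_first_content` is a hypothetical helper written only to illustrate the guard.

```python
from typing import Optional


def extract_first_content(response: dict) -> Optional[str]:
    """Return the first choice's text, or None when there is nothing to read."""
    choices = response.get("choices") or []
    if not choices:
        # Avoids the IndexError that choices[0] would raise on an empty list.
        return None
    first = choices[0]
    # Streaming chunks carry "delta"; non-streaming responses carry "message".
    message = first.get("delta") or first.get("message") or {}
    return message.get("content")
```

Returning `None` lets the caller decide whether an empty payload is an error or simply a chunk with no text to record.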

Export & Data Handling

CSV export logic - Fixed CSV export parsing to ensure correct data formatting
JSONL file writing - Resolved issues with writing to JSONL files
GenAI-Perf JSON format compatibility - Fixed JSON summary export to match GenAI-Perf format for better compatibility

Platform-Specific Fixes

macOS Textual UI Dashboard - Fixed compatibility issues with Textual UI Dashboard on macOS systems
Image test random seed - Set proper random seeds for image tests to fix sporadic test failures

Telemetry & Performance

Telemetry Manager shutdown timing - Fixed an issue where the Telemetry Manager shut down before profile configuration finished
Goodput release issues - Cherry-picked fix for goodput-related release problems
CPU usage warnings - Added warnings when worker CPU usage exceeds 85% to help identify performance bottlenecks

Documentation & Tutorials

New Documentation

Goodput tutorial - Complete tutorial on using the goodput metric for SLO validation
Advanced features tutorials - Tutorials covering advanced benchmarking features
Trace replay tutorial with real data - Updated trace replay tutorial with real Mooncake data examples
Feature comparison with GenAI-Perf - Added detailed feature comparison matrix between AIPerf and GenAI-Perf

Infrastructure & Development

Build & Dependencies

Flexible dependencies - Made package dependencies more flexible for better compatibility
pyproject.toml cleanup - Cleaned up and organized the pyproject.toml configuration
License field compliance - Updated pyproject.toml license field for wheeltamer compliance
Dependency updates - Removed pandas dependency (now using numpy only) and updated numpy to 1.26.4

Refactoring & Performance

Core Components

Credit processor refactoring - Refactored credit processing system for better performance and maintainability
Console output enhancements - Added median values to console output for better statistical insight

Performance Optimizations

Performance test marking - Properly marked SSE tests as performance tests for better test organization

Known Issues

InvalidStateError - Logs show an InvalidStateError during benchmarking. The error is handled gracefully and does not affect benchmark results.
