AIPerf v0.2.0

@saturley-hall saturley-hall released this 24 Oct 22:21

AIPerf Release Notes

Summary

AIPerf v0.2.0 introduces time-based benchmarking with configurable grace periods and request cancellation capabilities. The release adds advanced metrics including goodput measurement for SLO compliance, GPU telemetry, and inter-chunk latency tracking.

New Features

Time-Based Benchmarking

  • Time-based benchmarking support - Run benchmarks for a specified duration with configurable grace periods for more realistic testing scenarios
  • Benchmark grace period - Added grace period functionality to allow for proper warmup and cooldown phases during benchmarking
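
A duration-based run with a grace period can be pictured with the sketch below. It is not AIPerf's implementation: `run_timed_benchmark`, `send_request`, and the parameter names are hypothetical, and it assumes the grace period is extra time given to in-flight requests after the benchmark duration has elapsed.

```python
import asyncio
import time


async def run_timed_benchmark(send_request, duration_s, grace_period_s, concurrency=4):
    """Issue requests for duration_s, then allow grace_period_s for stragglers."""
    deadline = time.monotonic() + duration_s
    stats = {"completed": 0, "cancelled": 0}

    async def worker():
        # Start new requests only while the benchmark window is still open.
        while time.monotonic() < deadline:
            await send_request()
            stats["completed"] += 1

    tasks = [asyncio.create_task(worker()) for _ in range(concurrency)]
    # Wait for the window plus the grace period, then cancel whatever is left.
    _, pending = await asyncio.wait(tasks, timeout=duration_s + grace_period_s)
    for task in pending:
        task.cancel()
    if pending:
        await asyncio.gather(*pending, return_exceptions=True)
    stats["cancelled"] = len(pending)
    return stats
```

With a fast `send_request`, every worker finishes inside the window and nothing is cancelled. With a request that outlives both the window and the grace period, each worker is cancelled and counted under `cancelled`.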

Request Management & Control

  • Request cancellation - Added ability to cancel requests during benchmarking to test timeout behavior and service resilience
  • Fixed-schedule for Mooncake traces - Enhanced trace replay with fixed-schedule detection and support for non-fixed-schedule trace formats
  • Request rate with concurrency limits - Added ability to limit HTTP connections and control request concurrency for more realistic load testing
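
As a rough illustration of how a concurrency limit and request cancellation interact, the sketch below caps in-flight requests with a semaphore and cancels any request that exceeds a time budget. The names `send_with_limits`, `request_fn`, and `cancel_after_s` are hypothetical and are not AIPerf options.

```python
import asyncio


async def send_with_limits(request_fn, semaphore, cancel_after_s):
    """Run one request under a concurrency cap, cancelling it if it runs too long."""
    async with semaphore:  # at most N requests hold the semaphore at once
        try:
            return await asyncio.wait_for(request_fn(), timeout=cancel_after_s)
        except asyncio.TimeoutError:
            # wait_for cancels the underlying request before raising.
            return None


async def main():
    semaphore = asyncio.Semaphore(2)  # hypothetical limit of 2 concurrent requests

    async def fast():
        await asyncio.sleep(0.01)
        return "ok"

    async def slow():
        await asyncio.sleep(1.0)
        return "too late"

    return await asyncio.gather(
        send_with_limits(fast, semaphore, cancel_after_s=0.1),
        send_with_limits(slow, semaphore, cancel_after_s=0.1),
    )
```

Calling `asyncio.run(main())` returns the fast request's result and `None` for the cancelled one, which is the kind of outcome a timeout or resilience test would look for.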

Advanced Metrics & Monitoring

  • Goodput metric - Added goodput metric to measure throughput of requests meeting user-defined SLOs, with an accompanying tutorial
  • GPU Telemetry - Integrated GPU monitoring and telemetry collection for comprehensive performance analysis
  • Inter-chunk-latency metric - Added inter-chunk latency tracking using raw value lists for detailed streaming performance analysis
  • Total ISL/OSL metrics - Added total input/output sequence length metrics with improved CSV/JSON export support
  • Per-record metrics export - Enhanced profile export with per-record metrics in profile_export.jsonl
  • Mixed ISL/OSL distributions - Support for mixed input/output sequence length distributions in benchmarking
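
To make two of these metrics concrete, here is a minimal sketch under stated assumptions: goodput is taken to be the number of requests that satisfy every SLO threshold divided by the benchmark duration, and inter-chunk latency is the gap between consecutive streamed chunks. The function names, record fields, and thresholds are made up for illustration and do not reflect AIPerf's internal code or export schema.

```python
def goodput(records, slos, duration_s):
    """Requests per second that satisfy every SLO threshold."""
    good = sum(
        1
        for record in records
        if all(record[metric] <= limit for metric, limit in slos.items())
    )
    return good / duration_s


def inter_chunk_latencies(chunk_times_ms):
    """Gaps between consecutive streamed chunks, kept as a raw list of values."""
    return [later - earlier for earlier, later in zip(chunk_times_ms, chunk_times_ms[1:])]


# Hypothetical per-request measurements in milliseconds.
records = [
    {"ttft_ms": 80, "latency_ms": 900},
    {"ttft_ms": 250, "latency_ms": 950},   # misses the TTFT threshold
    {"ttft_ms": 90, "latency_ms": 1500},   # misses the latency threshold
    {"ttft_ms": 95, "latency_ms": 700},
]
slos = {"ttft_ms": 100, "latency_ms": 1000}

print(goodput(records, slos, duration_s=2))    # 2 good requests / 2 s = 1.0
print(inter_chunk_latencies([100, 130, 170]))  # [30, 40]
```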

Video & Multimedia Support

  • Synthetic video support - Added support for video benchmarking and synthetic video generation

Enhanced Data Management

  • Inputs.json file for dataset traceability - Added dataset traceability through inputs.json file generation
  • Request traceability headers - Added X-Request-Id and X-Correlation-Id headers for improved request tracking
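
The traceability headers can be illustrated with a short sketch. It assumes each request gets a fresh X-Request-Id while related requests share one X-Correlation-Id, and that both are UUIDs; the release notes do not say how AIPerf generates the values, and `make_trace_headers` is a hypothetical helper.

```python
import uuid


def make_trace_headers(correlation_id=None):
    """Build tracing headers: a unique request ID plus a shared correlation ID."""
    return {
        "X-Request-Id": str(uuid.uuid4()),
        # Reuse the caller's correlation ID so related requests can be grouped.
        "X-Correlation-Id": correlation_id or str(uuid.uuid4()),
    }


session_id = str(uuid.uuid4())
first = make_trace_headers(session_id)
second = make_trace_headers(session_id)
```

Here `first` and `second` carry the same X-Correlation-Id but different X-Request-Id values, so server logs can be filtered by session or by individual request.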

Bug Fixes

Core Functionality

  • ZMQ graceful termination - Fixed graceful termination of ZMQ context to prevent hanging processes
  • Worker count limits - Capped default maximum workers to 32 to prevent resource exhaustion
  • Race conditions in credit issuing - Fixed race conditions in credit issuing strategy for more stable performance
  • Startup error handling - Improved error handling during startup with clear error messages and proper process exit
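
The worker cap can be sketched as a simple clamp. This assumes the default worker count is derived from the number of CPUs, which the notes do not state; only the limit of 32 comes from the release notes, and `default_worker_count` is a hypothetical name.

```python
import os

MAX_DEFAULT_WORKERS = 32  # cap stated in the release notes


def default_worker_count(cpu_count=None):
    """Pick a default worker count, never below 1 and never above the cap."""
    # Assumption: the uncapped default follows the CPU count.
    cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return max(1, min(cpus, MAX_DEFAULT_WORKERS))
```

On a 128-core machine this yields 32 workers instead of 128, while an 8-core machine still gets 8.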

Request Processing

  • Empty choices array handling - Fixed IndexError when OpenAI choices array is empty
  • Request metadata validation - Fixed bug with request metadata validation for failed requests
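
The empty-choices case arises when code indexes `choices[0]` on a response whose `choices` list is empty. A defensive parser might look like the sketch below; `first_choice_text` is a hypothetical helper working on a plain dict in the OpenAI chat completions shape, not AIPerf's actual fix.

```python
def first_choice_text(response):
    """Return the first choice's text, or None when there is no choice to read."""
    choices = response.get("choices") or []
    if not choices:
        # Indexing choices[0] here would raise IndexError.
        return None
    choice = choices[0]
    # Non-streaming responses use "message"; streaming chunks use "delta".
    payload = choice.get("message") or choice.get("delta") or {}
    return payload.get("content")
```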

Export & Data Handling

  • CSV export logic - Fixed CSV export parsing to ensure correct data formatting
  • JSONL file writing - Resolved issues with writing to JSONL files
  • GenAI-Perf JSON format compatibility - Fixed JSON summary export to match GenAI-Perf format for better compatibility
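
For readers unfamiliar with JSONL, the format stores one JSON object per line, so records can be appended and read back one at a time. The sketch below shows that round trip; the helper names and record fields are invented for illustration and do not reflect the schema of profile_export.jsonl.

```python
import json


def append_record(path, record):
    """Append one record as a single JSON line."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")


def read_records(path):
    """Read every non-empty line back as a JSON object."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]
```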

Platform-Specific Fixes

  • macOS Textual UI Dashboard - Fixed compatibility issues with Textual UI Dashboard on macOS systems
  • Image test random seed - Set proper random seeds for image tests to fix sporadic test failures

Telemetry & Performance

  • Telemetry Manager shutdown timing - Fixed issue where the Telemetry Manager shut down before profile configuration finished
  • Goodput release issues - Cherry-picked fix for goodput-related release problems
  • CPU usage warnings - Added warnings when worker CPU usage exceeds 85% to help identify performance bottlenecks

Documentation & Tutorials

New Documentation

  • Goodput tutorial - Complete tutorial on using the goodput metric for SLO validation
  • Advanced features tutorials - Tutorials covering advanced benchmarking features
  • Trace replay tutorial with real data - Updated trace replay tutorial with real Mooncake data examples
  • Feature comparison with GenAI-Perf - Added detailed feature comparison matrix between AIPerf and GenAI-Perf

Infrastructure & Development

Build & Dependencies

  • Flexible dependencies - Made package dependencies more flexible for better compatibility
  • PyProject.toml cleanup - Cleaned up and organized pyproject.toml configuration
  • License field compliance - Updated pyproject.toml license field for wheeltamer compliance
  • Dependency updates - Removed pandas dependency (now using numpy only) and updated numpy to 1.26.4

Refactoring & Performance

Core Components

  • Credit processor refactoring - Refactored credit processing system for better performance and maintainability
  • Console output enhancements - Added median values to console output for better statistical insight

Performance Optimizations

  • Performance test marking - Properly marked SSE tests as performance tests for better test organization

Known Issues

  • InvalidStateError - Logs show an InvalidStateError during benchmarking. The error is handled gracefully and does not affect benchmark results.