v0.9.1 - Critical Performance Fix Release

OldCrow released this 17 Aug 02:19


e45ac61

This release addresses critical performance issues and provides a foundation for future GPU acceleration:

🎯 CRITICAL FIXES:

Replaced the CACHE_AWARE strategy, which caused a 100x performance regression, with GPU_ACCELERATED
Until GPU support is implemented, GPU_ACCELERATED safely falls back to WORK_STEALING for optimal CPU performance
Fixed auto-dispatch first-call initialization for consistent performance

🔧 TECHNICAL IMPROVEMENTS:

Enhanced performance dispatcher with GPU fallback logic and informative logging
Updated all distribution implementations with consistent fallback behavior
Comprehensive strategy migration across entire codebase (29 files changed)

📈 PERFORMANCE BENEFITS:

Eliminates severe performance regression from problematic cache-aware parallelism
Maintains optimal CPU performance through work-stealing fallback
Positions library for future GPU acceleration without breaking changes

🛡️ QUALITY ASSURANCE:

100% API compatibility maintained
All tests, examples, tools, and documentation updated consistently
Legitimate cache-related code (hardware caches) preserved unchanged

This release ensures optimal performance for v0.9.1 while establishing the foundation for GPU acceleration in future versions.
