Skip to content

v0.9.1 - Critical Performance Fix Release

Choose a tag to compare

@OldCrow OldCrow released this 17 Aug 02:19
· 89 commits to main since this release

This release addresses critical performance issues and provides a foundation for future GPU acceleration:

🎯 CRITICAL FIXES:

  • Replaced CACHE_AWARE strategy causing 100x performance regression with GPU_ACCELERATED
  • GPU_ACCELERATED safely falls back to WORK_STEALING for optimal CPU performance pending implementation
  • Fixed auto-dispatch first-call initialization for consistent performance

🔧 TECHNICAL IMPROVEMENTS:

  • Enhanced performance dispatcher with GPU fallback logic and informative logging
  • Updated all distribution implementations with consistent fallback behavior
  • Comprehensive strategy migration across entire codebase (29 files changed)

📈 PERFORMANCE BENEFITS:

  • Eliminates severe performance regression from problematic cache-aware parallelism
  • Maintains optimal CPU performance through work-stealing fallback
  • Positions library for future GPU acceleration without breaking changes

🛡️ QUALITY ASSURANCE:

  • 100% API compatibility maintained
  • All tests, examples, tools, and documentation updated consistently
  • Legitimate cache-related code (hardware caches) preserved unchanged

This release ensures optimal performance for v0.9.1 while establishing the foundation for GPU acceleration in future versions.