156 lines (102 loc) · 3.99 KB

Changelog

[0.3.0] - 2025-07-11

Added

Translate to English setting: Added automatic translation of speech to English
Settings refactored into React hooks for better state management
Audio device switching capability
Hysteresis to VAD (Voice Activity Detection) for more stable recording

Changed

Major audio backend refactor for improved performance and reliability
Moved audio toolkit into src-tauri directory for better permissions handling
Model files no longer need to be downloaded separately for releases
Updated settings components and transcription logic

Fixed

Audio toolkit permissions issues
Various stability improvements

[0.2.3] - 2025-07-03

Fixed

Keycode bug that was causing input issues
Whisper model optimization: switched to unquantized Whisper Turbo, updated Whisper Medium quantization to 4_1

[0.2.2] - 2025-07-02

Fixed

Removed 50ms delay feature flag for Windows (now applies to all platforms for consistency)

[0.2.1] - 2025-07-01

Added

Ctrl+Space key binding for Windows platform

Fixed

Windows crash issue
Model loading on startup when available
Windows paste functionality bug

[0.2.0] - 2025-06-30

Added

Microphone activation on demand: More efficient resource usage
Less permissive VAD settings for better accuracy

Changed

Improved microphone management and activation system

[0.1.6] - 2025-06-30

Added

Multiple models support: Users can now select from different transcription models
Model selection onboarding flow
Cleanup and refactoring of model management

Changed

Enhanced user experience with model selection interface
Better language and UI tweaks

[0.1.5] - 2025-06-27

Added

Different start and stop recording sounds: Enhanced audio feedback
Recording sound samples for better user experience

[0.1.4] - 2025-06-27

Fixed

Build issues
Auto-update functionality improvements

[0.1.3] - 2025-06-26

Fixed

Paste functionality using enigo library for better cross-platform compatibility

[0.1.2] - 2025-06-26

Added

Auto-update functionality: Application can now automatically update itself
Footer displaying current version
Improved menu system

Changed

Better user interface for version management
Enhanced update workflow

[0.1.1] - 2025-06-25

Added

Comprehensive build system: Support for Windows, macOS, and Linux
Windows code signing for trusted installation
Ubuntu/Linux build support with Vulkan
Model file download and packaging for releases
GitHub Actions CI/CD workflow

Changed

Improved build process and release workflow
Better cross-platform compatibility

Fixed

Various build-related issues across platforms

[0.1.0] - 2025-05-16

Added

Initial release of Handy
Basic speech-to-text transcription functionality
Voice Activity Detection (VAD) for automatic recording
Cross-platform support (macOS, Windows, Linux)
Tauri-based desktop application with React frontend
Global keyboard shortcuts for activation
Clipboard integration for automatic text insertion
LLM integration for enhanced transcription processing
Configurable settings including:
- Custom key bindings
- Audio device selection
- Microphone settings
- Push-to-talk functionality
System tray integration with recording indicators
Accessibility permissions handling for macOS
Settings persistence with unified settings store
Background operation capability
Multiple audio format support with on-the-fly resampling
Whisper model integration for high-quality transcription
MIT License for open-source distribution

Technical Implementation

Built with Tauri (Rust backend) and React (TypeScript frontend)
Audio processing with cpal and whisper-rs
Real-time transcription with performance optimizations
Cross-platform keyboard event handling
Modular architecture with managers for audio, models, and transcription