
Commit 67851db

author
My Name
committed
0.2 final
1 parent 0ea5a1e commit 67851db

7 files changed: +186 -20 lines changed

Diff for: README.md

+1-1
Original file line number | Diff line number | Diff line change
@@ -15,7 +15,7 @@ Originally created by Guilherme da Silveira as "Telly Spelly".
1515

1616
## Project Structure
1717

18-
- `blaze/` - Core application code
18+
- `blaze/` - Core application files
1919
- `docs/` - Documentation files
2020
- `install.py` - Installation script
2121
- `uninstall.py` - Uninstallation script

Diff for: docs/activeContext.md

+23-6
@@ -2,13 +2,14 @@
22

33
## Current Work Focus
44

5-
The current focus of the Syllablaze project is to optimize the application for Ubuntu KDE environments and rebrand from "Telly Spelly" to "Syllablaze". This involves:
5+
The current focus of the Syllablaze project is to optimize the application for Ubuntu KDE environments and enhance the Whisper model management functionality. This involves:
66

77
1. Modifying the installation script to better handle Ubuntu-specific dependencies and paths
88
2. Implementing more robust error handling for system libraries
99
3. Updating all references from "telly-spelly" to "syllablaze" throughout the codebase
10-
4. Documenting the changes and creating comprehensive memory bank files
11-
5. Exploring the potential for a Flatpak version in the future
10+
4. Implementing a comprehensive Whisper model management interface
11+
5. Documenting the changes and creating comprehensive memory bank files
12+
6. Exploring the potential for a Flatpak version in the future
1213

1314
## Recent Changes
1415

@@ -35,6 +36,13 @@ The current focus of the Syllablaze project is to optimize the application for U
3536
- Updated desktop file to use run-syllablaze.sh script with absolute path
3637
- Ensured the script is executable and properly configured
3738
- Updated installation script to create proper desktop integration
39+
8. **Whisper Model Management**: Implemented a comprehensive model management interface
40+
- Created a table-based UI showing available models with detailed information
41+
- Added visual indicators for downloaded vs. not-downloaded models
42+
- Implemented model download functionality with progress tracking
43+
- Added ability to delete models to free up disk space
44+
- Added ability to set a model as active for transcription
45+
- Implemented storage location display with option to open in file explorer
3846

3947
## Next Steps
4048

@@ -54,8 +62,12 @@ The current focus of the Syllablaze project is to optimize the application for U
5462
- Simplified the installation process
5563
- Improved system dependency checks
5664
8.**Update README**: Revised the README.md file with the new directory structure and installation method
57-
9. **Test Installation**: Verify the installation process works correctly on Ubuntu KDE
58-
10. **Future Exploration**: Begin research on creating a Flatpak version
65+
9.**Implement Whisper Model Management**: Created a comprehensive model management interface
66+
- Implemented table-based UI for model management
67+
- Added download, delete, and activation functionality
68+
- Integrated with settings window
69+
10. **Test Installation**: Verify the installation process works correctly on Ubuntu KDE
70+
11. **Future Exploration**: Begin research on creating a Flatpak version
5971

6072
## Active Decisions and Considerations
6173

@@ -92,4 +104,9 @@ The current focus of the Syllablaze project is to optimize the application for U
92104
7. **Documentation Strategy**:
93105
- Decision: Create comprehensive memory bank files
94106
- Rationale: Ensures project knowledge is preserved and accessible
95-
- Consideration: Will need regular updates as the project evolves
107+
- Consideration: Will need regular updates as the project evolves
108+
109+
8. **Whisper Model Management**:
110+
- Decision: Implement a comprehensive model management interface
111+
- Rationale: Provides better user control over model selection and disk space usage
112+
- Consideration: Need to handle download progress simulation since Whisper API doesn't provide direct progress tracking
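
The progress simulation mentioned in this consideration could be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the function name `simulated_progress`, the `expected_s` estimate, and the exponential curve are all assumptions chosen for the example. The idea is to show a value that keeps moving but never claims completion until the download has actually finished.

```python
import math


def simulated_progress(elapsed_s: float, expected_s: float, done: bool) -> int:
    """Return a display percentage (0-100) when real progress is unknown.

    elapsed_s:  seconds since the download started
    expected_s: rough guess of the total duration (hypothetical input)
    done:       True once the download has actually completed
    """
    if done:
        return 100
    if expected_s <= 0:
        return 0
    # Exponential approach: quick at first, then slowing down.
    fraction = 1.0 - math.exp(-2.0 * elapsed_s / expected_s)
    # Cap at 99 so the bar only reaches 100 when the download is really done.
    return min(99, int(fraction * 100))
```

A UI timer could call this periodically and feed the result to a progress bar. The cap at 99 avoids showing a finished bar while the model is still being fetched.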

Diff for: docs/productContext.md

+34-1
@@ -12,6 +12,7 @@ Syllablaze exists to bridge the gap between spoken word and digital text. In tod
1212
4. **Content Creation**: Facilitates the creation of written content through speech
1313
5. **Accessibility Needs**: Assists users with physical limitations that make typing difficult
1414
6. **Privacy Concerns**: Provides a local solution that doesn't send audio data to cloud services
15+
7. **Resource Management**: Helps users manage disk space and processing power through flexible model selection
1516

1617
## How It Should Work
1718

@@ -37,6 +38,12 @@ Syllablaze exists to bridge the gap between spoken word and digital text. In tod
3738
- Whisper model selection (balancing speed vs. accuracy)
3839
- Global keyboard shortcuts
3940
- Interface preferences
41+
- Language settings for transcription
42+
7. **Model Management**: Users can manage Whisper models through:
43+
- A table-based interface showing all available models
44+
- Visual indicators for downloaded vs. not-downloaded models
45+
- Buttons to download, delete, or set models as active
46+
- Information about model size and storage location
4047

4148
## User Experience Goals
4249

@@ -48,6 +55,7 @@ Syllablaze exists to bridge the gap between spoken word and digital text. In tod
4855
6. **Confidence**: Users should trust that their audio is being processed correctly
4956
7. **Adaptability**: The application should work well in various environments and use cases
5057
8. **Cross-platform Consistency**: The application should provide a consistent experience across different Linux distributions, with special attention to Ubuntu KDE
58+
9. **Resource Awareness**: The application should help users make informed decisions about resource usage
5159

5260
## Target Users
5361

@@ -57,4 +65,29 @@ Syllablaze exists to bridge the gap between spoken word and digital text. In tod
5765
4. **Researchers**: For recording and transcribing interviews or observations
5866
5. **Accessibility Users**: For those who find typing difficult or impossible
5967
6. **KDE Enthusiasts**: Users who appreciate well-integrated KDE applications
60-
7. **Privacy-conscious Users**: Those who prefer local processing over cloud services
68+
7. **Privacy-conscious Users**: Those who prefer local processing over cloud services
69+
8. **Resource-constrained Users**: Those with limited disk space or processing power who need flexibility in model selection
70+
71+
## Enhanced Model Management Benefits
72+
73+
1. **Informed Decisions**: Users can make informed decisions about which model to use based on:
74+
- Disk space requirements
75+
- Processing speed needs
76+
- Accuracy requirements
77+
2. **Resource Optimization**: Users can:
78+
- Delete unused models to free up disk space
79+
- Choose smaller models for faster processing on less powerful hardware
80+
- Select larger models for better accuracy when needed
81+
3. **Transparency**: Users can see:
82+
- Which models are available
83+
- Which models are downloaded
84+
- Which model is currently active
85+
- Where models are stored on disk
86+
4. **Control**: Users have direct control over:
87+
- Which models to download
88+
- Which models to keep
89+
- Which model to use for transcription
90+
5. **Feedback**: Users receive clear feedback on:
91+
- Download progress
92+
- Success or failure of operations
93+
- Current status of models

Diff for: docs/progress.md

+28-8
@@ -15,23 +15,33 @@
1515
- Progress window with volume meter
1616
- Settings window for configuration
1717
- Notifications for transcription completion
18+
- Comprehensive Whisper model management interface
1819

1920
3. **Installation**:
2021
- Enhanced setup.sh script for user-level installation using pipx
2122
- Desktop file integration with KDE
2223
- Icon integration
2324
- Improved system dependency checks
2425

26+
4. **Whisper Model Management**:
27+
- Table-based UI showing all available models
28+
- Visual indicators for downloaded vs. not-downloaded models
29+
- Model download functionality with progress tracking
30+
- Model deletion capability to free up disk space
31+
- Model activation for transcription
32+
- Storage location display with option to open in file explorer
33+
2534
## What's Left to Build
2635

2736
1. **Flatpak Version**: Create a Flatpak package for improved cross-distribution compatibility
2837
2. **System-wide Installation Option**: Add support for system-wide installation as an alternative to user-level installation
2938
3. **Advanced Error Handling**: Implement more robust error handling for different system configurations
30-
39+
4. **Enhanced Model Information**: Add more detailed model information including accuracy metrics and RAM requirements
40+
5. **Model Performance Benchmarking**: Add functionality to benchmark model performance on the user's hardware
3141

3242
## Current Status
3343

34-
The core functionality works well, but there are opportunities for improvement in error handling and system integration.
44+
The core functionality works well, with significant improvements in the Whisper model management interface. There are still opportunities for enhancement in error handling and system integration.
3545

3646
### Installation Status
3747

@@ -46,12 +56,14 @@ The core functionality works well, but there are opportunities for improvement i
4656
- Transcription accuracy depends on the Whisper model selected
4757
- KDE integration works well on standard KDE Plasma
4858
- Clipboard integration functions as expected
59+
- Whisper model management provides a comprehensive interface for model control
4960

5061
### Documentation Status
5162

52-
- Memory bank files are being created
53-
- README.md needs updating
54-
- Installation instructions need enhancement for Ubuntu KDE
63+
- Memory bank files are being maintained
64+
- README.md has been updated
65+
- Installation instructions have been enhanced for Ubuntu KDE
66+
- Whisper model management plan has been documented and implemented
5567

5668
## Known Issues
5769

@@ -76,12 +88,20 @@ The core functionality works well, but there are opportunities for improvement i
7688
- Transcription can be slow on systems without GPU acceleration
7789
- Large audio files may cause memory issues
7890
- Solution: Add more guidance on model selection based on hardware
91+
- The new model management interface helps users make informed decisions about model selection
7992

80-
5. **Rebranding**:
93+
5. **Rebranding**: ✅ COMPLETED
8194
- References to "telly-spelly" have been updated to "syllablaze" throughout the codebase
8295
- Icon file has been renamed from telly-spelly.png to syllablaze.png
8396
- Desktop file has been updated to use the new name
84-
6. **Version Management**:
97+
98+
6. **Version Management**: ✅ COMPLETED
8599
- Added centralized version number in constants.py
86100
- Added version display in tooltip when hovering on the tray icon
87-
- Added version display in splash screen
101+
- Added version display in splash screen
102+
103+
7. **Whisper Model Management**: ✅ IMPLEMENTED
104+
- Created a comprehensive model management interface
105+
- Implemented table-based UI for model management
106+
- Added download, delete, and activation functionality
107+
- Integrated with settings window

Diff for: docs/systemPatterns.md

+47-1
@@ -15,6 +15,8 @@ flowchart TD
1515
E --> F[Clipboard Integration]
1616
C --> B
1717
C --> E
18+
C --> G[Model Management]
19+
G --> E
1820
```
1921

2022
## Key Components
@@ -25,6 +27,7 @@ flowchart TD
2527
4. **SettingsWindow**: Provides user interface for configuration
2628
5. **ProgressWindow**: Shows recording and transcription status
2729
6. **GlobalShortcuts**: Manages keyboard shortcuts for controlling the application
30+
7. **WhisperModelManager**: Manages Whisper model download, deletion, and activation
2831

2932
## Key Technical Decisions
3033

@@ -34,6 +37,7 @@ flowchart TD
3437
4. **Local Processing**: All audio processing and transcription happens locally for privacy
3538
5. **User Directory Installation**: Application installs to user's home directory for easier management
3639
6. **Modular Design**: Components are separated for easier maintenance and extension
40+
7. **Table-based Model Management**: Provides a comprehensive interface for managing Whisper models
3741

3842
## Design Patterns in Use
3943

@@ -42,6 +46,7 @@ flowchart TD
4246
3. **Factory Pattern**: Audio and transcription components are created and managed by the main application
4347
4. **Command Pattern**: Actions in the UI trigger specific commands in the backend
4448
5. **State Pattern**: Application manages different states (idle, recording, processing)
49+
6. **Thread Pattern**: Long-running operations like model downloads run in separate threads to keep the UI responsive
4550

4651
## Component Relationships
4752

@@ -70,6 +75,21 @@ sequenceDiagram
7075
WT->>TR: transcription_finished(text)
7176
```
7277

78+
### SettingsWindow and WhisperModelTable
79+
80+
```mermaid
81+
sequenceDiagram
82+
participant SW as SettingsWindow
83+
participant WMT as WhisperModelTable
84+
participant S as Settings
85+
86+
SW->>WMT: create()
87+
WMT->>SW: model_activated(model_name)
88+
SW->>S: set('model', model_name)
89+
WMT->>WMT: download_model(model_name)
90+
WMT->>WMT: delete_model(model_name)
91+
```
92+
7393
### User Interface Flow
7494

7595
```mermaid
@@ -84,16 +104,42 @@ flowchart TD
84104
H --> I[Show Notification]
85105
```
86106

107+
### Model Management Flow
108+
109+
```mermaid
110+
flowchart TD
111+
A[Settings Window] --> B[Model Table]
112+
B -->|Click Download| C[Confirm Download]
113+
C -->|Yes| D[Show Download Progress]
114+
D --> E[Download Model]
115+
E --> F[Update Model List]
116+
B -->|Click Use Model| G[Set Active Model]
117+
G --> H[Update Settings]
118+
B -->|Click Delete| I[Confirm Delete]
119+
I -->|Yes| J[Delete Model File]
120+
J --> K[Update Model List]
121+
```
122+
87123
## Error Handling Strategy
88124

89125
1. **Graceful Degradation**: The application attempts to continue functioning even when parts fail
90126
2. **User Feedback**: Clear error messages are shown to the user
91127
3. **Logging**: Comprehensive logging for debugging
92128
4. **Recovery Mechanisms**: Attempt to recover from errors when possible
129+
5. **Thread Safety**: Ensure thread-safe operations for model downloads and other background tasks
93130

94131
## Ubuntu KDE Optimization Patterns
95132

96133
1. **Path Flexibility**: Support for alternative library paths common in Ubuntu
97134
2. **Dependency Verification**: Check for required system dependencies before installation
98135
3. **Desktop Integration**: Proper integration with KDE's application menu and system tray
99-
4. **Error Suppression**: Handling of ALSA errors that are common in Ubuntu
136+
4. **Error Suppression**: Handling of ALSA errors that are common in Ubuntu
137+
138+
## Whisper Model Management Patterns
139+
140+
1. **Table-based UI**: Provides a clear overview of all available models
141+
2. **Visual Status Indicators**: Shows which models are downloaded and which is active
142+
3. **Background Downloads**: Model downloads run in separate threads to keep the UI responsive
143+
4. **Progress Simulation**: Simulates download progress since the Whisper API doesn't provide direct progress tracking
144+
5. **File System Integration**: Directly manages model files in the Whisper cache directory
145+
6. **User Confirmation**: Requires confirmation before downloading or deleting models
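
The "Background Downloads" pattern listed above can be illustrated with a small sketch. It uses the standard library's `threading` module rather than Qt threads so that it runs on its own; the helper name `download_in_background` and its callback parameters are hypothetical. In the application, `fetch` would presumably wrap the Whisper `load_model()` call that the documentation names, and the completion callback would have to be delivered to the GUI thread through a Qt signal rather than called directly.

```python
import threading
from typing import Callable, Optional


def download_in_background(
    model_name: str,
    fetch: Callable[[str], None],
    on_finished: Callable[[str, Optional[Exception]], None],
) -> threading.Thread:
    """Run fetch(model_name) on a worker thread and report the outcome.

    on_finished receives the model name and either None (success)
    or the exception that was raised (failure).
    """

    def worker() -> None:
        error: Optional[Exception] = None
        try:
            fetch(model_name)
        except Exception as exc:  # report any failure to the caller
            error = exc
        on_finished(model_name, error)

    thread = threading.Thread(
        target=worker, name=f"download-{model_name}", daemon=True
    )
    thread.start()
    return thread
```

Note that `on_finished` runs on the worker thread here. Any widget update triggered from it would need to be marshalled back to the UI thread, which is the concern behind the "Thread Safety" item in the error handling strategy.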

Diff for: docs/techContext.md

+52-1
@@ -79,6 +79,12 @@ openai-whisper (from PyPI)
7979
6. **Desktop Environment**: Optimized for KDE Plasma
8080
- May work in other environments but with limited integration
8181

82+
7. **Whisper API Limitations**: The Whisper API has certain limitations
83+
- No direct method to check which models are downloaded without loading them
84+
- No direct method to get download progress
85+
- No direct method to delete models
86+
- Workarounds implemented in the WhisperModelManager class
87+
8288
## Dependencies
8389

8490
### Direct Dependencies
@@ -136,4 +142,49 @@ openai-whisper (from PyPI)
136142
1. **Recommended IDE**: Visual Studio Code with Python extension
137143
2. **Debugging**: PyQt debugger or standard Python debugger
138144
3. **Testing**: Manual testing of recording and transcription
139-
4. **Version Control**: Git with GitHub for collaboration
145+
4. **Version Control**: Git with GitHub for collaboration
146+
147+
## Whisper Model Management
148+
149+
### Model Storage
150+
151+
1. **Location**: Models are stored in `~/.cache/whisper/` directory
152+
2. **Format**: Models are stored as `.pt` files (PyTorch format)
153+
3. **Naming**: Models are named according to their size (e.g., `tiny.pt`, `base.pt`, etc.)
154+
155+
### Model Information
156+
157+
1. **Available Models**:
158+
- tiny: ~150MB, very fast but basic accuracy
159+
- base: ~300MB, fast with good accuracy
160+
- small: ~500MB, medium speed with very good accuracy
161+
- medium: ~1.5GB, slow with excellent accuracy
162+
- large: ~3GB, very slow with superior accuracy
163+
164+
2. **Model Detection**:
165+
- The application scans the Whisper cache directory to detect downloaded models
166+
- It checks for the existence of model files with the exact name pattern
167+
168+
3. **Model Download**:
169+
- Downloads are handled by the Whisper API's `load_model()` function
170+
- The application simulates download progress since the API doesn't provide direct progress tracking
171+
- Downloads run in a separate thread to keep the UI responsive
172+
173+
4. **Model Deletion**:
174+
- The application directly deletes model files from the cache directory
175+
- It prevents deletion of the currently active model
176+
177+
### UI Components
178+
179+
1. **WhisperModelTable**: A custom widget that displays and manages Whisper models
180+
- Shows model name, status (downloaded/not downloaded), and size
181+
- Provides buttons to download, delete, or set a model as active
182+
- Highlights the currently active model
183+
184+
2. **ModelDownloadDialog**: A dialog that shows download progress
185+
- Displays a progress bar, status text, and estimated time remaining
186+
- Updates smoothly using a timer to simulate download progress
187+
188+
3. **ModelDownloadThread**: A thread that handles model downloads
189+
- Runs the download operation in the background
190+
- Emits signals to update the UI with progress information

Diff for: install.py

+1-2
@@ -114,8 +114,7 @@ def install_with_pipx(skip_whisper=False):
114114
universal_newlines=True
115115
)
116116

117-
# Process output line by line
118-
print(" Verbose installation progress:")
117+
print(" Installation progress:")
119118
current_package = None
120119
pip_install_started = False
121120
