Labels
- effort: high (Significant effort, 1+ week)
- enhancement (New feature or request)
- feature: transcription (Transcription core)
- priority: critical (Must have for adoption)
Description
Summary
Automatically detect and label different speakers in the transcript (Speaker 1, Speaker 2, or custom names).
Why This Matters
- Table stakes for interviews, podcasts, and meetings
- Enables speaker-specific persona generation
- Required for professional transcription workflows
- Differentiator vs basic Whisper implementations
Acceptance Criteria
- Automatic speaker detection and labeling
- Ability to rename speakers (e.g., "Speaker 1" → "Marc")
- Visual distinction between speakers in transcript
- Speaker labels included in exports (SRT, VTT, etc.)
- Per-speaker statistics (talk time, word count)
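As a rough sketch of the last criterion, per-speaker statistics could be aggregated from diarized transcript segments. The segment shape used here (dicts with `speaker`, `start`, `end`, `text`) is an illustrative assumption, not the project's actual data model:

```python
# Hypothetical sketch: per-speaker talk time and word count.
# The segment format is an assumption for illustration only.
from collections import defaultdict

def speaker_stats(segments):
    """Aggregate talk time (seconds) and word count per speaker."""
    stats = defaultdict(lambda: {"talk_time": 0.0, "words": 0})
    for seg in segments:
        s = stats[seg["speaker"]]
        s["talk_time"] += seg["end"] - seg["start"]
        s["words"] += len(seg["text"].split())
    return dict(stats)

segments = [
    {"speaker": "Speaker 1", "start": 0.0, "end": 4.2, "text": "Hello and welcome"},
    {"speaker": "Speaker 2", "start": 4.2, "end": 6.0, "text": "Thanks for having me"},
    {"speaker": "Speaker 1", "start": 6.0, "end": 9.5, "text": "Let's get started"},
]
print(speaker_stats(segments))
```

Renaming a speaker ("Speaker 1" → "Marc") would then only need to remap the `speaker` key before aggregation.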
Technical Options
- pyannote-audio - Best accuracy, requires a HuggingFace token
- whisperx - Whisper + diarization integrated
- simple-diarizer - Lightweight alternative
- Manual labeling - Fallback for users who want full control
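Whichever backend is chosen, the core step is assigning each transcript segment to the diarization turn it overlaps most (this is roughly what whisperx does internally). A minimal sketch, assuming illustrative dict shapes for segments and turns:

```python
# Rough sketch of overlap-based speaker assignment. Data shapes
# (dicts with start/end/speaker) are assumptions for illustration.
def assign_speakers(transcript_segments, diarization_turns):
    """Label each transcript segment with the most-overlapping speaker turn."""
    labeled = []
    for seg in transcript_segments:
        best_speaker, best_overlap = "Unknown", 0.0
        for turn in diarization_turns:
            # Overlap between [seg.start, seg.end] and [turn.start, turn.end]
            overlap = min(seg["end"], turn["end"]) - max(seg["start"], turn["start"])
            if overlap > best_overlap:
                best_speaker, best_overlap = turn["speaker"], overlap
        labeled.append({**seg, "speaker": best_speaker})
    return labeled

turns = [
    {"speaker": "Speaker 1", "start": 0.0, "end": 5.0},
    {"speaker": "Speaker 2", "start": 5.0, "end": 10.0},
]
segs = [
    {"start": 0.5, "end": 4.0, "text": "hi"},
    {"start": 4.5, "end": 9.0, "text": "hello"},
]
print(assign_speakers(segs, turns))
```

Segments that overlap no turn at all fall back to "Unknown", which the manual-labeling path could then resolve.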
Implementation Notes
```python
# Example with pyannote-audio (requires a HuggingFace access token)
from pyannote.audio import Pipeline

pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization")
diarization = pipeline(audio_file)  # audio_file: path to the input audio

# Iterate over detected speaker turns
for turn, _, speaker in diarization.itertracks(yield_label=True):
    print(f"{speaker}: {turn.start:.1f}s - {turn.end:.1f}s")
```
Priority
🔴 Critical - High effort, but essential for the target market
Generated from LLM Council product review