Try the early version on Colab: https://colab.research.google.com/drive/1YIy4nhF3b6bgrD8rdtWPWimKyKTofBmh?authuser=1
This library performs speaker diarization: splitting an audio file into segments and labeling which speaker is talking in each one. It supports several algorithms and models and is designed to be easy to extend.
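To make the idea concrete, a diarization result can be pictured as a list of time-stamped segments, each tagged with a speaker label. The sketch below is illustrative only: `Segment`, `merge_adjacent`, and the `max_gap` parameter are hypothetical names chosen for this example, not part of this library's API.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # segment start, in seconds
    end: float    # segment end, in seconds
    speaker: str  # speaker label, e.g. "SPEAKER_00"


def merge_adjacent(segments, max_gap=0.5):
    """Merge consecutive segments of the same speaker that are at most max_gap seconds apart."""
    merged = []
    for seg in sorted(segments, key=lambda s: s.start):
        last = merged[-1] if merged else None
        if (
            last is not None
            and last.speaker == seg.speaker
            and seg.start - last.end <= max_gap
        ):
            # Same speaker after a short pause: extend the previous segment.
            last.end = max(last.end, seg.end)
        else:
            # Copy the segment so the caller's input list is not mutated.
            merged.append(Segment(seg.start, seg.end, seg.speaker))
    return merged


# Hypothetical raw output: two short turns by one speaker, then another speaker.
raw = [
    Segment(0.0, 2.1, "SPEAKER_00"),
    Segment(2.3, 4.0, "SPEAKER_00"),
    Segment(4.2, 6.5, "SPEAKER_01"),
]
for seg in merge_adjacent(raw):
    print(f"{seg.start:.1f}-{seg.end:.1f}s {seg.speaker}")
```

Merging short same-speaker gaps like this is a common post-processing step; the 0.5 second default is an arbitrary value picked for the example.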
- 🎧 Speaker Identification: Assigns names to different speakers based on provided samples.
- 🔧 Customizable Algorithms: Choose between various diarization methods.
- ⚙️ Base Functionality: A robust starting point for diarization tasks, with room for optimization.
- 📊 Visualization: Add graphs and interactive visualizations for audio data and diarization results.
- 🧠 Advanced AI: Integrate machine learning (ML) and neural network (NN) models to improve diarization accuracy.
- ☁️ Cloud Deployment: Deploy the library on cloud platforms (e.g., AWS, Google Cloud) with Docker containerization.
- 🔧 Performance: Optimize the code, switch to a faster ASR backend, and speed up silence removal (potentially dropping the AI model from that step).
- Data Visualization: Tools like Plotly or Matplotlib for creating intuitive graphs.
- ML/NN: Advanced ML and NN techniques for state-of-the-art diarization.
- Scalable Deployment: Cloud-hosted solutions using Docker and Kubernetes.
- 🐍 Python: Core programming language.
- 🗣️ Audio Processing: Libraries like Whisper and PyTorch for speaker recognition and segmentation.
- 🛠️ Custom Algorithms: Implementation of base-level diarization logic.
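As a rough illustration of the Speaker Identification feature listed above, one common approach is to compare a segment's speaker embedding against embeddings computed from the provided samples and pick the closest match. The sketch below assumes embeddings are already available as plain lists of floats; `cosine_similarity`, `identify_speaker`, and the `threshold` value are hypothetical and do not reflect this library's actual implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def identify_speaker(embedding, enrolled, threshold=0.7):
    """Return the enrolled name most similar to embedding, or "unknown" if none reaches threshold."""
    best_name, best_score = "unknown", threshold
    for name, reference in enrolled.items():
        score = cosine_similarity(embedding, reference)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Toy 3-dimensional embeddings; real speaker embeddings have far more dimensions.
enrolled = {"alice": [0.9, 0.1, 0.0], "bob": [0.0, 0.2, 0.9]}
print(identify_speaker([0.8, 0.2, 0.1], enrolled))  # closest to "alice"
print(identify_speaker([0.5, 0.5, 0.5], enrolled))  # below threshold for both
```

The threshold keeps voices that match no provided sample from being forced onto an enrolled name; a suitable value depends on the embedding model and would need tuning.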
This repository is in its early stages. Contributions and suggestions are welcome! See the demo above to explore the current functionality.
This project is licensed under the MIT License.