🎙️ Audio Diarization Library

🚧 Repository is under development! Major improvements are on the way. 🚧

🚀 Demo

Try the early version on Colab: https://colab.research.google.com/drive/1YIy4nhF3b6bgrD8rdtWPWimKyKTofBmh?authuser=1

📖 Overview

This library performs audio diarization: it splits an audio file into segments and labels which speaker is talking in each one. It supports several algorithms and models and is designed to be extended with new ones.
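To make the idea of "segmenting and labeling" concrete, here is a minimal sketch of how diarization output could be represented and tidied up. The `Segment` class, the `merge_adjacent` helper, and the `max_gap` parameter are hypothetical names chosen for illustration; they are not taken from this repository's code.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One labeled stretch of audio (hypothetical structure, times in seconds)."""
    start: float
    end: float
    speaker: str


def merge_adjacent(segments, max_gap=0.5):
    """Merge consecutive segments of the same speaker that are at most max_gap seconds apart."""
    merged = []
    for seg in sorted(segments, key=lambda s: s.start):
        last = merged[-1] if merged else None
        if last and last.speaker == seg.speaker and seg.start - last.end <= max_gap:
            # Same speaker continues after a short pause: extend the previous segment.
            merged[-1] = Segment(last.start, max(last.end, seg.end), last.speaker)
        else:
            merged.append(seg)
    return merged


if __name__ == "__main__":
    raw = [Segment(0.0, 1.0, "A"), Segment(1.2, 2.0, "A"), Segment(2.1, 3.0, "B")]
    for seg in merge_adjacent(raw):
        print(f"{seg.start:.1f}-{seg.end:.1f}s: {seg.speaker}")
```

Merging short same-speaker fragments like this is a common post-processing step, but whether this library does so is an assumption.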

🛠️ Current Capabilities

- 🎧 Speaker Identification: Assigns names to different speakers based on provided samples.
- 🔧 Customizable Algorithms: Choose between various diarization methods.
- ⚙️ Base Functionality: A robust starting point for diarization tasks, with room for optimization.
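As an illustration of speaker identification from provided samples, the sketch below matches a segment's embedding vector against one reference embedding per named speaker using cosine similarity. It assumes embeddings have already been computed by some model; the function names, the `threshold` value, and the `"unknown"` fallback label are hypothetical and may differ from what this library actually does.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify_speaker(embedding, references, threshold=0.7):
    """Return the name of the closest reference speaker, or "unknown" if none reaches threshold.

    `references` maps a speaker name to the embedding of that speaker's sample (assumed format).
    """
    best_name, best_score = "unknown", threshold
    for name, ref in references.items():
        score = cosine_similarity(embedding, ref)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


if __name__ == "__main__":
    refs = {"alice": [1.0, 0.0], "bob": [0.0, 1.0]}
    print(identify_speaker([0.9, 0.1], refs))
```

The threshold keeps a segment from being forced onto a named speaker when it resembles none of the samples; a suitable value depends on the embedding model and would need tuning.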

🛤️ Planned Improvements

- 📊 Visualization: Add graphs and interactive visualizations for audio data and diarization results.
- 🧠 Advanced AI: Integrate machine learning (ML) and neural network (NN) features to improve accuracy.
- ☁️ Cloud Deployment: Deploy the library on cloud platforms (e.g., AWS, Google Cloud) with Docker containerization.
- 🔧 Performance: Optimize the code, switch to a faster ASR backend, and speed up silence removal (possibly without using an AI model for that step).
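One way silence could be removed without an AI model is a plain energy threshold: split the signal into fixed-size frames and drop those whose RMS level is too low. The sketch below assumes mono samples as floats in the range -1.0 to 1.0; `remove_silence`, `frame_size`, and `threshold` are hypothetical names and values, not part of this repository.

```python
import math


def remove_silence(samples, frame_size=400, threshold=0.01):
    """Keep only the frames whose RMS energy is at least `threshold`.

    `samples` is a list of floats in [-1.0, 1.0]; `frame_size` is in samples
    (400 samples is 25 ms at a 16 kHz sample rate).
    """
    kept = []
    for i in range(0, len(samples), frame_size):
        frame = samples[i:i + frame_size]
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        if rms >= threshold:
            kept.extend(frame)
    return kept


if __name__ == "__main__":
    audio = [0.0] * 800 + [0.5] * 400 + [0.0] * 400
    print(len(audio), "->", len(remove_silence(audio)))
```

This trades accuracy for speed: it runs in a single pass with no model, but it can misclassify quiet speech or loud background noise, so the threshold would need tuning on real recordings.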

🛡️ Technologies to Be Added

- Data Visualization: Tools like Plotly or Matplotlib for creating intuitive graphs.
- ML/NN: Advanced ML and NN techniques for state-of-the-art diarization.
- Scalable Deployment: Cloud-hosted solutions using Docker and Kubernetes.

🔨 Technologies Used (So Far)

- 🐍 Python: Core programming language.
- 🗣️ Audio Processing: Libraries like Whisper and PyTorch for speaker recognition and segmentation.
- 🛠️ Custom Algorithms: Implementation of base-level diarization logic.

📌 Notes

This repository is in its early stages. Contributions and suggestions are welcome! See the demo above to explore the current functionality.

📜 License

This project is licensed under the MIT License.

