This repository contains the code and resources for a speaker diarization project. Speaker diarization is the task of segmenting an audio recording into homogeneous regions based on speaker identity. This project aims to develop a speaker diarization system for two speakers using unsupervised deep learning, combining a neural network with clustering.
The repository is organized as follows:

- `data/`: Audio data used for training and evaluation.
- `src/`: Source code for the speaker diarization system. It includes the following classes and their functionalities:
  - `Diarization`: Applies speaker diarization to the data, with several options.
  - `Visualization`: Generates a diarization plot and an .mp4 animation clip of the speakers over time, synchronized with the audio.
  - `noise_clean`: Noise-cleaning filters to be applied to the audio.
  - `help_func`: Utility functions for diarization.
- `pretrained_models/`: Pretrained models or checkpoints that can be used for inference or fine-tuning. It is recommended to download them from Hugging Face.
- `utils/`: Utility scripts and helper functions used throughout the project.
- `LICENSE`: The license file for this project.
- `README.md`: This file, providing an overview of the project and instructions for getting started.
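As an illustration of the kind of noise-cleaning filter the `noise_clean` module applies, here is a minimal high-pass filter sketch. The function name, cutoff value, and use of `scipy` are assumptions for illustration, not the repository's actual API.

```python
# Hypothetical noise-cleaning filter sketch; names and parameters are
# illustrative assumptions, not the repository's actual API.
import numpy as np
from scipy.signal import butter, filtfilt

def highpass_clean(audio: np.ndarray, sample_rate: int,
                   cutoff_hz: float = 100.0) -> np.ndarray:
    """Attenuate low-frequency noise (hum, rumble) below cutoff_hz."""
    nyquist = sample_rate / 2.0
    b, a = butter(N=4, Wn=cutoff_hz / nyquist, btype="highpass")
    # filtfilt runs the filter forward and backward for zero phase shift
    return filtfilt(b, a, audio)
```

A filter like this would typically be applied to the waveform before diarization, so that low-frequency background noise does not leak into the speaker embeddings.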
To get started with this project, follow the steps below:

1. Clone the repository:

   ```shell
   git clone https://github.com/n242/HBL.git
   ```

2. Make sure you have Python 3.8+ installed.

3. Install the required dependencies. Please refer to the `requirements.txt` file for the list of dependencies. You can install them using the following command:

   ```shell
   pip install -r requirements.txt
   ```

4. Download or prepare the audio data for diarization and evaluation. Place the data in the `data/` directory following the instructions provided in the `data/README.md` file.

5. Train a model or download one from Hugging Face (see instructions below).

6. Add your input path to `main.py`, run the speaker diarization code, and visualize the data.
To use the pretrained models from Hugging Face:

1. Visit hf.co/pyannote/speaker-diarization and hf.co/pyannote/segmentation and accept the user conditions (only if requested).
2. Visit hf.co/settings/tokens to create an access token (only if you had to go through step 1).
3. Insert your auth_token into `main.py`.
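The steps above might be wired up in `main.py` along these lines. The function and variable names are placeholders, but the `Pipeline.from_pretrained` and `itertracks` calls follow the standard pyannote.audio 2.x API.

```python
# Hedged sketch of wiring an auth token into main.py; function and
# variable names are illustrative placeholders.

def diarize(audio_path: str, auth_token: str):
    """Run the pretrained pyannote pipeline and return (start, end, speaker) turns."""
    # Imported lazily: requires pyannote.audio installed and the model
    # access granted in the steps above.
    from pyannote.audio import Pipeline
    pipeline = Pipeline.from_pretrained(
        "pyannote/speaker-diarization", use_auth_token=auth_token
    )
    diarization = pipeline(audio_path)
    return [
        (turn.start, turn.end, speaker)
        for turn, _, speaker in diarization.itertracks(yield_label=True)
    ]

def turns_to_csv_rows(turns):
    """Format diarization turns as 'start,end,speaker' CSV rows."""
    return ["%.3f,%.3f,%s" % (s, e, spk) for s, e, spk in turns]
```

For example, `turns_to_csv_rows(diarize("data/interview.wav", token))` would yield rows ready to write to the output .csv.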
You can view an example output of an interview between two speakers, and the full video at: example interview.

When generating the .csv file, we assume the interviewer opens the interview.
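The interviewer-speaks-first convention can be sketched as follows: whichever cluster label owns the earliest segment is mapped to the interviewer, and the other to the interviewee. The function and label names are assumptions for illustration.

```python
# Illustration of the stated convention that the interviewer opens the
# interview; names are illustrative assumptions.

def label_speakers(segments):
    """segments: list of (start, end, cluster_label) tuples.

    Returns the same segments with cluster labels replaced by
    'Interviewer' / 'Interviewee', where the cluster owning the
    earliest segment is taken to be the interviewer.
    """
    first = min(segments, key=lambda seg: seg[0])
    interviewer = first[2]
    return [
        (start, end, "Interviewer" if label == interviewer else "Interviewee")
        for start, end, label in segments
    ]
```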
Contributions to this project are welcome. If you encounter any issues or have suggestions for improvements, please feel free to submit an issue or a pull request.
This project is licensed under the MIT License.
We relied upon the following:
Neta Oren and Faisal Omari. This project was part of the Computational Research of Human Behavior course, as part of their M.Sc. and B.Sc. in Computer Science at the University of Haifa. The project was performed under the supervision of Prof. Hagit Hel-Or.
