This repo contains the code for a collaborative project among three Drexel University graduate students who sought to create a tool that identifies speakers in multi-party dialogues using text-based features.
The FCC requirements for speaker identification in closed captioning are intended to ensure accessibility and equal participation for individuals with hearing impairments. Accurate speaker identification helps viewers who rely on closed captioning understand and follow conversations, enhancing their overall viewing experience.
We conducted five experiments using two pre-trained transformer-based models (DistilBERT and RoBERTa) to predict whether the speaker of a line of dialogue from the television show The Office was "Dwight" or "Not Dwight".
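The binary framing of the task can be sketched roughly as follows. This is a minimal illustration, not the project's actual preprocessing: the function names, the `(speaker, text)` row shape, and the placeholder rows are all assumptions made for the example.

```python
from typing import Iterable, List, Tuple

# The target speaker named in the task description.
TARGET_SPEAKER = "Dwight"


def label_line(speaker: str, target: str = TARGET_SPEAKER) -> int:
    """Return 1 if the line was spoken by the target speaker, else 0."""
    # Hypothetical normalization: ignore case and surrounding whitespace.
    return int(speaker.strip().lower() == target.lower())


def label_lines(
    rows: Iterable[Tuple[str, str]], target: str = TARGET_SPEAKER
) -> List[Tuple[str, int]]:
    """Map (speaker, text) rows to (text, label) pairs for a binary classifier."""
    return [(text, label_line(speaker, target)) for speaker, text in rows]


if __name__ == "__main__":
    # Placeholder rows for illustration only; these are not from the dataset.
    sample = [
        ("Dwight", "placeholder line A"),
        ("Jim", "placeholder line B"),
    ]
    for text, label in label_lines(sample):
        print(label, text)
```

Here label 1 stands for "Dwight" and 0 for "Not Dwight"; the text and label pairs are the kind of input a sequence classifier would be fine-tuned on.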
- "Dwight" or "Not Dwight"?
TASK |
---|
DATA |
DATA PREPROCESSING & VISUALIZATION |
MACHINE LEARNING |
PROJECT REPORT |
To rerun this project
- Clone this repository
- Set up your project directories as per the file tree (below)
- The model files for all five models total 4.9 GB, so download with caution.
- Step through the 02_transformer_model.ipynb notebook
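Given the size of the model files, it may be worth checking free disk space before downloading. The sketch below is one way to do that with the standard library; the helper name `has_room_for_models` is hypothetical, and it assumes 1 GB means 10^9 bytes.

```python
import shutil

# Total size of the five model files, as stated above.
MODEL_FILES_GB = 4.9


def has_room_for_models(path: str = ".", required_gb: float = MODEL_FILES_GB) -> bool:
    """Return True if the filesystem holding `path` has enough free space."""
    free_bytes = shutil.disk_usage(path).free
    # Assumption: 1 GB is treated as 10**9 bytes.
    return free_bytes >= required_gb * 1000**3


if __name__ == "__main__":
    if has_room_for_models():
        print("Enough free space for the model files.")
    else:
        print("Less than 4.9 GB free; consider freeing space before downloading.")
```

Run it from the directory where the model files will be stored, or pass that directory as `path`.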
Thanks to the good people at:
Kelsey Fox
Justin Minnion
Chris Chavez
Please refer to:
Hugging Face Privacy Policy for the terms of use of Hugging Face products.
Kaggle Privacy Policy for the terms of use of Kaggle products.