DSCI 691 NLP Group Project

That's What Who Said

This repo contains the code for a collaborative project by three Drexel University graduate students that sought to create a tool for identifying speakers in multi-party dialogues using text-based features.

Our motivation was to determine `who said what`.

The FCC requirements for speaker identification in closed captioning are intended to ensure accessibility and equal participation for individuals with hearing impairments. When speakers are accurately identified, viewers who rely on closed captioning can better understand and follow conversations, which enhances their overall viewing experience.

Description

We conducted five experiments using two pre-trained transformer-based models (DistilBERT and RoBERTa) to predict whether the speaker of a line of dialogue from the television show The Office was "Dwight" or "Not Dwight".
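The binary framing described above can be sketched as a simple labeling step. This is a minimal illustration, not the project's actual preprocessing: the function names, the (speaker, line) tuple layout, and the placeholder lines are all assumptions made for the example.

```python
# Hypothetical sketch: turn (speaker, line) pairs into binary labels.
# The real dataset's structure and column names may differ.

def label_speaker(speaker: str, target: str = "Dwight") -> int:
    """Return 1 if the line was spoken by the target character, else 0."""
    return int(speaker.strip().lower() == target.lower())


def build_examples(rows):
    """Convert (speaker, line) pairs into (text, label) pairs."""
    return [(line, label_speaker(speaker)) for speaker, line in rows]


# Placeholder rows, not taken from the project's data.
rows = [
    ("Dwight", "Placeholder line one."),
    ("Jim", "Placeholder line two."),
    ("dwight ", "Placeholder line three."),
]

examples = build_examples(rows)
labels = [label for _, label in examples]
print(labels)
```

Normalizing case and whitespace before comparing is one possible design choice here; it keeps inconsistently formatted speaker names from being labeled "Not Dwight" by accident.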

THE QUESTION:

"Dwight" or "Not Dwight"?

TASK
DATA
DATA PREPROCESSING & VISUALIZATION
MACHINE LEARNING
PROJECT REPORT

Installation & Usage

To rerun this project:

Clone this repository.
Set up your project directories as per the file tree (below).
Model files for all five models are 4.9 GB, so download with caution.
Step through the 02_transformer_model.ipynb notebook.
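Before stepping through the notebook, it may help to confirm that the working directory has the expected layout. The sketch below assumes the layout matches the repository's top-level entries (data, img, and the two notebooks); the helper name `missing_entries` is hypothetical and is not part of the project.

```python
from pathlib import Path

# Top-level entries taken from the repository listing; this is an
# assumption about what the notebooks expect to find.
EXPECTED_ENTRIES = [
    "data",
    "img",
    "01_preprocess.ipynb",
    "02_transformer_model.ipynb",
]


def missing_entries(root="."):
    """Return the expected entries that do not exist under root."""
    root_path = Path(root)
    return [name for name in EXPECTED_ENTRIES if not (root_path / name).exists()]


if __name__ == "__main__":
    missing = missing_entries(".")
    if missing:
        print("Missing entries:", ", ".join(missing))
    else:
        print("All expected entries are present.")
```

The check only tests for existence, so it does not verify that the data or model files themselves are complete.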

Credits

Thanks to the good people at:

Deepnote

Kaggle

Hugging Face

Our Team

Kelsey Fox

GitHub

Justin Minnion

Chris Chavez

License

Please refer to:

Hugging Face Privacy Policy, for the terms governing use of Hugging Face's products.

Kaggle Privacy Policy, for the terms governing use of Kaggle's products.

Repository files (Zu1uDe1ta/thats-what-who-said):

data
img
01_preprocess.ipynb
02_transformer_model.ipynb
README.md