Researching audio perturbation for speech-to-text model poisoning

spoonmilk/team-yell

Team Yell: Adversarial Learning on Speech-to-text Models

Read the papers!

See ./team_yell_final_paper.pdf and ./team_yell_poster.pdf

File Structure

  • ./docs/ for internal project documentation
  • ./src/ for project code
  • ./src/models/ for perturbation models (closed and open)
  • ./src/attacks/ for attack optimization functions
  • ./src/utilities/ for helper functions shared across the project
  • ./src/sketchpad/ for experimentation/fiddling about with Whisper
  • ./src/testing/ for model testing functions

All papers are in the top-level directory.

Dependencies & Development

REQUIRED: Python 3.12 OR LOWER

At the time of writing, PyTorch does not support Python 3.13. Oscar uses 3.9.
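To catch an unsupported interpreter before installing anything, a small check like the one below can help. It is only a sketch: the name `is_supported` and the constant `MAX_SUPPORTED` are made up for illustration and are not part of the repository. The upper bound comes from the requirement stated above.

```python
import sys

# Upper bound taken from the requirement above: Python 3.12 or lower.
# MAX_SUPPORTED and is_supported are hypothetical names, not repository code.
MAX_SUPPORTED = (3, 12)


def is_supported(version=None):
    """Return True if the (major, minor) version is Python 3 and at most 3.12."""
    major, minor = tuple(version or sys.version_info)[:2]
    return major == 3 and (major, minor) <= MAX_SUPPORTED
```

Calling `is_supported()` with no argument checks the running interpreter; passing a tuple such as `(3, 9)` checks a specific version.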

Create a virtual environment in .venv and activate it

python3.9 -m venv .venv
source .venv/bin/activate

Install dependencies from requirements.txt

pip install -r requirements.txt

Deactivate environment

deactivate

Data Acquisition

Download LibriSpeech Dataset

# cd team-yell
wget https://us.openslr.org/resources/12/dev-clean.tar.gz
tar -xzf dev-clean.tar.gz
rm dev-clean.tar.gz

Preprocess and Save Waveforms and Transcripts

python3 -m src.utilities.preprocess_wav

Verify that a "data" directory was created in ./src/ and that it contains two files.
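The check above can also be scripted. The sketch below assumes it is run from the repository root, so that the directory is reachable as `src/data`. The function name `check_data_dir` is hypothetical, and the script does not assume any particular file names because the README does not list them.

```python
from pathlib import Path


def check_data_dir(data_dir="src/data", expected_files=2):
    """Return the sorted file names in data_dir, raising if the layout looks wrong.

    The default path assumes the current directory is the repository root.
    """
    path = Path(data_dir)
    if not path.is_dir():
        raise FileNotFoundError(f"{path} does not exist; rerun the preprocessing step")
    files = sorted(p.name for p in path.iterdir() if p.is_file())
    if len(files) != expected_files:
        raise RuntimeError(
            f"expected {expected_files} files in {path}, found {len(files)}: {files}"
        )
    return files
```

Running `print(check_data_dir())` from the repository root lists the two files, or raises an error describing what is missing.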

Testing

API Key Set Up

Create API keys with AssemblyAI, Gladia, and Speechmatics, and put them in a .env file in the ./src/testing/ directory.

# cd team-yell/src/testing
touch .env

Format for .env should be:

AAI_API_KEY = "<api_key>"
GLADIA_API_KEY = "<api_key>"
SPEECHMATICS_API_KEY = "<api_key>"
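Before running the tests, it can be useful to confirm that all three keys are present and that no `<api_key>` placeholder is left. The sketch below is a standalone check using only the standard library. It is not how the repository loads the file (the README does not say), and the names `read_env` and `missing_keys` are hypothetical.

```python
from pathlib import Path

# Key names copied from the .env format above.
REQUIRED_KEYS = ("AAI_API_KEY", "GLADIA_API_KEY", "SPEECHMATICS_API_KEY")


def read_env(path):
    """Parse lines of the form KEY = "value" into a dict, skipping blanks and comments."""
    values = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(path):
    """Return the required keys that are absent, empty, or still the placeholder."""
    values = read_env(path)
    return [k for k in REQUIRED_KEYS if values.get(k, "") in ("", "<api_key>")]
```

For example, `missing_keys("src/testing/.env")` returns an empty list when every key has been filled in.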

Training

Adjust the hyperparameters in ./src/attacks/es_optim.py as needed, then run:

python3 -m src.attacks.es_optim
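For readers unsure what kind of hyperparameters such a script might expose, here is a generic illustration. It assumes that "es" in `es_optim` stands for evolution strategies, which the README does not confirm. It is not the repository's code: the function `es_minimize`, its parameters (`population`, `sigma`, `lr`, `steps`), and the toy loss are all hypothetical. A real attack would score perturbed audio against a speech-to-text model rather than a quadratic.

```python
import random


def es_minimize(loss, dim, population=20, sigma=0.1, lr=0.05, steps=200, seed=0):
    """Minimize loss(list[float]) with a basic antithetic evolution strategy.

    Hypothetical illustration only; all parameter names are assumptions.
    """
    rng = random.Random(seed)
    theta = [0.0] * dim
    for _ in range(steps):
        grad = [0.0] * dim
        for _ in range(population):
            noise = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            plus = loss([t + sigma * n for t, n in zip(theta, noise)])
            minus = loss([t - sigma * n for t, n in zip(theta, noise)])
            # Antithetic estimate of the directional derivative along `noise`.
            scale = (plus - minus) / (2.0 * sigma)
            for i, n in enumerate(noise):
                grad[i] += scale * n / population
        # Step against the estimated gradient to reduce the loss.
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta


def toy_loss(x, target=(1.0, -2.0, 0.5)):
    """Squared distance to a fixed target; stands in for a real attack objective."""
    return sum((a - b) ** 2 for a, b in zip(x, target))
```

In this sketch, `sigma` sets how far each candidate is perturbed, `population` sets how many candidate pairs are sampled per step, and `lr` scales the update. Calling `es_minimize(toy_loss, dim=3)` moves the parameters toward the target without ever computing an analytic gradient, which is why this family of methods suits black-box models.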

Running Tests

Uncomment the desired tests at the bottom of ./src/testing/test.py, then run:

python3 -m src.testing.test
