Riding the Polar Express to Convergence: Getting Attention on the Track to Stability

Follow the steps below to reproduce the results presented in our main paper.

Environment setup

Install Miniconda, then run the following commands:

# 1. Create a conda environment
conda env create -f environments.yaml
conda activate cs182-rep

[optional] To enter this environment automatically in every new shell, add the following line to the end of your .bashrc file: conda activate cs182-rep

Steps to Running Code

Start by changing directories to the GPT-opt directory:

cd GPT-opt

Step 1: Download the FineWeb dataset

Next, define the DATA_DIR environment variable, which sets where the FineWeb1B dataset will be stored.

export DATA_DIR="/absolute/path/to/data/dir"
python3 process_data.py --name fineweb1B
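Because the dataset location is taken from DATA_DIR, it can help to confirm the variable holds an absolute path before starting the download. The following is a minimal sketch, not part of the repository; the helper name is_absolute_path is hypothetical, and the check only looks for a leading slash.

```shell
# Hypothetical helper (not part of the repository): succeeds only if the
# argument is a non-empty path that starts with "/".
is_absolute_path() {
    case "$1" in
        /*) return 0 ;;
        *)  return 1 ;;
    esac
}

# Report whether DATA_DIR is usable; an unset variable is treated as empty.
if is_absolute_path "${DATA_DIR:-}"; then
    echo "DATA_DIR is set to ${DATA_DIR}"
else
    echo "DATA_DIR must be set to an absolute path" >&2
fi
```

If the second message appears, re-run the export line with no spaces around the equals sign.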

Step 2: Login to wandb

Run wandb login to log in to your wandb account.
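To check whether credentials are already in place before launching a sweep, a sketch like the one below may be useful. It assumes the usual wandb behavior of reading the WANDB_API_KEY environment variable or an api.wandb.ai entry written to ~/.netrc by wandb login; the helper name wandb_credentials_present is hypothetical.

```shell
# Hypothetical helper: succeeds if a wandb API key appears to be available,
# either through WANDB_API_KEY or through an api.wandb.ai entry in ~/.netrc.
wandb_credentials_present() {
    [ -n "${WANDB_API_KEY:-}" ] && return 0
    [ -f "${HOME}/.netrc" ] && grep -q "api.wandb.ai" "${HOME}/.netrc"
}

if wandb_credentials_present; then
    echo "wandb credentials found"
else
    echo "no wandb credentials found; run: wandb login" >&2
fi
```

This only checks that credentials exist locally; it does not verify that the key is valid.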

Step 3: Run a sweep

Execute the following commands to run a simple sweep. NOTE: Your GPU must have at least 20 GB of VRAM to complete all the runs without out-of-memory errors.

# Create a new wandb sweep (this prints the sweep ID)
wandb sweep sweeps/phase1-safety-sweep.yaml
# Start an agent for the sweep, using the sweep ID printed above
CUDA_VISIBLE_DEVICES=0 wandb agent your/wandb/sweep/id

As an example, your command should look like the following:

CUDA_VISIBLE_DEVICES=0 wandb agent justin_yang-university-of-california-berkeley/cs182-project-GPT-opt/2jfn4wyv
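The 20 GB VRAM requirement noted above can be checked before starting the agent. This is a rough sketch that assumes nvidia-smi is installed and that GPU 0 is the device in use; the threshold of 20000 MiB is an approximation of 20 GB chosen for illustration, and has_enough_vram is a hypothetical helper.

```shell
# Assumed threshold: 20 GB approximated as 20000 MiB (adjust as needed).
MIN_VRAM_MIB=20000

# Hypothetical helper: succeeds if the given MiB value meets the threshold.
has_enough_vram() {
    [ "$1" -ge "$MIN_VRAM_MIB" ] 2>/dev/null
}

if command -v nvidia-smi >/dev/null 2>&1; then
    # Total memory of GPU 0 in MiB, printed without header or units.
    total_mib=$(nvidia-smi --query-gpu=memory.total \
        --format=csv,noheader,nounits --id=0 2>/dev/null)
    if has_enough_vram "$total_mib"; then
        echo "GPU 0 reports ${total_mib} MiB: meets the assumed threshold"
    else
        echo "GPU 0 reports '${total_mib}' MiB: some runs may run out of memory" >&2
    fi
else
    echo "nvidia-smi not found; skipping the VRAM check" >&2
fi
```

Change the --id value if CUDA_VISIBLE_DEVICES points at a different GPU.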

The figure below shows a run that has started successfully:

Reproducing Plots

Download data [if you want to use your own sweep data]

Download your custom wandb sweep data to a local computer using the notebook download-data.ipynb.

Generate plots [use our included data]

If you are using your own data, the notebook contains code cells explaining how to map sweep names to their unique wandb sweep_id values.

The data from our sweeps is provided in the plotting/data folder.

To create the plots, run this notebook: analyze-data.ipynb

The notebook is set up for the cells to be run in order using our data.
