Follow the steps below to reproduce the results presented in our main paper.
Install Miniconda, then run the following commands:
# 1. Create a conda environment
conda env create -f environments.yaml
conda activate cs182-rep
[Optional] Add the following line to the end of your .bashrc file to activate this environment automatically whenever a new shell starts:
conda activate cs182-rep
Start by changing directories to the GPT-opt directory:
cd GPT-opt
Next, define the DATA_DIR environment variable. This sets where the FineWeb1B dataset will be stored.
export DATA_DIR="absolute/path/to/data/dir"
python3 process_data.py --name fineweb1B
Run wandb login to log in to your wandb account.
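Before running process_data.py, it can help to confirm that DATA_DIR is actually set and absolute, since a misplaced space in the export line leaves it undefined. The following is a minimal sketch using only the standard library; the helper name resolve_data_dir is hypothetical and not part of this repository.

```python
import os
from pathlib import Path


def resolve_data_dir(env=os.environ):
    """Return DATA_DIR as a Path, or raise if it is unset or not absolute.

    Hypothetical helper for illustration; process_data.py may read
    DATA_DIR differently.
    """
    raw = env.get("DATA_DIR", "").strip()
    if not raw:
        raise RuntimeError("DATA_DIR is not set; export it first")
    path = Path(raw).expanduser()
    if not path.is_absolute():
        raise RuntimeError(f"DATA_DIR should be an absolute path, got: {raw!r}")
    return path


# Example usage (run after `export DATA_DIR=...`):
#   print(resolve_data_dir())
```

Passing the environment as an argument keeps the check easy to try out with a plain dictionary instead of the real shell environment.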
Execute the following sequence of commands to run a simple sweep. NOTE: Your GPU must have at least 20 GB of VRAM to execute all the runs without out-of-memory errors.
# Create a new wandb sweep
wandb sweep sweeps/phase1-safety-sweep.yaml
# Run the sweep command
CUDA_VISIBLE_DEVICES=0 wandb agent your/wandb/sweep/id
As an example, your command should look like the following:
CUDA_VISIBLE_DEVICES=0 wandb agent justin_yang-university-of-california-berkeley/cs182-project-GPT-opt/2jfn4wyv
The figure below depicts a successful initiation of a run:

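To check the 20 GB VRAM requirement before launching an agent, one option is to query nvidia-smi from Python. This is a rough sketch, not part of the repository: the function names are hypothetical, and the 20,000 MiB threshold is an approximate reading of "20 GB" that you may want to adjust.

```python
import subprocess

# Approximate reading of the 20 GB requirement, in MiB (an assumption).
MIN_VRAM_MIB = 20_000


def parse_total_memory_mib(nvidia_smi_output):
    """Parse one integer (MiB) per GPU from nvidia-smi's CSV output
    produced with --format=csv,noheader,nounits."""
    return [int(line.strip())
            for line in nvidia_smi_output.splitlines()
            if line.strip()]


def query_total_memory_mib():
    """Ask nvidia-smi for the total memory of each GPU it reports, in MiB."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_total_memory_mib(result.stdout)


def gpus_with_enough_vram(memory_mib, minimum_mib=MIN_VRAM_MIB):
    """Return the indices (in nvidia-smi order) of GPUs meeting the threshold."""
    return [i for i, mib in enumerate(memory_mib) if mib >= minimum_mib]


# Example usage on a machine with an NVIDIA driver installed:
#   print(gpus_with_enough_vram(query_total_memory_mib()))
```

The indices returned follow nvidia-smi's ordering, which normally matches the numbering used by CUDA_VISIBLE_DEVICES but is worth confirming on multi-GPU machines.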
Download your custom wandb sweep data to a local computer using the notebook download-data.ipynb.
If you are using your own data, the notebook contains code cells that explain how to map sweep names to their unique wandb sweep_id values.
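As a rough illustration of what such a mapping can look like, the sketch below pairs a readable sweep name with a sweep ID and builds the entity/project/sweep_id path that wandb uses to address a sweep. The dictionary contents, the helper names, and the fields collected are assumptions made for this example; the cells in download-data.ipynb may be organized differently.

```python
# Hypothetical mapping from readable sweep names to wandb sweep IDs.
# Replace the placeholder value with the ID printed by `wandb sweep`.
SWEEP_IDS = {
    "phase1-safety-sweep": "your_sweep_id",
}


def sweep_path(name, entity, project, sweep_ids=SWEEP_IDS):
    """Build the 'entity/project/sweep_id' path for a named sweep."""
    return f"{entity}/{project}/{sweep_ids[name]}"


def fetch_run_configs(path):
    """Collect the id, name, and config of every run in a sweep.

    Requires the wandb package and a prior `wandb login`.
    """
    import wandb  # imported lazily so sweep_path works without wandb installed

    sweep = wandb.Api().sweep(path)
    return [
        {"id": run.id, "name": run.name, "config": dict(run.config)}
        for run in sweep.runs
    ]


# Example usage (entity and project are placeholders):
#   path = sweep_path("phase1-safety-sweep", "your-entity", "your-project")
#   runs = fetch_run_configs(path)
```

Keeping the name-to-ID mapping in one dictionary means plots can refer to sweeps by readable names while the IDs stay in a single place.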
The data from our own sweeps is provided in the plotting/data folder.
To create the plots, run this notebook: analyze-data.ipynb
The notebook is set up for the cells to be run in order using our data.