This repository contains the official implementation of GraLoRA.
- [2025/11] GraLoRA is now available in the HuggingFace PEFT Library 🚀
- [2025/09] GraLoRA is accepted by NeurIPS 2025 as a spotlight🌟
- [2025/05] GraLoRA is open-sourced!
To install the required dependencies:

```bash
conda create -n gralora python=3.10 -y
conda activate gralora
pip3 install -r requirements.txt
pip3 install -r additional_requirements.txt
pip3 install -e ./peft --config-settings editable_mode=compat
pip3 install -e ./lm-evaluation-harness --config-settings editable_mode=compat
pip3 install -e ./bigcode-evaluation-harness --config-settings editable_mode=compat
```

For a stable setup, we recommend working on top of the `nvcr.io/nvidia/cuda:12.1.0-cudnn8-devel-ubuntu22.04` Docker image.
- Code Generation Training Dataset

  ```bash
  python3 scripts/tools/dl_dataset.py ise-uiuc/Magicoder-Evol-Instruct-110K data/ise-uiuc/Magicoder-Evol-Instruct-110K
  ```

- Commonsense Reasoning Training Dataset

  ```bash
  python3 scripts/tools/dl_dataset.py zwhe99/commonsense_170k data/zwhe99/commonsense_170k
  ```
- Code Generation: This dataset is automatically downloaded when running the evaluation script.

- Commonsense Reasoning: Download the dataset manually from the LLM-Adapters Dataset and store it in `./data/`.
Training scripts for all main experiments are available in ./scripts/train.
To reproduce a specific experiment, run:
```bash
./scripts/train/$TARGET_TRAIN_SCRIPT
```

A detailed explanation of how to run the scripts can be found in the README_TRAIN.md file.
We provide separate scripts for each task. Their behavior is nearly identical, but the Alpaca-chat instruction formats differ slightly; we adopted the chat formats used by the original repositories:
- RaSA for Code Generation
- LLM-Adapters for Commonsense Reasoning
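For reference, both of those chat variants derive from the standard Alpaca instruction template. A minimal sketch of that base template is below; the exact formats used by RaSA and LLM-Adapters may differ slightly, so treat this as an illustration and check those repositories for the authoritative versions:

```python
def alpaca_prompt(instruction: str, inp: str = "") -> str:
    """Standard Alpaca instruction template.

    NOTE: this is the generic Alpaca format, not necessarily the exact
    chat variant used by RaSA or LLM-Adapters.
    """
    if inp:
        return (
            "Below is an instruction that describes a task, paired with an "
            "input that provides further context. Write a response that "
            "appropriately completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{inp}\n\n"
            "### Response:\n"
        )
    return (
        "Below is an instruction that describes a task. Write a response "
        "that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )

print(alpaca_prompt("Add 2 and 3."))
```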
Evaluation scripts for all tasks are located in ./scripts/eval.
To run an evaluation:
```bash
./scripts/eval/$TARGET_EVAL_SCRIPT
```

A detailed explanation of how to run the scripts can be found in the README_EVAL.md file.
```bash
EXP_DIR=path_to_this_experiment

python3 code_evaluation.py \
    --model $PATH_TO_BASE_MODEL \
    --peft_model $PATH_TO_PEFT_MODEL \
    --tasks humanevalsynthesize-python \
    --prompt alpaca-chat \
    --do_sample True \
    --temperature 0.2 \
    --n_samples 50 \
    --batch_size 20 \
    --max_length 2048 \
    --allow_code_execution \
    --precision bf16 \
    --metric_output_path $EXP_DIR/metric_output_path.json \
    --save_generations \
    --save_generations_path $EXP_DIR/save_generations_path.json \
    --generation_only

python3 code_evaluation.py \
    --tasks humanevalplus \
    --n_samples 50 \
    --num_workers 48 \
    --timeout 20 \
    --k 1 5 10 \
    --allow_code_execution \
    --metric_output_path $EXP_DIR/metric_output_path.json \
    --load_generations_path $EXP_DIR/save_generations_path_humanevalsynthesize-python.json \
    --results_path $EXP_DIR/results.json
```

The final result will be saved at `$EXP_DIR/results.json`.
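The scoring step draws on 50 generations per problem (`--n_samples 50`) and reports pass@k for k = 1, 5, 10. For reference, pass@k is conventionally computed with the unbiased estimator introduced with HumanEval; a minimal sketch:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: the probability that at least one of k
    samples drawn without replacement from n generations is correct,
    given that c of the n generations are correct."""
    if n - c < k:
        # Fewer than k incorrect samples exist, so any draw of k
        # must contain a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 50 samples, 30 of them correct.
print(round(pass_at_k(50, 30, 1), 3))  # 0.6 (pass@1 equals the correct fraction)
```

Note that pass@1 with sampling reduces to the average fraction of correct generations, which is why a low temperature (here 0.2) is commonly used for this metric.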
```bash
EXP_DIR=path_to_this_experiment
TASK_NAME=target_task_name

python3 commonsense_evaluate.py \
    --base_model $PATH_TO_BASE_MODEL \
    --peft_model $PATH_TO_PEFT_MODEL \
    --dataset $TASK_NAME \
    --bf16 | tee -a ${EXP_DIR}/${TASK_NAME}.txt
```

The final result will be saved at `$EXP_DIR/${TASK_NAME}.txt`.
GraLoRA achieves superior performance on the HumanEval+ task:
```bibtex
@article{jung2025gralora,
    title={GraLoRA: Granular Low-Rank Adaptation for Parameter-Efficient Fine-Tuning},
    author={Yeonjoon Jung and Daehyun Ahn and Hyungjun Kim and Taesu Kim and Eunhyeok Park},
    year={2025},
    eprint={2505.20355},
    archivePrefix={arXiv},
    primaryClass={cs.LG},
    url={https://arxiv.org/abs/2505.20355},
}
```

This work builds upon the contributions of the following repositories:

