This repository contains the code for the paper “C$^2$-Cite: Contextual-Aware Citation Generation for Attributed Large Language Models”. The project is built on the open-source repository "TUDB-Labs/MoE-PEFT". C$^2$-Cite is a model that answers questions with inline citation markers.
- config: Configurations for training and evaluation.
- c2cite/backends: Backend tools for GMoE.
- c2cite/common: Implementation of the Transformer architecture.
- c2cite/models: Implementations of several families of Transformer-based models.
- c2cite/tasks: Implementations of the datasets.
- c2cite.py: The entry point of this project.
- python >= 3.11
- pytorch >= 2.1.2
- Other dependencies: see requirements.txt
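Assuming a Python 3.11 environment is already active, the dependencies can be installed with pip. This is a minimal sketch; the file check only guards against running it outside the repository root:

```shell
# Install the pinned dependencies with pip (sketch; assumes the repo root as cwd).
if [ -f requirements.txt ]; then
    python3 -m pip install -r requirements.txt
fi
```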
- [Llama-3-8B-inst]
To obtain the training dataset proposed in the paper "Towards Faithful and Robust LLM Specialists for Evidence-Based Question-Answering", download SynSciQA here, and place SynSciQA.json, SynSciQA+.json, and SynSciQA++.json in ./dataset/SynSciQA.
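The placement step above can be sketched as a few shell commands, assuming the three JSON files were downloaded into the current directory (the existence check simply skips any file that is not there yet):

```shell
# Create the expected dataset directory and move the downloaded splits into it.
mkdir -p ./dataset/SynSciQA
for f in SynSciQA.json SynSciQA+.json SynSciQA++.json; do
    # Skip files that have not been downloaded yet.
    if [ -f "$f" ]; then
        mv "$f" ./dataset/SynSciQA/
    fi
done
```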
We evaluate our model and the baselines using ALCE. To obtain the evaluation datasets, run:

```bash
bash download_test_data.sh
```

Replace [base model] and [train/evaluate config] in the commands below with the path to the base model and a configuration file from the "config" folder. To start training, run:
```bash
python c2cite.py --dir ./checkpoint --log_file ./logs --verbose --seed 42 --attn_impl eager --base_model [base model] --config [train/evaluate config] --device cuda:0
```

After training, run the evaluation step with the command below:
```bash
python c2cite.py --dir ./checkpoint --log_file ./logs --verbose --seed 42 --attn_impl eager --base_model [base model] --config [train/evaluate config] --device cuda:0 --evaluate
```

Note: Do not change the settings in the training config after the training step, or evaluation will not find the correct adapter.
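Since the two commands share every flag except --evaluate, they can be wrapped in a small script. This is only a sketch: the variable names are illustrative, the bracketed placeholders still need to be filled in, and the file check guards against running outside the repository root:

```shell
# Hypothetical wrapper around the training and evaluation commands above.
# BASE_MODEL and CONFIG are placeholders to fill in before running.
BASE_MODEL="[base model]"
CONFIG="[train/evaluate config]"
COMMON="--dir ./checkpoint --log_file ./logs --verbose --seed 42 --attn_impl eager --device cuda:0"
if [ -f c2cite.py ]; then
    # Train, then evaluate with the SAME config so the adapter is found.
    python c2cite.py $COMMON --base_model "$BASE_MODEL" --config "$CONFIG"
    python c2cite.py $COMMON --base_model "$BASE_MODEL" --config "$CONFIG" --evaluate
fi
```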