This repo introduces the full process of building an mRAG (Multimodal Retrieval-Augmented Generation) application and provides detailed explanations of its key principles.
Please refer to my blog for the code explanation and complete details.
- Install environment
pip install -r requirements.txt
All the required data is already listed in the download script.
bash download_data.sh
If you need to download the data in parts, please refer to the script comments.
bash download_models.sh
- single GPU
CUDA_VISIBLE_DEVICES=0 python qwen25vl_sft.py
- multi GPU
accelerate launch --config_file accelerate_config.yaml qwen25vl_sft.py
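The launch command above uses relative paths, which resolve against the current working directory. As a minimal sketch, assuming the config file and training script sit in the repository root, the same command can be assembled with absolute paths. The helper name `build_launch_command` is hypothetical and not part of this repo.

```python
import shlex
from pathlib import Path


def build_launch_command(repo_root: Path) -> list:
    """Return the accelerate launch command with absolute file paths.

    Assumption: accelerate_config.yaml and qwen25vl_sft.py are both
    located directly under repo_root.
    """
    root = repo_root.resolve()
    return [
        "accelerate", "launch",
        "--config_file", str(root / "accelerate_config.yaml"),
        str(root / "qwen25vl_sft.py"),
    ]


if __name__ == "__main__":
    # Assumption: this file is executed from the repository root.
    cmd = build_launch_command(Path.cwd())
    # Print a shell-safe command line that can be pasted after `nohup`.
    print(shlex.join(cmd))
```

The printed line can be copied into a shell as is. The sketch only builds the command and does not start training.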
# When running in the background, it is best to use absolute paths.
nohup accelerate launch --config_file accelerate_config.yaml qwen25vl_sft.py > logs/output_pt.log 2>&1 &
- Data Synthesis
CUDA_VISIBLE_DEVICES=0 python mini_vlm/qwen25vl_mRAG_eval_data.py
- Evaluate
Evaluation requires a DeepSeek API key.
CUDA_VISIBLE_DEVICES=0 python mini_vlm/qwen25vl_mRAG_eval.py
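Because the evaluation step depends on a DeepSeek API key, a quick check before launching can save a failed run. The sketch below assumes the key is supplied through an environment variable named `DEEPSEEK_API_KEY`. That name and the helper `require_api_key` are assumptions for illustration, since the evaluation script may read the key in a different way.

```python
import os


def require_api_key(var_name: str = "DEEPSEEK_API_KEY") -> str:
    """Return the API key from the environment, or raise if it is unset.

    Assumption: the variable name is hypothetical; check the evaluation
    script for the name it actually expects.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set {var_name} before running the evaluation script."
        )
    return key
```

Calling `require_api_key()` at the top of a wrapper script stops early with a clear message instead of failing partway through the evaluation.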