


mRAG: Multimodal RAG — Paper Q&A System Based on Qwen2VL mini-4o + Evaluation

This repo introduces the full process of building an mRAG (Multimodal Retrieval-Augmented Generation) application and provides detailed explanations of its key principles.


Please refer to my blog for the code explanation and complete details.

Python Environment

  • Install the dependencies:
pip install -r requirements.txt
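
Optionally, you can sanity-check that PyTorch can see your GPU before training. This assumes torch is included in requirements.txt:

```python
# Quick sanity check that the GPU stack is usable before training.
# Assumes torch was installed via requirements.txt.
import torch

print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available:  {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
```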

Download Data

All the required data sources are listed in this script.

bash download_data.sh
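
If you prefer to fetch the data manually, a minimal sketch using huggingface_hub is shown below; the repo_id is only a placeholder, and the actual sources are the ones listed in download_data.sh:

```python
# Hypothetical manual alternative to download_data.sh: fetch a dataset
# snapshot from the Hugging Face Hub. The repo_id below is a placeholder,
# not the actual dataset used by this project.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="your-org/your-paper-qa-dataset",  # placeholder; see download_data.sh for the real sources
    repo_type="dataset",
    local_dir="data",
)
```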

Download Models

If you need to download the models individually, refer to the comments in the script.

bash download_models.sh
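
As a manual alternative, the sketch below pulls a Qwen2.5-VL checkpoint from the Hugging Face Hub; the model id is only an example, and the exact checkpoints this repo expects are listed in download_models.sh:

```python
# Hedged manual alternative to download_models.sh: pull a Qwen2.5-VL
# checkpoint from the Hugging Face Hub. The id below is an example, not
# necessarily the checkpoint the training scripts expect.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="Qwen/Qwen2.5-VL-3B-Instruct",  # example id; check download_models.sh for the one used here
    local_dir="models/Qwen2.5-VL-3B-Instruct",
)
```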

SFT

  • Single GPU
CUDA_VISIBLE_DEVICES=0 python qwen25vl_sft.py
  • Multi GPU
accelerate launch --config_file accelerate_config.yaml qwen25vl_sft.py

# For background execution, it's best to use absolute paths.
nohup accelerate launch --config_file accelerate_config.yaml qwen25vl_sft.py > logs/output_pt.log 2>&1 &
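
For orientation, here is a minimal sketch of the kind of model/processor setup qwen25vl_sft.py performs, assuming a recent transformers release that ships the Qwen2.5-VL classes; the actual dataset handling, training loop, and tuning strategy live in the script itself:

```python
# Minimal sketch of the model/processor setup behind qwen25vl_sft.py.
# Assumes a recent transformers version with Qwen2.5-VL support.
import torch
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

model_id = "Qwen/Qwen2.5-VL-3B-Instruct"  # example checkpoint; use the one fetched by download_models.sh

model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(model_id)

# One chat-format training example: a paper page image plus a Q&A pair.
# The image path and texts are placeholders for illustration only.
messages = [
    {"role": "user", "content": [
        {"type": "image", "image": "data/example_page.png"},
        {"type": "text", "text": "What dataset does this paper evaluate on?"},
    ]},
    {"role": "assistant", "content": [{"type": "text", "text": "Example answer."}]},
]
text = processor.apply_chat_template(messages, tokenize=False)
print(text)
```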

mRAG

  • Data Synthesis
CUDA_VISIBLE_DEVICES=0 python mini_vlm/qwen25vl_mRAG_eval_data.py
  • Evaluate

Requires a DeepSeek API key.

CUDA_VISIBLE_DEVICES=0 python mini_vlm/qwen25vl_mRAG_eval.py
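
For reference, below is a hedged sketch of a DeepSeek-as-judge call; DeepSeek exposes an OpenAI-compatible API, so the standard openai client can be pointed at it. The actual prompts and scoring rubric live in mini_vlm/qwen25vl_mRAG_eval.py, and the env var name DEEPSEEK_API_KEY is an assumption:

```python
# Hedged sketch of a DeepSeek-as-judge call (the real rubric and prompts live
# in mini_vlm/qwen25vl_mRAG_eval.py). DeepSeek's API is OpenAI-compatible,
# so the standard openai client works with a custom base_url.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],  # env var name is an assumption; export your key first
    base_url="https://api.deepseek.com",
)

def judge(question: str, reference: str, prediction: str) -> str:
    """Ask DeepSeek to grade a predicted answer against the reference answer."""
    response = client.chat.completions.create(
        model="deepseek-chat",
        messages=[
            {"role": "system", "content": "You are a strict grader. Reply with a score from 1 to 5 and a one-sentence reason."},
            {"role": "user", "content": f"Question: {question}\nReference answer: {reference}\nModel answer: {prediction}"},
        ],
    )
    return response.choices[0].message.content

print(judge("What dataset does the paper use?", "MS MARCO", "The paper evaluates on MS MARCO."))
```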
