
LMM_RAG_Workshop_GPU

Before you start

Make sure you have:

  1. The models folder. If you don't have it, download it from https://drive.google.com/drive/folders/11BEK-gFWjFB1Qb3mxHMg1OCYQn-QtV29?usp=drive_link. It should contain:

     /models
       /Layout
       /MFD
       /MFR
       README.MD

  2. EngineeringHistory3Books_text.parquet. If you don't have it, download it from https://drive.google.com/file/d/1DwXRLUqc7W4fLAtZR3XWiLva0Dc2VBAY/view?usp=sharing.

The conda environment used for this part is LMMRAGwithGPU (from computer 391).
For testing, you can use .env_for_testing as your .env file.
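
A minimal sketch to check the prerequisites before opening the notebooks; it assumes the downloads were placed at the repository root (adjust the paths if yours differ):

```python
import os

# Paths assume the Google Drive downloads sit at the repository root.
REQUIRED_DIRS = ["models/Layout", "models/MFD", "models/MFR"]
REQUIRED_FILES = ["EngineeringHistory3Books_text.parquet"]

missing = [d for d in REQUIRED_DIRS if not os.path.isdir(d)]
missing += [f for f in REQUIRED_FILES if not os.path.isfile(f)]

if missing:
    print("Missing prerequisites:", missing)
else:
    print("All prerequisites found.")
```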

Part 1: Database Preparation

This part includes:

  • image extraction
  • caption generation
  • text OCR

These three steps all use the same environment; run them in order, 1 -> 2 -> 3.

  1. imageextract.ipynb -> produces a folder of cropped images, a folder of full pages, and a .json file pairing each image with its page number
  2. captiongeneration.ipynb -> produces a .json file mapping each image to its generated caption
  3. textOCR.ipynb -> produces a .json file of the OCR text

After these three steps you will have:

  1. imagecaption.json
  2. text.json
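
For orientation, a minimal sketch of loading these two outputs; the field names in the comments are assumptions about the schema, which is actually defined by the notebooks above:

```python
import json

# Load the two Part 1 outputs. The record fields shown in the
# comments are illustrative; check the notebooks for the real keys.
with open("imagecaption.json", "r", encoding="utf-8") as f:
    image_captions = json.load(f)  # e.g. entries pairing an image path with its caption

with open("text.json", "r", encoding="utf-8") as f:
    ocr_text = json.load(f)        # e.g. entries pairing a page number with its OCR text

print(f"{len(image_captions)} captioned images, {len(ocr_text)} OCR entries")
```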

Part 2: Embedding and Searching

This part includes:

  • embed.ipynb

You may reuse the conda environment from Part 1.

Pipeline

  1. After Part 1 you have one .json file for the image dataset and one for the text dataset.
  2. Parquet files: run embed.ipynb to read the two .json files above, embed both datasets, and store the results in xxx_text.parquet and xxx_image.parquet (a minimal embedding sketch follows this list).
  3. RAG: run rag.ipynb to perform a vector search and get the RAG results.
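
A minimal sketch of step 2 for the text side, assuming sentence-transformers as the embedding library and a two-column parquet layout; embed.ipynb's actual model and schema may differ:

```python
import json

import pandas as pd
from sentence_transformers import SentenceTransformer

# Read the Part 1 text output; the "text" key is illustrative and
# should match whatever schema textOCR.ipynb actually writes.
with open("text.json", "r", encoding="utf-8") as f:
    records = json.load(f)
texts = [r["text"] for r in records]

# all-MiniLM-L6-v2 is a stand-in; swap in the model embed.ipynb uses.
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(texts, show_progress_bar=True)

# One row per chunk: the raw text plus its embedding vector.
df = pd.DataFrame({"text": texts, "embedding": list(embeddings)})
df.to_parquet("EngineeringHistory3Books_text.parquet")
```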

Part 3: Generation

Run rag.ipynb (a minimal retrieval-and-prompting sketch is below).
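
A minimal sketch of the retrieval-and-prompting step, assuming the parquet layout and embedding model from the sketch above; the query string and prompt template are illustrative, and rag.ipynb would pass the assembled prompt to the workshop's generation model:

```python
import numpy as np
import pandas as pd
from sentence_transformers import SentenceTransformer

# Load the embedded text dataset produced in Part 2.
df = pd.read_parquet("EngineeringHistory3Books_text.parquet")
matrix = np.vstack(df["embedding"].to_numpy())

# Embed the query with the same model used for the corpus.
model = SentenceTransformer("all-MiniLM-L6-v2")
query = "How did steam power change engineering practice?"  # illustrative query
q = model.encode([query])[0]

# Cosine similarity between the query and every stored chunk.
scores = matrix @ q / (np.linalg.norm(matrix, axis=1) * np.linalg.norm(q))
top_k = np.argsort(scores)[::-1][:3]

# Assemble a RAG prompt from the retrieved chunks; rag.ipynb would
# send this to whatever generation model the workshop uses.
context = "\n\n".join(df.iloc[i]["text"] for i in top_k)
prompt = f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"
print(prompt)
```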
