1- # 🖼️ ColPali Integration
1+ # 🖼️ ColVision Integration
22
3- ## Overview
3+ PDF retrieval pipeline using ColVision embeddings, stored in Milvus.
44
5- This module provides a complete pipeline for processing PDF documents with ColPali embeddings, storing them in a Milvus vector database, and performing semantic search.
5+ ## Installation
66
7- It is designed for efficient document retrieval and RAG applications.
7+ The `[colvision]` extra is mutually exclusive with `[process]`, so install it in a dedicated virtual environment.
8+
9+ ``` bash
10+ uv sync --extra colvision
11+ ```
12+
13+ ## Supported Models
14+
15+ | Model | `model_name` |
16+ | --- | --- |
17+ | ColPali v1.3 | `vidore/colpali-v1.3` |
18+ | ColQwen2 v1.0 | `vidore/colqwen2-v1.0` |
19+ | ColQwen2.5 v0.2 | `vidore/colqwen2.5-v0.2` |
20+ | ColGemma3 | `Cognitive-Lab/ColNetraEmbed` |
21+ | ColSmol 256M | `vidore/colSmol-256M` |
22+ | ColSmol 500M | `vidore/colSmol-500M` |
23+
24+ All models are installed with the single `[colvision]` extra.
25+
26+ The model/processor class is auto-detected from `model_name`, and the embedding dimension is inferred at every stage: from the loaded model at `process` and `retrieve` time, and from the parquet contents at `index` time.
27+
28+ ## Choosing a Model
29+
30+ Set `model_name` in the YAML config, or override it with the `-m` / `--model` CLI flag on the `process` and `retrieve` commands.
31+
32+ The pipeline runs in three steps: `process`, then `index`, then `retrieve`. If you use the
33+ `-m` / `--model` flag, it must be passed to both `process` and `retrieve`:
34+
35+ ``` bash
36+ # 1. Process PDFs into embeddings
37+ python3 -m mmore colvision process --config-file examples/colvision/config_process.yml -m vidore/colqwen2.5-v0.2
38+
39+ # 2. Index the embeddings into Milvus (no model needed here)
40+ python3 -m mmore colvision index --config-file examples/colvision/config_index.yml
41+
42+ # 3. Retrieve with the same model used at processing time
43+ python3 -m mmore colvision retrieve --config-file examples/colvision/config_retrieval.yml -m vidore/colqwen2.5-v0.2
44+ ```
45+
46+ > **Important:** the same model must be used for both `process` and `retrieve`; mixing models produces incorrect results.
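
A mismatch between the two steps can be caught before the retriever starts. The sketch below is only an illustration: it assumes both configs have already been parsed into dictionaries, and `effective_model` and `check_same_model` are hypothetical helpers, not part of mmore. It mirrors the rule stated above that a `-m` / `--model` value overrides `model_name` from the config.

```python
from typing import Optional


def effective_model(config: dict, cli_override: Optional[str] = None) -> str:
    """Return the model a command would use: the CLI flag wins over the config."""
    model = cli_override or config.get("model_name")
    if not model:
        raise ValueError("no model_name in config and no -m/--model override")
    return model


def check_same_model(process_model: str, retrieve_model: str) -> None:
    """Raise if processing and retrieval would use different models."""
    if process_model != retrieve_model:
        raise ValueError(
            f"model mismatch: processed with {process_model!r}, "
            f"retrieving with {retrieve_model!r}"
        )


# Hypothetical parsed configs; the process config still names the older model.
process_cfg = {"model_name": "vidore/colpali-v1.3"}
retrieve_cfg = {"model_name": "vidore/colqwen2.5-v0.2"}

# Overriding at process time makes both steps agree.
used_for_process = effective_model(process_cfg, "vidore/colqwen2.5-v0.2")
used_for_retrieve = effective_model(retrieve_cfg)
check_same_model(used_for_process, used_for_retrieve)  # passes silently
```

Without the override, `check_same_model` would raise, which is the situation the note above warns about.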
847
948## 🧭 Architecture
1049
@@ -17,28 +56,32 @@ The system consists of three main components:
1756## 📁 File Structure
1857
1958```
20- src/mmore/colpali/
21- ├── milvuscolpali.py # Milvus database management
59+ src/mmore/colvision/
60+ ├── model_utils.py # Model/processor class resolution
61+ ├── milvuscolvision.py # Milvus database management
2262├── run_index.py # Indexing pipeline
23- ├── run_process.py # PDF processing pipeline
63+ ├── run_process.py # PDF processing pipeline
2464├── run_retriever.py # Search and retrieval API
25- └── retriever.py # ColPaliRetriever class for RAG integration
65+ └── retriever.py # ColVisionRetriever class for RAG integration
2666```
2767
2868## 🚀 Quick Start
2969
3070### 1. Process PDFs into embeddings
3171
3272``` bash
33- python3 -m mmore colpali process --config-file examples/colpali/config_process.yml
73+ python3 -m mmore colvision process --config-file examples/colvision/config_process.yml
74+
75+ # Or override the model from the command line
76+ python3 -m mmore colvision process --config-file examples/colvision/config_process.yml --model vidore/colqwen2.5-v0.2
3477```
3578
3679**Example config (`config_process.yml`):**
3780``` yaml
3881data_path:
3982 - 'examples/sample_data/pdf'
4083output_path: "./output"
41- model_name: "vidore/colpali-v1.3"
84+ model_name: "vidore/colqwen2.5-v0.2"
4285skip_already_processed: true
4386num_workers: 5
4487batch_size: 8
@@ -47,7 +90,7 @@ batch_size: 8
4790### 2. Index embeddings into Milvus
4891
5092``` bash
50- python3 -m mmore colpali index --config-file examples/colpali/config_index.yml
93+ python3 -m mmore colvision index --config-file examples/colvision/config_index.yml
5194```
5295
5396**Example config (`config_index.yml`):**
@@ -57,7 +100,6 @@ milvus:
57100 db_path: ./output/milvus_data.db
58101 collection_name: pdf_pages
59102 create_collection: true
60- dim: 128
61103 metric_type: IP
62104```
63105
@@ -66,47 +108,31 @@ milvus:
66108#### Retrieval Server Mode
67109``` bash
68110# Start the retrieval API server
69- python3 -m mmore colpali retrieve --config-file examples/colpali/config_retrieval.yml
111+ python3 -m mmore colvision retrieve --config-file examples/colvision/config_retrieval.yml
70112```
71113
72114Or with a custom host and port:
73115``` bash
74- python3 -m mmore colpali retrieve --config-file examples/colpali/config_retrieval.yml --host 0.0.0.0 --port 8001
116+ python3 -m mmore colvision retrieve --config-file examples/colvision/config_retrieval.yml --host 0.0.0.0 --port 8001
75117```
76118
77119**Example config (`config_retrieval.yml`):**
78120``` yaml
79- db_path: "./milvus_data"
121+ db_path: "./output/milvus_data.db"
80122collection_name: "pdf_pages"
81- model_name: "vidore/colpali-v1.3"
123+ model_name: "vidore/colqwen2.5-v0.2"
82124top_k: 3
83- dim: 128
84- max_workers: 16
85125metric_type: "IP"
126+ max_workers: 16
86127text_parquet_path: "./output/pdf_page_text.parquet"
87128```
88129
89- #### Single Query Mode
90- ``` bash
91- # Run retrieval for a single query defined in the config file
92- python3 -m mmore colpali retrieve --config-file examples/colpali/config_retrieval_single.yml
93- ```
94-
95- **Example config (`config_retrieval_single.yml`):**
96- ``` yaml
97- mode: "single"
98- db_path: "./milvus_data"
99- collection_name: "pdf_pages"
100- model_name: "vidore/colpali-v1.3"
101- query: "What may lead to dysbiosis and inflammation?"
102- top_k: 5
103- ```
104130Host and port are specified via CLI flags (`--host` and `--port`), not in the config file.
105131
106132#### Batch Mode
107133``` bash
108134# Process queries from file
109- python3 -m mmore colpali retrieve --config-file examples/colpali/config_retrieval.yml --input-file queries.jsonl --output-file results.json
135+ python3 -m mmore colvision retrieve --config-file examples/colvision/config_retrieval.yml --input-file queries.jsonl --output-file results.json
110136```
111137
112138**Example queries file (`queries.jsonl`):**
@@ -119,20 +145,9 @@ Each line should be a JSON-encoded string (one query per line):
119145
120146Each line must be a valid JSON string, including quotes, since the file is parsed line by line with `json.loads()`.
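
The format can be illustrated with a short sketch that writes and reads such a file using only the standard library. `write_queries` and `read_queries` are hypothetical helpers written for this example, not mmore functions; skipping blank lines and rejecting non-string values are assumptions of the sketch, not documented behaviour.

```python
import json
import tempfile
from pathlib import Path
from typing import List


def write_queries(path: Path, queries: List[str]) -> None:
    """Write one JSON-encoded string per line (quotes included)."""
    with open(path, "w", encoding="utf-8") as f:
        for query in queries:
            f.write(json.dumps(query) + "\n")


def read_queries(path: Path) -> List[str]:
    """Parse the file line by line with json.loads, as described above."""
    queries = []
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # assumption: blank lines are ignored
            value = json.loads(line)  # fails on an unquoted query
            if not isinstance(value, str):
                raise ValueError(f"line {line_no}: expected a JSON string")
            queries.append(value)
    return queries


with tempfile.TemporaryDirectory() as tmp:
    queries_file = Path(tmp) / "queries.jsonl"
    write_queries(queries_file, ["What may lead to dysbiosis and inflammation?"])
    raw_first_line = queries_file.read_text(encoding="utf-8").splitlines()[0]
    loaded = read_queries(queries_file)
```

Here `raw_first_line` keeps its surrounding double quotes, which is what makes each line valid input for `json.loads()`.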
121147
122- **Example config (`config_retrieval.yml`):**
123- ``` yaml
124- db_path: "./milvus_data"
125- collection_name: "pdf_pages"
126- model_name: "vidore/colpali-v1.3"
127- top_k: 5
128- dim: 128
129- max_workers: 16
130- text_parquet_path: "./output/pdf_page_text.parquet"
131- ```
132-
133148## 🔧 Core Components
134149
135- ### MilvusColpaliManager
150+ ### MilvusColvisionManager
136151- manages local Milvus database operations
137152- handles collection creation and indexing
138153- provides efficient batch insertion
@@ -146,14 +161,14 @@ text_parquet_path: "./output/pdf_page_text.parquet"
146161
147162### PDF Processor
148163- converts PDF pages to images
149- - generates ColPali embeddings
164+ - generates ColVision embeddings
150165- handles parallel processing
151166- supports stop-and-resume workflows for large datasets
152167
153168**Processing Flow:**
1541691. Crawl PDF files from specified directories
1551702. Convert each page to high-resolution PNG
156- 3. Generate embeddings using ColPali model
171+ 3. Generate embeddings using the configured model
1571724. Store results in Parquet format
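
The crawl step and the stop-and-resume behaviour can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual `run_process.py` logic: `crawl_pdfs` and `pending_pdfs` are hypothetical names, and tracking processed files by path string is an assumption made for the example.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List, Set


def crawl_pdfs(data_paths: Iterable[str]) -> List[Path]:
    """Recursively collect PDF files under each directory, sorted for stable order."""
    found = set()
    for root in data_paths:
        for path in Path(root).rglob("*"):
            if path.is_file() and path.suffix.lower() == ".pdf":
                found.add(path)
    return sorted(found)


def pending_pdfs(
    pdfs: Iterable[Path],
    already_processed: Set[str],
    skip_already_processed: bool = True,
) -> List[Path]:
    """Drop files recorded by an earlier run, so an interrupted job can resume."""
    if not skip_already_processed:
        return list(pdfs)
    return [p for p in pdfs if str(p) not in already_processed]


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "nested").mkdir()
    (root / "a.pdf").write_bytes(b"")
    (root / "nested" / "b.pdf").write_bytes(b"")
    (root / "notes.txt").write_text("not a pdf")

    found_names = [p.name for p in crawl_pdfs([tmp])]
    done = {str(root / "a.pdf")}  # pretend a.pdf was handled in a previous run
    todo_names = [p.name for p in pending_pdfs(crawl_pdfs([tmp]), done)]
```

In this toy run only `b.pdf` remains to be processed, which is the effect `skip_already_processed: true` is meant to have on large datasets.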
158173
159174### Retriever
@@ -193,28 +208,25 @@ curl -X POST "http://localhost:8001/v1/retrieve" \
193208
194209### RAG Pipeline Integration
195210``` python
196- from mmore.colpali.retriever import ColPaliRetriever, ColPaliRetrieverConfig
197- from mmore.rag.pipeline import RAGPipeline, RAGConfig
211+ from mmore.colvision.retriever import ColVisionRetriever, ColVisionRetrieverConfig
198212
199- # Create ColPali retriever with text support
200- colpali_config = ColPaliRetrieverConfig(
213+ config = ColVisionRetrieverConfig(
201214 db_path="./output/milvus_data.db",
202215 collection_name="pdf_pages",
203- model_name="vidore/colpali-v1.3",
216+ model_name="vidore/colqwen2.5-v0.2",
204217 text_parquet_path="./output/pdf_page_text.parquet",
205218 top_k=3,
206- dim=128,
207219 max_workers=16,
208220 metric_type="IP",
209221)
210- colpali_retriever = ColPaliRetriever.from_config(colpali_config)
222+ retriever = ColVisionRetriever.from_config(config)
211223
212224# Use with RAG pipeline (requires LLM config)
213- # rag_config = RAGConfig(retriever=colpali_retriever, ...)
225+ # rag_config = RAGConfig(retriever=retriever, ...)
214226# rag_pipeline = RAGPipeline.from_config(rag_config)
215227```
216228
217- The `ColPaliRetriever` is a LangChain-compatible `BaseRetriever` that returns `Document` objects with:
229+ The `ColVisionRetriever` is a LangChain-compatible `BaseRetriever` that returns `Document` objects with:
218230- `page_content`: the text content from the PDF page, if `text_parquet_path` is provided
219231- `metadata`: contains `pdf_name`, `pdf_path`, `page_number`, `rank`, and `similarity` score
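
As an illustration of how these fields might be consumed, the sketch below turns retrieved pages into a context string with a source header per page. `Doc` is a stand-in dataclass with the same two attributes as a LangChain `Document`, so the example runs without extra dependencies; `format_context` is a hypothetical helper, and treating a smaller `rank` as a better match is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Doc:
    """Stand-in for a LangChain Document: page text plus metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_context(docs: List[Doc]) -> str:
    """Join retrieved pages into one prompt context, best rank first."""
    ordered = sorted(docs, key=lambda d: d.metadata["rank"])  # assumes rank 1 is best
    blocks = []
    for doc in ordered:
        meta = doc.metadata
        header = (
            f"[{meta['pdf_name']}, page {meta['page_number']}, "
            f"similarity {meta['similarity']:.3f}]"
        )
        blocks.append(f"{header}\n{doc.page_content}")
    return "\n\n".join(blocks)


# Made-up results, only to show the shape of the metadata.
docs = [
    Doc("Second page text", {"pdf_name": "b.pdf", "pdf_path": "docs/b.pdf",
                             "page_number": 7, "rank": 2, "similarity": 0.71}),
    Doc("First page text", {"pdf_name": "a.pdf", "pdf_path": "docs/a.pdf",
                            "page_number": 3, "rank": 1, "similarity": 0.83}),
]
context = format_context(docs)
```

Keeping the file name and page number next to each passage makes it easier for a downstream LLM answer to cite its sources.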
220232
@@ -282,13 +294,13 @@ The `ColPaliRetriever` is a LangChain-compatible `BaseRetriever` that returns `D
282294### Complete Workflow
283295``` bash
284296# 1. Process all PDFs in a directory
285- python3 -m mmore colpali process --config-file examples/colpali/config_process.yml
297+ python3 -m mmore colvision process --config-file examples/colvision/config_process.yml
286298
287299# 2. Index the embeddings
288- python3 -m mmore colpali index --config-file examples/colpali/config_index.yml
300+ python3 -m mmore colvision index --config-file examples/colvision/config_index.yml
289301
290302# 3. Start the API server
291- python3 -m mmore colpali retrieve --config-file examples/colpali/config_retrieval.yml
303+ python3 -m mmore colvision retrieve --config-file examples/colvision/config_retrieval.yml
292304
293305# 4. Query the system
294306curl -X POST "http://localhost:8001/v1/retrieve" \
@@ -299,13 +311,13 @@ curl -X POST "http://localhost:8001/v1/retrieve" \
299311### Alternative: Batch processing
300312``` bash
301313# 1. Process PDFs (same as above)
302- python3 -m mmore colpali process --config-file examples/colpali/config_process.yml
314+ python3 -m mmore colvision process --config-file examples/colvision/config_process.yml
303315
304316# 2. Index embeddings (same as above)
305- python3 -m mmore colpali index --config-file examples/colpali/config_index.yml
317+ python3 -m mmore colvision index --config-file examples/colvision/config_index.yml
306318
307319# 3. Run batch retrieval
308- python3 -m mmore colpali retrieve --config-file examples/colpali/config_retrieval.yml \
320+ python3 -m mmore colvision retrieve --config-file examples/colvision/config_retrieval.yml \
309321 --input-file queries.jsonl \
310322 --output-file results.json
311323```
@@ -319,7 +331,7 @@ python3 -m mmore colpali retrieve --config-file examples/colpali/config_retrieva
319331### For better accuracy
320332- use higher DPI in PDF conversion, default is 200
321333- increase `top_k` in retrieval to inspect more candidate pages
322- - consider using larger ColPali models if available
334+ - consider using more recent ColVision models (ColQwen2.5, ColGemma3)
323335
324336### For production
325337- run Milvus in distributed mode for larger datasets