 > :warning: **Check the instructions for contributors directly at [`docs/for_devs.md`](./docs/for_devs.md)**

+### Interactive TUI
+
+Prefer a guided experience over editing YAML by hand? Install the `tui` extra and launch the interactive Terminal UI:
+
+```bash
+uv sync --extra tui
+mmore tui
+```
+
+From the launcher you can:
+
+- run any stage (process / postprocess / index / rag / chat) interactively,
+- chain the full pipeline (process → postprocess → index → chat),
+- generate stage YAML configs through a guided wizard,
+- pick from existing example configs without leaving the terminal.
+
 ### Minimal Example

 You can use our predefined CLI commands to execute parts of the pipeline. Note that you might need to prepend `python -m` to the command if the package does not properly create bash aliases.
docs/source/core_features/colvision.md
+76 -64 (76 additions & 64 deletions)
@@ -1,10 +1,49 @@
-# 🖼️ ColPali Integration
+# 🖼️ ColVision Integration

-## Overview
+PDF retrieval pipeline using ColVision embeddings, stored in Milvus.

-This module provides a complete pipeline for processing PDF documents with ColPali embeddings, storing them in a Milvus vector database, and performing semantic search.
+## Installation

-It is designed for efficient document retrieval and RAG applications.
+The `[colvision]` extra is mutually exclusive with `[process]`; use a dedicated venv.
+
+```bash
+uv sync --extra colvision
+```
+
+## Supported Models
+
+| Model | `model_name` |
+|---|---|
+| ColPali v1.3 | `vidore/colpali-v1.3` |
+| ColQwen2 v1.0 | `vidore/colqwen2-v1.0` |
+| ColQwen2.5 v0.2 | `vidore/colqwen2.5-v0.2` |
+| ColGemma3 | `Cognitive-Lab/ColNetraEmbed` |
+| ColSmol 256M | `vidore/colSmol-256M` |
+| ColSmol 500M | `vidore/colSmol-500M` |
+
+All models are installed with the single `[colvision]` extra.
+
+The model/processor class is auto-detected from `model_name`, and the embedding dimension is inferred at every stage (from the loaded model at `process` / `retrieve` time, from the parquet contents at `index` time).
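
The diff does not show how the auto-detection or the dimension inference is implemented. As a rough illustration only, the sketch below assumes detection is a case-insensitive substring match on `model_name` and that a stored embedding is a list of equal-length vectors (the multi-vector layout these models produce). The names `_FAMILY_PATTERNS`, `detect_family`, and `infer_embedding_dim` are hypothetical and are not part of mmore.

```python
from typing import Sequence

# Hypothetical lookup table: the real detection logic is not shown in this diff.
# More specific substrings come first so "colqwen2.5" is not matched as "colqwen2".
_FAMILY_PATTERNS = (
    ("colqwen2.5", "ColQwen2.5"),
    ("colqwen2", "ColQwen2"),
    ("colpali", "ColPali"),
    ("colnetraembed", "ColGemma3"),
    ("colsmol", "ColSmol"),
)


def detect_family(model_name: str) -> str:
    """Return the model family whose pattern occurs in model_name."""
    lowered = model_name.lower()
    for pattern, family in _FAMILY_PATTERNS:
        if pattern in lowered:
            return family
    raise ValueError(f"Unsupported ColVision model: {model_name!r}")


def infer_embedding_dim(embedding: Sequence[Sequence[float]]) -> int:
    """Infer the vector size from one multi-vector embedding.

    Assumes the embedding is a non-empty list of vectors that all share
    the same length; raises ValueError otherwise.
    """
    if not embedding:
        raise ValueError("Cannot infer a dimension from an empty embedding")
    dims = {len(vector) for vector in embedding}
    if len(dims) != 1:
        raise ValueError(f"Inconsistent vector sizes: {sorted(dims)}")
    return dims.pop()


if __name__ == "__main__":
    for name in ("vidore/colpali-v1.3", "vidore/colqwen2.5-v0.2", "vidore/colSmol-500M"):
        print(name, "->", detect_family(name))
```

Ordering the patterns from most to least specific is the one design point that matters here: with a plain substring match, `colqwen2` would otherwise also claim the ColQwen2.5 checkpoint.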
+
+## Choosing a Model
+
+Set `model_name` in the YAML config, or override it via the `-m` / `--model` CLI flag on the `process` and `retrieve` commands.
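
To make the precedence concrete, here is a minimal sketch of a CLI flag overriding a config value. It is not mmore's actual CLI code: the `argparse` parser is a stand-in, the config is a plain dict standing in for the parsed YAML, and placing `model_name` at the top level of that config is an assumption. Only the flag names `-m` / `--model` and `--config-file` come from the docs.

```python
import argparse
from typing import Mapping, Optional


def build_parser() -> argparse.ArgumentParser:
    """Stand-in parser exposing the two flags mentioned in the docs."""
    parser = argparse.ArgumentParser(prog="colvision-sketch")
    parser.add_argument("--config-file", required=True)
    parser.add_argument("-m", "--model", default=None)
    return parser


def resolve_model_name(config: Mapping[str, object], cli_model: Optional[str]) -> str:
    """Pick the model: the CLI flag wins, otherwise fall back to the config."""
    if cli_model:
        return cli_model
    model_name = config.get("model_name")  # assumed top-level key
    if not isinstance(model_name, str) or not model_name:
        raise ValueError("No model given: set model_name in the config or pass -m")
    return model_name


if __name__ == "__main__":
    # A dict standing in for the parsed YAML config (hypothetical contents).
    config = {"model_name": "vidore/colpali-v1.3"}
    args = build_parser().parse_args(
        ["--config-file", "config_process.yml", "-m", "vidore/colqwen2.5-v0.2"]
    )
    print(resolve_model_name(config, args.model))
```

Under this scheme, a run that omits `-m` silently falls back to the config value, so `process` and `retrieve` can end up on different models unless the flag is passed to both or the two configs agree.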
+
+The pipeline runs in three steps (`process`, then `index`, then `retrieve`), and the
+`-m` / `--model` flag must be passed to both `process` and `retrieve`:
+
+```bash
+# 1. Process PDFs into embeddings
+python3 -m mmore colvision process --config-file examples/colvision/config_process.yml -m vidore/colqwen2.5-v0.2
+
+# 2. Index the embeddings into Milvus (no model needed here)
+python3 -m mmore colvision index --config-file examples/colvision/config_index.yml
+
+# 3. Retrieve with the same model used at processing time