Commit 64e7b64

Merge branch 'master' into fix/fix_type_errors_fab
2 parents c348dc3 + 9904510 commit 64e7b64

125 files changed

Lines changed: 8746 additions & 2049 deletions

.github/workflows/pyright.yml

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
+name: 📐 Pyright type checks
+on:
+  push:
+    branches:
+      - master
+  pull_request:
+  workflow_dispatch:
+
+jobs:
+  pyright:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v6
+
+      - name: Set up Python 3.12
+        uses: actions/setup-python@v6
+        with:
+          python-version: "3.12"
+
+      - name: Install uv and create venv
+        run: |
+          pipx install uv
+          uv venv .venv
+
+      - name: Install dependencies
+        run: |
+          source .venv/bin/activate
+          uv pip install -e ".[process,index,rag,api,cpu,dev,websearch]"
+
+      - name: Run Pyright
+        continue-on-error: true
+        run: |
+          source .venv/bin/activate
+          pyright
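
Review note: because the `Run Pyright` step sets `continue-on-error: true`, type errors will not fail the job, so the findings are only visible in the step log. A minimal sketch of how the results could be summarized locally, assuming Pyright's `--outputjson` report exposes a `summary` object and a `generalDiagnostics` list as described in its CLI documentation. The helper names `summarize_pyright` and `run_pyright` are hypothetical and not part of this commit.

```python
import json
import subprocess
from collections import Counter


def summarize_pyright(report_json: str) -> dict:
    """Condense a Pyright JSON report into error/warning counts and top rules."""
    report = json.loads(report_json)
    summary = report.get("summary", {})
    # Count error-level diagnostics per rule; diagnostics without a rule
    # are grouped under "unclassified".
    rules = Counter(
        diag.get("rule", "unclassified")
        for diag in report.get("generalDiagnostics", [])
        if diag.get("severity") == "error"
    )
    return {
        "errors": summary.get("errorCount", 0),
        "warnings": summary.get("warningCount", 0),
        "top_rules": rules.most_common(3),
    }


def run_pyright() -> dict:
    """Run Pyright in the current environment (assumes `pyright` is on PATH)."""
    proc = subprocess.run(
        ["pyright", "--outputjson"], capture_output=True, text=True
    )
    return summarize_pyright(proc.stdout)
```

Calling `run_pyright()` from the activated `.venv` would give a quick view of which rule categories dominate before the check is made blocking.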

.github/workflows/sphinx-docs.yml

Lines changed: 57 additions & 0 deletions
@@ -0,0 +1,57 @@
+name: Deploy Sphinx documentation
+
+on:
+  push:
+    branches: ["master"]
+  workflow_dispatch:
+
+permissions:
+  contents: read
+  pages: write
+  id-token: write
+
+concurrency:
+  group: "pages"
+  cancel-in-progress: true
+
+jobs:
+  build:
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.11"
+
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install -r docs/requirements.txt
+
+      - name: Build Sphinx HTML
+        run: |
+          sphinx-build -b html docs/source docs/_build/html
+
+      - name: Setup Pages
+        uses: actions/configure-pages@v5
+
+      - name: Upload artifact
+        uses: actions/upload-pages-artifact@v3
+        with:
+          path: docs/_build/html
+
+  deploy:
+    environment:
+      name: github-pages
+      url: ${{ steps.deployment.outputs.page_url }}
+    runs-on: ubuntu-latest
+    needs: build
+
+    steps:
+      - name: Deploy to GitHub Pages
+        id: deployment
+        uses: actions/deploy-pages@v4
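
Review note: the workflow calls `sphinx-build -b html`, which writes HTML straight into the output directory given on the command line, while the `docs/Makefile` added in this same commit uses `-M html`, which writes into an `html/` subdirectory of the build directory. The sketch below only models that path arithmetic to show where each invocation puts its output. The function name `html_output_dir` is hypothetical, and it assumes `make html` is run from inside `docs/`.

```python
from pathlib import PurePosixPath


def html_output_dir(mode: str, outdir: str) -> PurePosixPath:
    """Directory where sphinx-build writes HTML for the given invocation mode."""
    if mode == "-b":
        # sphinx-build -b html SOURCEDIR OUTPUTDIR -> HTML lands in OUTPUTDIR
        return PurePosixPath(outdir)
    if mode == "-M":
        # sphinx-build -M html SOURCEDIR BUILDDIR -> HTML lands in BUILDDIR/html
        return PurePosixPath(outdir) / "html"
    raise ValueError(f"unsupported mode: {mode!r}")


# Path used by the CI workflow (relative to the repository root).
workflow_html = html_output_dir("-b", "docs/_build/html")
# Path used by the Makefile, assuming it is invoked from docs/.
makefile_html = PurePosixPath("docs") / html_output_dir("-M", "build")
# Path handed to actions/upload-pages-artifact in the workflow.
upload_path = PurePosixPath("docs/_build/html")
```

Under these assumptions the CI build output matches the uploaded artifact path, whereas a local `make html` leaves its pages in `docs/build/html`, so the two locations should not be confused when previewing the site.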

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -114,6 +114,7 @@ venv.bak/
 # Milvus DB
 db/
 *.db
+*.db.lock
 
 # Project files
 tmp/
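
Review note: the existing `*.db` rule does not cover lock files, since a name such as `milvus.db.lock` ends in `.lock` rather than `.db`. A minimal sketch of that matching behaviour, using `fnmatch` as an approximation of gitignore semantics for simple slash-free patterns. The helper `is_ignored` and the sample file names are hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def is_ignored(path: str, patterns: list[str]) -> bool:
    """Approximate gitignore matching for slash-free glob patterns.

    Patterns without a slash are compared against the file's base name,
    which is how Git treats them at any directory depth.
    """
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in patterns)


before = ["*.db"]               # rules prior to this commit
after = ["*.db", "*.db.lock"]   # rules after this commit
```

With these rules, `is_ignored("data/milvus.db.lock", before)` is false and `is_ignored("data/milvus.db.lock", after)` is true, which is the gap the new line closes.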

README.md

Lines changed: 25 additions & 1 deletion
@@ -16,7 +16,13 @@
 
 MMORE is an open-source, end-to-end pipeline to ingest, process, index, and retrieve knowledge from heterogeneous files: PDFs, Office docs, spreadsheets, emails, images, audio, video, and web pages. It standardizes content into a unified multimodal format, supports distributed CPU/GPU processing, and provides hybrid dense+sparse retrieval with an integrated RAG service (CLI, APIs).
 
-👉 Read the paper for more details (OpenReview): [MMORE: Massive Multimodal Open RAG & Extraction](https://openreview.net/forum?id=6j1HjfIdKn)
+👉 Read the paper for more details (arXiv): [MMORE: Massive Multimodal Open RAG & Extraction](https://arxiv.org/abs/2509.11937)
+
+
+### Documentation
+
+👉 Read the full documentation here: [MMORE Documentation](https://swiss-ai.github.io/mmore/).
+
 
 ## :bulb: Quickstart
 
@@ -60,6 +66,8 @@ brew install cairo pango gdk-pixbuf libffi
 uv pip install weasyprint
 ```
 
+You can also run MMORE on Windows by following our [Windows setup notes](docs/source/getting_started/windows.md).
+
 #### Step 1 – Install MMORE
 
 Dependencies are split by pipeline stage. Install only what you need:
@@ -97,6 +105,22 @@ uv pip install "mmore[process,cpu]"
 
 > :warning: **Check the instructions for contributors directly at [`docs/for_devs.md`](./docs/for_devs.md)**
 
+### Interactive TUI
+
+Prefer a guided experience over editing YAML by hand? Install the `tui` extra and launch the interactive Terminal UI:
+
+```bash
+uv sync --extra tui
+mmore tui
+```
+
+From the launcher you can:
+
+- run any stage (process / postprocess / index / rag / chat) interactively,
+- chain the full pipeline (process → postprocess → index → chat),
+- generate stage YAML configs through a guided wizard,
+- pick from existing example configs without leaving the terminal.
+
 ### Minimal Example
 
 You can use our predefined CLI commands to execute parts of the pipeline. Note that you might need to prepend `python -m` to the command if the package does not properly create bash aliases.

docker/arch/Dockerfile

Lines changed: 3 additions & 0 deletions
@@ -55,5 +55,8 @@ RUN uv pip install --no-cache --no-deps -e .
 # --- Runtime ---
 ENV PATH="/app/.venv/bin:$PATH"
 ENV DASK_DISTRIBUTED__WORKER__DAEMON=False
+ENV HF_HOME="/home/mmoreuser/.cache/huggingface"
+ENV TORCH_HOME="/home/mmoreuser/.cache/torch"
+ENV XDG_CACHE_HOME="/home/mmoreuser/.cache"
 
 ENTRYPOINT ["/bin/bash"]

docker/arch/README.md

Lines changed: 2 additions & 2 deletions
@@ -33,8 +33,8 @@ sudo docker build -f docker/arch/Dockerfile --build-arg USER_UID=$(id -u) --buil
 
 ```bash
 # GPU
-sudo docker run --gpus all -it -v ./examples:/app/examples -v ./.cache:/mmoreuser/.cache mmore:arch
+sudo docker run --gpus all -it -v ./examples:/app/examples -v ./.cache:/home/mmoreuser/.cache mmore:arch
 
 # CPU-only
-sudo docker run -it -v ./examples:/app/examples -v ./.cache:/mmoreuser/.cache mmore:arch-cpu
+sudo docker run -it -v ./examples:/app/examples -v ./.cache:/home/mmoreuser/.cache mmore:arch-cpu
 ```
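
Review note: the Dockerfile change pins the model caches under `/home/mmoreuser/.cache`, and the README change moves the bind-mount target to the same directory, so downloaded models persist on the host. The sketch below checks that relationship. It assumes the commonly documented fallback order (`HF_HOME` and `TORCH_HOME` first, then `$XDG_CACHE_HOME/<name>`, then `~/.cache/<name>`), and the helpers `cache_dirs` and `uncovered` are hypothetical.

```python
from pathlib import PurePosixPath


def cache_dirs(env: dict, home: str = "/home/mmoreuser") -> dict:
    """Resolve the Hugging Face and Torch cache roots from environment values."""
    xdg = PurePosixPath(env.get("XDG_CACHE_HOME") or f"{home}/.cache")
    return {
        "huggingface": PurePosixPath(env.get("HF_HOME") or xdg / "huggingface"),
        "torch": PurePosixPath(env.get("TORCH_HOME") or xdg / "torch"),
    }


def uncovered(env: dict, mount_target: str) -> list:
    """Names of cache roots that do NOT live under the bind-mount target."""
    target = PurePosixPath(mount_target)
    return [
        name
        for name, path in cache_dirs(env).items()
        if not path.is_relative_to(target)  # requires Python 3.9+
    ]


# Values set by the ENV lines added to docker/arch/Dockerfile.
image_env = {
    "HF_HOME": "/home/mmoreuser/.cache/huggingface",
    "TORCH_HOME": "/home/mmoreuser/.cache/torch",
    "XDG_CACHE_HOME": "/home/mmoreuser/.cache",
}
```

Here `uncovered(image_env, "/home/mmoreuser/.cache")` returns an empty list, while the old target `/mmoreuser/.cache` leaves both caches outside the mounted volume.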

docker/leap/Dockerfile

Lines changed: 3 additions & 0 deletions
@@ -44,5 +44,8 @@ RUN .venv/bin/uv pip install --no-cache --no-deps -e .
 # --- Runtime ---
 ENV PATH="/app/.venv/bin:$PATH"
 ENV DASK_DISTRIBUTED__WORKER__DAEMON=False
+ENV HF_HOME="/root/.cache/huggingface"
+ENV TORCH_HOME="/root/.cache/torch"
+ENV XDG_CACHE_HOME="/root/.cache"
 
 ENTRYPOINT ["/bin/bash"]

docker/ubuntu/Dockerfile

Lines changed: 3 additions & 0 deletions
@@ -53,5 +53,8 @@ RUN .venv/bin/uv pip install --no-cache --no-deps -e .
 # --- Runtime ---
 ENV PATH="/app/.venv/bin:$PATH"
 ENV DASK_DISTRIBUTED__WORKER__DAEMON=False
+ENV HF_HOME="/home/mmoreuser/.cache/huggingface"
+ENV TORCH_HOME="/home/mmoreuser/.cache/torch"
+ENV XDG_CACHE_HOME="/home/mmoreuser/.cache"
 
 ENTRYPOINT ["/bin/bash"]

docker/ubuntu/README.md

Lines changed: 2 additions & 2 deletions
@@ -40,8 +40,8 @@ sudo docker build -f docker/ubuntu/Dockerfile --build-arg USER_UID=$(id -u) --bu
 
 ```bash
 # GPU
-sudo docker run --gpus all -it -v ./examples:/app/examples -v ./.cache:/mmoreuser/.cache mmore
+sudo docker run --gpus all -it -v ./examples:/app/examples -v ./.cache:/home/mmoreuser/.cache mmore
 
 # CPU-only
-sudo docker run -it -v ./examples:/app/examples -v ./.cache:/mmoreuser/.cache mmore:cpu
+sudo docker run -it -v ./examples:/app/examples -v ./.cache:/home/mmoreuser/.cache mmore:cpu
 ```

docs/Makefile

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+SPHINXBUILD = sphinx-build
+SOURCEDIR = source
+BUILDDIR = build
+
+.PHONY: html clean
+
+html:
+	$(SPHINXBUILD) -M html "$(SOURCEDIR)" "$(BUILDDIR)"
+
+clean:
+	rm -rf "$(BUILDDIR)"
