Welcome to the MMORE developer documentation!
This guide will help you set up your development environment and contribute to the project.
Before installing MMORE for development, ensure you have the required system dependencies installed.
```bash
sudo apt update
sudo apt install -y ffmpeg libsm6 libxext6 chromium-browser libnss3 \
  libgconf-2-4 libxi6 libxrandr2 libxcomposite1 libxcursor1 libxdamage1 \
  libxext6 libxfixes3 libxrender1 libasound2 libatk1.0-0 libgtk-3-0 libreoffice \
  libpango-1.0-0 libpangoft2-1.0-0 weasyprint
```

On Ubuntu 24.04, replace `libasound2` with `libasound2t64`.
You may also need to add the Ubuntu 20.04 focal repository to access some packages, for example by creating `/etc/apt/sources.list.d/mmore.list` with:
`deb http://cz.archive.ubuntu.com/ubuntu focal main universe`
On macOS, using Homebrew:

```bash
brew update
brew install ffmpeg chromium gtk+3 pango cairo \
  gobject-introspection libffi pkg-config libx11 libxi \
  libxrandr libxcomposite libxcursor libxdamage libxext \
  libxrender libasound2 atk libreoffice weasyprint
```

If `weasyprint` fails to find GTK or Cairo, also run:
```bash
brew install cairo pango gdk-pixbuf libffi
uv pip install weasyprint
```

Clone the repository:

```bash
git clone https://github.com/swiss-ai/mmore.git
cd mmore
```

Create a virtual environment and install MMORE in editable mode:

```bash
uv venv .venv
source .venv/bin/activate
uv pip install -e ".[all,cpu,dev]"
```

For **GPU (CUDA 12.6)**, replace `cpu` with `cu126`, for example:
`uv pip install -e ".[all,cu126,dev]"`
For a **partial install**, replace `all` with only the stages you need, for example:
`uv pip install -e ".[rag,cpu,dev]"`
Available stages are: `process`, `index`, `rag`, and `api`.
This package requires many large dependencies and a dependency override, so it should be installed with `uv` rather than plain `pip`.
See the [uv guide](../advanced_usage/uv.md) for more information.
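The extras combinations above follow one pattern: the stages (or `all`), then one backend, then optionally `dev`. As a minimal sketch, the helper below assembles the corresponding `uv` command line. The function name and its validation are illustrative assumptions, not part of MMORE, and the backend list contains only the two extras named above.

```python
# Hypothetical helper (not part of MMORE): compose the editable-install command
# from the extras described above.
STAGES = ("process", "index", "rag", "api")
BACKENDS = ("cpu", "cu126")  # only the backends mentioned in this guide


def build_install_command(stages=("all",), backend="cpu", dev=True):
    """Return the argv list for `uv pip install -e ".[...]"` with the given extras."""
    stages = tuple(stages)
    if backend not in BACKENDS:
        raise ValueError(f"unknown backend: {backend!r}")
    if stages != ("all",):
        unknown = [s for s in stages if s not in STAGES]
        if unknown:
            raise ValueError(f"unknown stages: {unknown}")
    extras = [*stages, backend]
    if dev:
        extras.append("dev")
    # As an argv list, the extras spec needs no shell quoting.
    return ["uv", "pip", "install", "-e", f".[{','.join(extras)}]"]
```

The returned list can be passed directly to `subprocess.run`. When typing the command in a shell instead, keep the quotes around `".[...]"` as in the examples above.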
MMORE uses several tools to maintain code quality and consistency.
We use pre-commit to automatically run code formatters and linters before each commit.
```bash
uv pip install pre-commit
pre-commit install
```

Optional but recommended before your first commit:

```bash
pre-commit run --all-files
```

The pre-commit configuration runs ruff, a code formatter for consistent style.
We use pyright for static type checking.
Please ensure your pull requests are type-checked before submission.
To run type checking manually:
```bash
pyright
```

We welcome contributions! Here's how you can help:
- Bug reports: open an issue with a clear description, steps to reproduce, and expected vs. actual behavior
- Feature requests: open an issue describing the feature, its use case, and potential implementation approach
- Check the Issues page for ongoing work
- Fork the repository and create a new branch for your feature/fix
- Write clear, documented code following the existing style
- Add tests if applicable
- Ensure all pre-commit hooks pass
- Run type checking with `pyright`
- Submit a pull request with a clear description
```
mmore/
├── mmore/
│   ├── process/          # Document processing pipeline
│   │   ├── processors/   # Individual file type processors
│   │   └── ...
│   ├── postprocess/      # Post-processing utilities
│   ├── index/            # Indexing and vector DB
│   ├── rag/              # RAG implementation
│   └── type/             # Type definitions and data models
├── docs/                 # Documentation
├── examples/             # Example configurations and data
├── tests/                # Test suite
├── .pre-commit-config.yaml
├── pyproject.toml
└── README.md
```
- `mmore.process`: Handles extraction from various file formats
- `mmore.index`: Manages hybrid dense+sparse indexing with Milvus
- `mmore.rag`: RAG system with LangChain integration
- `mmore.type`: Core data structures like `MultimodalSample`
```bash
pytest tests/
```

Tests requiring a CUDA GPU are marked `@pytest.mark.gpu` and skipped by default. Pass `--gpu` to run them:

```bash
pytest --gpu          # full suite, including GPU tests
pytest --gpu -m gpu   # only the GPU-marked tests
```

To mark a new GPU-only test:

```python
import pytest


@pytest.mark.gpu
def test_something_on_gpu():
    ...
```

- Place tests in the `tests/` directory
- Use descriptive test names
- Cover edge cases and error conditions
- Mock external dependencies when appropriate
- Mark GPU-only tests with `@pytest.mark.gpu` (see above)
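For readers curious how an opt-in flag like `--gpu` can skip marked tests by default, the sketch below uses the standard pytest hooks for this pattern. It is an assumption about how such a flag is typically wired in a `conftest.py`, not a copy of MMORE's actual configuration, and the skip message is made up.

```python
# conftest.py (illustrative sketch; MMORE's real conftest may differ)
import pytest


def pytest_addoption(parser):
    # Register the opt-in command-line flag.
    parser.addoption(
        "--gpu", action="store_true", default=False, help="run tests marked gpu"
    )


def pytest_configure(config):
    # Declare the marker so pytest does not warn about an unknown mark.
    config.addinivalue_line("markers", "gpu: test requires a CUDA GPU")


def pytest_collection_modifyitems(config, items):
    if config.getoption("--gpu"):
        return  # flag given: leave GPU tests enabled
    skip_gpu = pytest.mark.skip(reason="needs --gpu to run")
    for item in items:
        if "gpu" in item.keywords:
            item.add_marker(skip_gpu)
```

With hooks like these, `pytest` alone skips every GPU-marked test, while `pytest --gpu -m gpu` combines the flag with marker selection to run only those tests.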
- Update documentation if you're adding new features
- Add examples for new functionality
- Ensure all tests pass and pre-commit hooks succeed
- Update the changelog if applicable
- Request review from maintainers
- Code follows project style guidelines
- Pre-commit hooks pass (`pre-commit run --all-files`)
- Type checking passes (`pyright`)
- Tests are added or updated as needed
- Documentation is updated
- Examples are provided for new features
- Commit messages are clear and descriptive
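The automated items in this checklist can be run in one go before opening a pull request. The following is a minimal sketch of such a helper. The script and its function name are hypothetical, and it assumes only the three commands already shown in this guide.

```python
# Hypothetical pre-submission helper (not shipped with MMORE).
import subprocess

CHECKS = [
    ["pre-commit", "run", "--all-files"],  # formatting and lint hooks
    ["pyright"],                           # static type checking
    ["pytest", "tests/"],                  # test suite (GPU tests stay skipped)
]


def run_checks(checks=CHECKS, runner=subprocess.run):
    """Run each check in order and return the commands that exited non-zero."""
    failed = []
    for cmd in checks:
        print("$", " ".join(cmd))
        result = runner(cmd)
        if result.returncode != 0:
            failed.append(cmd)
    return failed
```

Running every check rather than stopping at the first failure gives a complete picture in one pass. A wrapper script could end with `sys.exit(1 if run_checks() else 0)` so that a failure is visible to the shell.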
MMORE ships with a terminal UI (TUI) that wraps the CLI commands behind guided menus and config wizards. It is useful for trying the pipeline without writing YAML by hand.
Launch it from a project working directory:
```bash
mmore tui
```

From the main menu you can:

- **Run a single command**: pick any stage (`process`, `postprocess`, `index`, `retrieve`, `rag`, `ragcli`, `websearch`), then either select an existing YAML, generate one through a guided wizard, or type a path manually. Generated configs are written to `./tui-configs/` and validated against the stage's dataclass before running.
- **Run full pipeline**: chains `process → postprocess → index` using existing configs.
- **Build a full pipeline config (guided wizard)**: walks through the three stages in order, wiring the postprocess output JSONL into the index config automatically.
- **Chat with indexed documents**: shortcut to `ragcli`.
Stages whose extras are missing are disabled in the menu with an install hint (e.g. `uv sync --extra rag --extra cpu`). Press Ctrl-C inside any sub-flow to cancel back to the main menu; press it again at the main menu to quit.
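To illustrate what validating a generated config against a stage's dataclass can look like, here is a small sketch. The `ExampleIndexConfig` class, its fields, and `validate_config` are all hypothetical stand-ins; MMORE's real config classes and validation logic may differ.

```python
from dataclasses import MISSING, dataclass, fields


@dataclass
class ExampleIndexConfig:
    """Hypothetical stage config; these field names are not taken from MMORE."""

    documents_path: str
    collection_name: str = "my_docs"
    batch_size: int = 64


def validate_config(raw: dict, schema):
    """Build schema(**raw), rejecting unknown keys, missing required keys, and wrong types."""
    known = {f.name: f for f in fields(schema)}
    unknown = sorted(set(raw) - set(known))
    if unknown:
        raise ValueError(f"unknown keys: {unknown}")
    missing = sorted(
        name
        for name, f in known.items()
        if name not in raw and f.default is MISSING and f.default_factory is MISSING
    )
    if missing:
        raise ValueError(f"missing required keys: {missing}")
    for name, value in raw.items():
        expected = known[name].type
        # Only plain classes (str, int, ...) are checked; generic annotations are skipped.
        if isinstance(expected, type) and not isinstance(value, expected):
            raise TypeError(
                f"{name}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return schema(**raw)
```

Here `raw` stands for the dictionary obtained by loading a YAML file. Checking it before the stage starts means a typo in a key name surfaces immediately rather than partway through a long run.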
- Use `uv pip` instead of `pip` for all package installations
- The project uses dependency overrides that are handled automatically by `uv`
- See the `uv` tutorial for more details
If you have questions about contributing, feel free to:
- Open a discussion on GitHub
- Reach out to the maintainers
- Check existing issues for similar questions
Thank you for contributing to MMORE! 🎉