The Smart Document Router is an open-source document processing data layer.
- It ingests unstructured documents from faxes, emails, and ERPs through REST APIs and integrations.
- It processes documents at scale with OCR and LLMs.
- It chunks, embeds, and organizes documents into queryable knowledge bases.
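As a rough illustration of the chunking step mentioned above, here is a minimal sketch of fixed-size chunking with overlap. It is not taken from the DocRouter codebase: the function name `chunk_text` and its `chunk_size` and `overlap` parameters are hypothetical, and a real pipeline may well split on tokens, sentences, or layout instead of characters.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows.

    Hypothetical helper for illustration only, not DocRouter's implementation.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each window advances
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for i, chunk in enumerate(chunk_text("abcdefghij", chunk_size=4, overlap=2)):
        print(i, chunk)
```

The overlap keeps text that straddles a chunk boundary visible in both neighboring chunks, which tends to help retrieval once the chunks are embedded.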
The Document Router is designed to work standalone or with a human in the loop, and can process medical, insurance, financial, supply chain, and legal documents.
It acts as a system of record for the extraction schemas and prompts, and it is portable across all major clouds and LLM providers.
A Document Agent is available to configure prompts and extractions, and to review processed results.
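To make the idea of a stored extraction schema and prompt more concrete, the sketch below models one versioned record and checks an extracted result against it. Everything here is an assumption for illustration: `ExtractionField`, `ExtractionConfig`, and `find_problems` are hypothetical names, and the sketch uses standard-library dataclasses, not DocRouter's actual data model or API.

```python
from dataclasses import dataclass

# Hypothetical mapping from schema type names to Python types.
_PY_TYPES = {"string": str, "number": (int, float), "boolean": bool}


@dataclass(frozen=True)
class ExtractionField:
    name: str
    type: str  # one of the keys in _PY_TYPES
    description: str = ""


@dataclass(frozen=True)
class ExtractionConfig:
    """One versioned schema-plus-prompt record (illustrative only)."""
    schema_name: str
    version: int
    prompt: str
    fields: tuple[ExtractionField, ...]

    def to_json_schema(self) -> dict:
        """Render the fields as a JSON-Schema-style dictionary."""
        return {
            "title": self.schema_name,
            "type": "object",
            "properties": {
                f.name: {"type": f.type, "description": f.description}
                for f in self.fields
            },
            "required": [f.name for f in self.fields],
        }


def find_problems(record: dict, config: ExtractionConfig) -> list[str]:
    """List missing or wrongly typed fields in an extracted record."""
    problems = []
    for f in config.fields:
        if f.name not in record:
            problems.append(f"missing: {f.name}")
        elif not isinstance(record[f.name], _PY_TYPES[f.type]):
            problems.append(f"wrong type: {f.name}")
    return problems


invoice_config = ExtractionConfig(
    schema_name="invoice",
    version=1,
    prompt="Extract the invoice number and total amount.",
    fields=(
        ExtractionField("invoice_number", "string", "Identifier on the invoice"),
        ExtractionField("total", "number", "Total amount due"),
    ),
)

if __name__ == "__main__":
    print(invoice_config.to_json_schema())
    print(find_problems({"invoice_number": 7}, invoice_config))
```

Keeping the prompt and schema together under a version number is one way to let a reviewer trace which configuration produced a given result.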
Requires Docker on Linux, macOS, or Windows (WSL).
    curl -fsSL https://raw.githubusercontent.com/analytiq-hub/doc-router/main/tools/run-doc-router-docker.sh | bash -s -- up

    curl -fsSL https://raw.githubusercontent.com/analytiq-hub/doc-router/main/tools/run-doc-router-docker.sh | bash -s -- down

The script prints URLs and sign-in credentials after `up`. More detail: Docker setup, docrouter_docker, and tools/run-doc-router-docker.sh.
- NextJS, NextAuth, MaterialUI, TailwindCSS
- FastAPI
- MongoDB
- Pydantic
- LiteLLM
- OpenAI, Anthropic, Gemini, Vertex AI for GCP, AWS Bedrock, xAI, OpenRouter...
The PyData Boston DocRouter Slides (Feb '24) have more details about the tech stack and about how Cursor AI was used to build DocRouter.
- Smart Document Router Slides from Boston PyData, Spring 2025
- DocRouter.AI: Adventures in CSS and AI Coding, Summer 2025
- Installation
- Development


