🚀 OmniDocs

Unified, modern, and blazing-fast Document AI for Python

OmniDocs is an all-in-one Python toolkit for extracting tables, text, and math from PDFs and images, and for running OCR on them. It is powered by classic libraries and state-of-the-art deep learning models. Build robust document workflows with a single, consistent API.

  • 🧩 Unified, production-ready API for all tasks
  • 🏎️ Fast, GPU-accelerated, and easy to extend

⚡ Quick Start

Get started quickly with practical examples for various document processing tasks in the Quick Start Guide.

🏁 Get Started

📖 Tutorials


🛠️ Installation

Choose your preferred method:

  • PyPI (Recommended):
    pip install omnidocs
  • uv pip (Fastest):
    uv pip install omnidocs
  • From Source:
    git clone https://github.com/adithya-s-k/OmniDocs.git
    cd OmniDocs
    pip install .
    # or, if you use uv:
    uv sync
  • Conda (if available):
    conda install -c conda-forge omnidocs
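
After installing by any of the methods above, one way to confirm that the package is visible to the current interpreter is to query its installed metadata. The following is a minimal sketch using only the Python standard library. The helper name `installed_version` is hypothetical, and the distribution name `omnidocs` is assumed from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    `installed_version` is an illustrative helper, not part of OmniDocs.
    """
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "omnidocs" is assumed to be the distribution name, as in `pip install omnidocs`.
    found = installed_version("omnidocs")
    if found is None:
        print("omnidocs is not installed in this environment")
    else:
        print(f"omnidocs {found} is installed")
```

If the script reports that the package is missing, the install most likely went into a different environment than the one running the script.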

🏗️ How It Works

OmniDocs organizes document processing tasks into modular components. Each component corresponds to a specific task and offers:

  1. A Unified Interface: Consistent input and output formats.
  2. Model Independence: Switch between libraries or models effortlessly.
  3. Pipeline Flexibility: Combine components to create custom workflows.
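
The three properties above can be illustrated with a small, self-contained sketch. This is not the OmniDocs API: every name here (`Extractor`, `ExtractionResult`, `StrippedTextExtractor`, `UppercaseTextExtractor`, `run_pipeline`) is hypothetical, and the "backends" are trivial string operations standing in for real models. It only shows the general pattern of a shared interface, interchangeable implementations, and composition into a pipeline.

```python
from dataclasses import dataclass
from typing import List, Protocol, Sequence


@dataclass
class ExtractionResult:
    """Hypothetical common output shape shared by every component."""
    task: str
    backend: str
    text: str


class Extractor(Protocol):
    """Unified interface: every component exposes the same extract() call."""

    def extract(self, document: str) -> ExtractionResult:
        ...


class StrippedTextExtractor:
    """Stand-in backend that trims surrounding whitespace."""

    def extract(self, document: str) -> ExtractionResult:
        return ExtractionResult("text", "stripped", document.strip())


class UppercaseTextExtractor:
    """Stand-in backend for the same task; swappable because the interface matches."""

    def extract(self, document: str) -> ExtractionResult:
        return ExtractionResult("text", "uppercase", document.upper())


def run_pipeline(document: str, steps: Sequence[Extractor]) -> List[ExtractionResult]:
    """Pipeline flexibility: run each component on the same input, in order."""
    return [step.extract(document) for step in steps]


if __name__ == "__main__":
    for result in run_pipeline("  hello  ", [StrippedTextExtractor(), UppercaseTextExtractor()]):
        print(result)
```

Because both stand-in backends satisfy the same `Extractor` protocol, swapping one for the other requires no change to `run_pipeline` or to the code that consumes `ExtractionResult`.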

📈 Roadmap

  • Add support for semantic understanding tasks (e.g., entity extraction).
  • Integrate pre-trained transformer models for context-aware document analysis.
  • Expand pipelines for multilingual document processing.
  • Add CLI support for batch processing.

🤝 Contributing

We welcome contributions to OmniDocs! Here's how you can help:

  1. Fork the repository.
  2. Create a new branch for your feature or bug fix.
  3. Commit your changes and open a pull request.

For more details, refer to our CONTRIBUTING.md.

🛡️ License

This project is licensed under multiple licenses, depending on the models and libraries you use in your pipeline. Please refer to the individual licenses of each component for specific terms and conditions.

🌟 Support the Project

If you find OmniDocs helpful, please give us a ⭐ on GitHub and share it with others in the community.

🗨️ Join the Community

For discussions, questions, or feedback:
