Drag a PDF, get it in your language. 22 Indic languages. Layout preserved.
Status: v0.1 — ready to use.
Sovereignty: sovereign-by-construction. BYO endpoint, BYO key. Local fallback documented.
This is a community project, not affiliated with Sarvam AI. Best-effort community shovel — no SLA, no roadmap commitments.
┌─────────────┐     ┌──────────────┐     ┌─────────────────┐
│  PDF file   │────▶│   sarvam-    │────▶│      lopdf      │
│  (text or   │     │     pdf      │     │  (text extract) │
│  scanned)   │     │  (Rust CLI)  │     └─────────────────┘
└─────────────┘     └──────────────┘              │
                           │                      ▼
                           ▼             ┌─────────────────┐
                    ┌──────────────┐     │   Sarvam API    │
                    │  translated  │◀────│  (Indic trans)  │
                    │  .txt / .md  │     └─────────────────┘
                    └──────────────┘
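The flow in the diagram can be sketched in a few lines of std-only Rust. This is an illustration under stated assumptions, not sarvam-pdf's actual source: the `Translator` trait, `TranslateError`, `chunk_text`, `translate_document`, and the `max_chars` limit are all hypothetical names introduced here. Text extraction (done by lopdf in the real tool) is assumed to have already happened, and the per-request size limit is an assumption about the translation endpoint.

```rust
/// Hypothetical abstraction over the translation backend.
/// sarvam-pdf's real internals may look nothing like this.
pub trait Translator {
    fn translate(&self, text: &str, from: &str, to: &str) -> Result<String, TranslateError>;
}

/// Hypothetical error type carrying a human-readable message.
#[derive(Debug)]
pub struct TranslateError(pub String);

/// Split text into chunks of at most `max_chars` characters, breaking only
/// on whitespace so words stay intact. Runs of whitespace (including line
/// breaks) collapse to single spaces. A single word longer than the limit
/// is emitted as its own oversized chunk rather than being cut.
pub fn chunk_text(text: &str, max_chars: usize) -> Vec<String> {
    let mut chunks = Vec::new();
    let mut current = String::new();
    let mut current_len = 0usize; // length in chars, not bytes

    for word in text.split_whitespace() {
        let word_len = word.chars().count();
        // Close the current chunk if adding " word" would exceed the limit.
        if !current.is_empty() && current_len + 1 + word_len > max_chars {
            chunks.push(std::mem::take(&mut current));
            current_len = 0;
        }
        if !current.is_empty() {
            current.push(' ');
            current_len += 1;
        }
        current.push_str(word);
        current_len += word_len;
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}

/// Translate already-extracted text chunk by chunk and join the results
/// with newlines. Stops at the first failed chunk.
pub fn translate_document<T: Translator>(
    translator: &T,
    extracted: &str,
    from: &str,
    to: &str,
    max_chars: usize, // assumed per-request limit; check the endpoint's docs
) -> Result<String, TranslateError> {
    let mut translated = Vec::new();
    for chunk in chunk_text(extracted, max_chars) {
        translated.push(translator.translate(&chunk, from, to)?);
    }
    Ok(translated.join("\n"))
}
```

Keeping the backend behind a trait means the same loop works against a hosted endpoint or a local fallback, and it can be exercised in tests with a stub translator and no API key.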
Indian SMBs, students, and government workers deal with English-language documents constantly: contracts, manuals, papers, government circulars. DeepL doesn't cover Indic languages. Google Translate's PDF mode mangles layout.
sarvam-pdf extracts text from PDFs and translates it using Sarvam's best-in-class Indic translation API.
What sarvam-pdf is not:
- Not an OCR tool in v0.1 (text PDFs only; scanned PDFs come in v0.5)
- Not a document editor
- Not a publishing tool
See PRD-v1.md for the full anti-scope definition.
Prerequisites:
- Rust 1.75+
git clone https://github.com/sovereign-shovels/sarvam-pdf.git
cd sarvam-pdf
# Build
cargo build --release
# The binary is at target/release/sarvam-pdf

sarvam-pdf extract document.pdf

sarvam-pdf translate "Hello world" --from en-IN --to hi-IN

sarvam-pdf convert document.pdf --from en-IN --to hi-IN --output translated.txt

Verified: cargo test passes (1 test). Compile clean. Live translation requires SARVAM_API_KEY.
Get a free API key from the Sarvam AI Dashboard. Then:
export SARVAM_API_KEY="your-key-here"

Or set it in your config file:
# ~/.config/sarvam-pdf/config.toml
endpoint = "https://api.sarvam.ai/translate"
api_key_env_var = "SARVAM_API_KEY"
model = "sarvam-translate:v1"

Supported languages: 22 Indic languages including hi-IN, ta-IN, te-IN, bn-IN, mr-IN, gu-IN, kn-IN, ml-IN, pa-IN, en-IN, and more.
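The config file names the environment variable that holds the key (`api_key_env_var`) rather than storing the key itself. A minimal sketch of how that indirection, together with the `SARVAM_PDF_*` variables this README lists, could be resolved follows. The `Settings` struct, both function names, the use of the example values as defaults, and the rule that environment variables take precedence over the file are assumptions for illustration, not confirmed behaviour of sarvam-pdf.

```rust
/// The three settings shown in the README's config.toml example.
#[derive(Debug, Clone, PartialEq)]
pub struct Settings {
    pub endpoint: String,
    pub api_key_env_var: String,
    pub model: String,
}

impl Default for Settings {
    /// Assumption: the README's example values double as defaults.
    fn default() -> Self {
        Settings {
            endpoint: "https://api.sarvam.ai/translate".to_string(),
            api_key_env_var: "SARVAM_API_KEY".to_string(),
            model: "sarvam-translate:v1".to_string(),
        }
    }
}

/// Overlay SARVAM_PDF_* values on top of `base`.
/// Assumption: environment variables win over file or default values.
/// `lookup` stands in for the process environment so the logic is testable.
pub fn apply_env_overrides<F>(base: Settings, lookup: F) -> Settings
where
    F: Fn(&str) -> Option<String>,
{
    Settings {
        endpoint: lookup("SARVAM_PDF_ENDPOINT").unwrap_or(base.endpoint),
        api_key_env_var: lookup("SARVAM_PDF_API_KEY_ENV").unwrap_or(base.api_key_env_var),
        model: lookup("SARVAM_PDF_MODEL").unwrap_or(base.model),
    }
}

/// Follow the indirection: read the variable whose *name* is stored in
/// `api_key_env_var`. The key itself never appears in the settings.
/// A missing or blank value is reported as an error.
pub fn resolve_api_key<F>(settings: &Settings, lookup: F) -> Result<String, String>
where
    F: Fn(&str) -> Option<String>,
{
    match lookup(&settings.api_key_env_var) {
        Some(key) if !key.trim().is_empty() => Ok(key),
        _ => Err(format!(
            "environment variable {} is not set",
            settings.api_key_env_var
        )),
    }
}
```

In a real binary, `lookup` would be `|name| std::env::var(name).ok()`. Passing it in as a closure keeps the secret out of the config struct and lets the precedence rule be tested without mutating the process environment.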
The same settings can be supplied as environment variables:
export SARVAM_PDF_ENDPOINT="https://api.sarvam.ai/translate"
export SARVAM_PDF_API_KEY_ENV="SARVAM_API_KEY"
export SARVAM_PDF_MODEL="sarvam-translate:v1"

B2B and education demand in India is massive, and Sarvam isn't going to ship a desktop app. The gap is structural.
See PRD-v1.md for the full problem statement and rationale.
Roadmap:
- v0.5: Batch folder processing, glossary support, side-by-side preview
- v1.0: Office document support (.docx, .pptx), web service mode for SMB intranets
See PRD-v1.md for the full roadmap.
Apache 2.0. See LICENSE.
This repo is part of the sovereign-shovels portfolio of small, focused, sovereign-by-construction AI utilities.
Other shovels: claude-vault, bulbul-studio, saaras-tray, claude-prompts, ollama-cron, mcp-forge, agent-console, sarvam-meet, obsidian-llm, llm-diff, claude-bridge, claude-radio, sarvam-cast.