A comprehensive platform for digitizing land records, managing disputed lands, and resolving partition-era ownership issues across India and Pakistan. Built with React, Flask, and powered by Google Vision AI.
- Overview
- Key Features
- Tech Stack
- Prerequisites
- Installation Guide
- Configuration
- Running the Application
- Usage Guide
- Project Structure
- Troubleshooting
- Contributing
- License
AgriStack OCR addresses critical challenges in land administration:
- 70%+ of land records exist only in paper format (Urdu, Hindi, Punjabi)
- 1947 Partition disputes: 14.5 million displaced people with unresolved land claims
- Multi-parcel farmers: No centralized system to track ownership across districts
- Language barriers: Documents inaccessible to non-native speakers
- Digitizes land records using AI-powered OCR (Google Vision API)
- Translates documents between Urdu, Hindi, Punjabi, and English
- Manages disputed lands with interactive map visualization
- Tracks partition-era claims (refugee/muhajireen land disputes)
- Generates formatted PDFs with AI-powered summaries (Google Gemini)
- Provides centralized database for farmers with multiple parcels
- Google Vision API for handwritten & printed text
- Multi-language support: Urdu, Hindi, Punjabi, English
- PDF generation with formatted output
- Batch processing for large-scale digitization
- AI4Bharat IndicTrans2 for Indic languages
- Legal terminology preservation
- 4 languages: Urdu ↔ Hindi ↔ Punjabi ↔ English
- Interactive OpenStreetMap visualization
- Partition-era dispute tracking (1947 refugee claims)
- Multi-claimant support with CNIC verification
- Court case management with hearing dates
- Geographic filtering by district/tehsil
- Document summarization (5 types: brief, detailed, key points, legal, action items)
- Q&A functionality using Google Gemini
- Smart data extraction from complex documents
- Centralized view of all land parcels (multi-district)
- Document repository with search & filters
- Real-time processing status
- Mobile-responsive design
- React 18.x - UI framework
- TypeScript - Type safety
- Vite - Build tool
- Tailwind CSS - Styling
- React Leaflet - Map visualization
- Framer Motion - Animations
- Python 3.11 - Core language
- Flask 3.1.2 - Web framework
- SQLAlchemy 2.0 - Database ORM
- PostgreSQL (via Supabase) - Production database
- SQLite - Development database
- Google Cloud Vision API - OCR
- Google Gemini AI - Document analysis
- AI4Bharat IndicTrans2 - Translation
- OpenStreetMap - Map tiles
Before you begin, ensure you have the following installed on your computer:
| Software | Version | Download Link | Purpose |
|---|---|---|---|
| Python | 3.11 or higher | python.org | Backend runtime |
| Node.js | 18.0 or higher | nodejs.org | Frontend build tool |
| Git | Latest | git-scm.com | Version control |
| VS Code | Latest | code.visualstudio.com | Code editor (recommended) |
- Google Vision API Key
  - Go to Google Cloud Console
  - Create a new project or select existing
  - Enable "Cloud Vision API"
  - Create API Key
  - Copy the key (format: AIzaSy...)
- Google Gemini API Key (for AI features)
  - Go to Google AI Studio
  - Click "Get API Key"
  - Copy the key
- Supabase Account (optional - for production)
  - Sign up at supabase.com
  - Create a new project
  - Get URL and API key from Project Settings
Follow these steps exactly as written. Each command is explained.
- Download Python 3.11+ from python.org
- During installation:
  - Check "Add Python to PATH" (VERY IMPORTANT!)
- Click "Install Now"
- Verify installation:
# Open PowerShell (Windows) or Terminal (Mac/Linux)
python --version # Should show: Python 3.11.x
- Download Node.js 18+ from nodejs.org
- Run the installer (just click "Next" through all options)
- Verify installation:
node --version # Should show: v18.x.x or higher
npm --version # Should show: 9.x.x or higher
- Download Git from git-scm.com
- Install with default settings
- Verify installation:
git --version # Should show: git version 2.x.x
- Open PowerShell/Terminal
- Navigate to where you want the project (e.g., Desktop):
# Windows
cd C:\Users\YourUsername\Desktop
# Mac/Linux
cd ~/Desktop
- Clone the repository:
git clone https://github.com/ronitrai27/OCR_python_Google-Vison.git
cd OCR_python_Google-Vison
OR if you downloaded a ZIP file:
- Extract the ZIP
- Open PowerShell in that folder
- Run:
cd OCR_python_Google-Vison
- Navigate to backend folder:
cd backend
- Create a virtual environment (isolated Python environment):
# Windows
python -m venv venv
# Mac/Linux
python3 -m venv venv
- Activate the virtual environment:
# Windows PowerShell
.\venv\Scripts\Activate.ps1
# Windows CMD
venv\Scripts\activate.bat
# Mac/Linux
source venv/bin/activate
You should see (venv) at the start of your command line.
- Install Python dependencies:
pip install -r requirements.txt
This will take 2-5 minutes. You'll see lots of packages being installed.
- Create .env file:
# Windows
copy .env.example .env
# Mac/Linux
cp .env.example .env
- Edit the .env file:
  - Open .env in Notepad or VS Code
  - Add your API keys:
GOOGLE_VISION_API_KEY=AIzaSyXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
GOOGLE_GEMINI_API_KEY=AIzaSyYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY
# Optional (for production):
SUPABASE_URL=https://xxxxx.supabase.co
SUPABASE_KEY=your-key-here
DATABASE_URL=postgresql://...
  - Save the file
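Before starting the backend, it can help to confirm that the .env file actually contains the keys listed above. The following is a minimal sketch, not part of the project: `parse_env` and `missing_keys` are hypothetical helper names, and the parser only handles simple KEY=VALUE lines.

```python
# Hypothetical helper: checks a .env file for the keys this README asks for.
REQUIRED_KEYS = ("GOOGLE_VISION_API_KEY", "GOOGLE_GEMINI_API_KEY")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip optional surrounding quotes from the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: dict[str, str], required=REQUIRED_KEYS) -> list[str]:
    """Return the required keys that are absent or set to an empty value."""
    return [key for key in required if not env.get(key)]
```

For example, `missing_keys(parse_env(open(".env", encoding="utf-8").read()))` run from the backend folder returns an empty list when both keys are set.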
- Open a NEW PowerShell/Terminal window (keep backend terminal open)
- Navigate to frontend folder:
cd C:\Users\YourUsername\Desktop\OCR_python_Google-Vison\frontend
# (Adjust path to match your location)
- Install Node dependencies:
npm install
This will take 3-7 minutes. Lots of packages will be downloaded.
- Create frontend .env file:
# Windows
copy .env.example .env
# Mac/Linux
cp .env.example .env
IMPORTANT: Your API key won't work until you enable the service!
- Go to: https://console.cloud.google.com/apis/library/vision.googleapis.com
- Select your project from the dropdown (top bar)
- Click the blue "ENABLE" button
- Wait 1-2 minutes for activation
- Enable billing (required, but first 1000 requests/month are FREE)
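Once the API is enabled, a quick way to confirm the key works is to call the Vision REST endpoint directly. The sketch below uses only the standard library and the public `images:annotate` endpoint with `DOCUMENT_TEXT_DETECTION`. The function names and the choice of language hints are assumptions for illustration; the project's own `google_vision_ocr.py` may build its requests differently.

```python
import base64
import json
import urllib.request

VISION_URL = "https://vision.googleapis.com/v1/images:annotate"


def build_request(image_bytes: bytes, language_hints=("ur", "hi", "pa", "en")) -> dict:
    """Build the JSON body for a document OCR request (hints are an assumption)."""
    return {
        "requests": [
            {
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": "DOCUMENT_TEXT_DETECTION"}],
                "imageContext": {"languageHints": list(language_hints)},
            }
        ]
    }


def annotate(image_bytes: bytes, api_key: str) -> dict:
    """POST the image to the Vision API and return the decoded JSON response."""
    body = json.dumps(build_request(image_bytes)).encode("utf-8")
    request = urllib.request.Request(
        f"{VISION_URL}?key={api_key}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

If the API or billing is not yet enabled, `annotate` raises `urllib.error.HTTPError`, and the error body explains which step is missing.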
To test the system with realistic data:
# In backend folder (with venv activated)
python generate_disputed_lands_data.py
This creates 50 sample disputed land records with map coordinates.
- Open PowerShell in backend folder
- Activate virtual environment:
.\venv\Scripts\Activate.ps1
- Run the server:
python app.py
- You should see:
* Running on http://127.0.0.1:5000
Google Vision API Key loaded
Keep this terminal window open!
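To check from a script whether the backend is accepting connections on port 5000, a small socket probe is enough. This is an illustrative sketch; `is_port_open` is a hypothetical helper, not something the project ships.

```python
import socket


def is_port_open(host: str = "127.0.0.1", port: int = 5000, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError (e.g. connection refused) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

The same probe is useful before starting the server: if it returns True while the backend is stopped, another process is already using the port.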
- Open a NEW PowerShell in frontend folder
- Run the dev server:
npm run dev
- You should see:
➜ Local: http://localhost:5173/
- Open your browser and go to: http://localhost:5173
The application should now be running!
- Click "Dashboard" in the navbar (or login first)
- Go to "OCR Scanner" tab
- Upload a document:
- Drag & drop OR click "Browse Files"
- Supported: PDF, JPG, PNG
- Toggle "Use Google Vision API" (recommended for Urdu/Hindi)
- Click "Process Document"
- Wait for processing (10-30 seconds)
- View results:
- Extracted text
- Confidence score
- Detected language
- Actions:
  - Generate PDF - Creates formatted document
  - Save to Database - Permanent storage
  - Get AI Summary - Gemini-powered analysis
  - Ask Question - Q&A about document
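The three result fields above (text, confidence, language) can all be read from a Vision `DOCUMENT_TEXT_DETECTION` response. The sketch below assumes the raw REST response shape (`responses[0].fullTextAnnotation`) and averages page-level confidence. How the project's `confidence_scorer.py` computes its score is not documented here, so treat this as one possible approach.

```python
def summarize_ocr(response: dict) -> dict:
    """Extract text, mean page confidence, and detected languages from a Vision response."""
    first = (response.get("responses") or [{}])[0]
    annotation = first.get("fullTextAnnotation", {})
    pages = annotation.get("pages", [])

    # Page-level confidence is optional in the response, so skip pages without it.
    scores = [page["confidence"] for page in pages if "confidence" in page]

    languages: list[str] = []
    for page in pages:
        for lang in page.get("property", {}).get("detectedLanguages", []):
            code = lang.get("languageCode")
            if code and code not in languages:
                languages.append(code)

    return {
        "text": annotation.get("text", ""),
        "confidence": sum(scores) / len(scores) if scores else None,
        "languages": languages,
    }
```

An empty or error response yields empty text, a confidence of `None`, and no languages, which the caller can treat as "nothing recognized".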
- Go to "Translation" tab
- Upload document (Urdu, Hindi, Punjabi)
- Select languages:
- Source: Urdu (auto-detected)
- Target: English
- Click "Translate"
- View side-by-side comparison
- Download translated PDF
- Click "Disputed Lands" in navbar
- Toggle between:
  - Map View - Interactive OpenStreetMap
  - List View - Sortable table
- Filter by:
- District
- Tehsil
- Dispute Type (Refugee, Muhajireen, Overlapping, etc.)
- Status (Pending, Court Hearing, Resolved)
- Click on marker/row to view full details:
- Location (Khasra, Mauza, Tehsil)
- All claimants with CNIC
- Historical ownership
- Court case information
- Hearing dates
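The filters described above amount to matching each record against whichever criteria are set. A minimal sketch of that logic follows. The `DisputedLand` fields and the `filter_lands` function are hypothetical, chosen to mirror the filter names in this guide, and do not reflect the project's actual models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DisputedLand:
    """Hypothetical record shape; field names mirror the UI filters."""
    khasra: str
    district: str
    tehsil: str
    dispute_type: str
    status: str


def filter_lands(lands, district=None, tehsil=None, dispute_type=None, status=None):
    """Return records matching every filter that is set (case-insensitive)."""
    wanted = {
        "district": district,
        "tehsil": tehsil,
        "dispute_type": dispute_type,
        "status": status,
    }
    # Ignore filters left as None or empty, so no filters means "return everything".
    active = {name: value.casefold() for name, value in wanted.items() if value}
    return [
        land
        for land in lands
        if all(getattr(land, name).casefold() == value for name, value in active.items())
    ]
```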
- Go to "Farmer Registration"
- Fill in details:
- CNIC (National ID)
- Name, Father's Name
- Contact (Phone, Address)
- Land parcels (can add multiple)
- Submit
- View registered farmers in dashboard
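A CNIC is a 13-digit number, usually written in 5-7-1 groups (for example 12345-1234567-1). The sketch below shows one way to normalize user input to that form before submitting the registration. `normalize_cnic` is a hypothetical helper, and whether the application validates CNICs this way is an assumption.

```python
import re

# 13 digits, with or without the conventional hyphens after digits 5 and 12.
CNIC_PATTERN = re.compile(r"^(\d{5})-?(\d{7})-?(\d)$")


def normalize_cnic(value: str) -> str:
    """Return the CNIC in 5-7-1 hyphenated form, or raise ValueError."""
    match = CNIC_PATTERN.match(value.strip())
    if not match:
        raise ValueError(f"not a valid 13-digit CNIC: {value!r}")
    return "-".join(match.groups())
```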
OCR_python_Google-Vison/
├── backend/
│   ├── app.py                         # Flask application entry point
│   ├── config.py                      # Configuration & environment variables
│   ├── models.py                      # Database models (SQLAlchemy)
│   ├── extensions.py                  # Flask extensions (CORS, DB)
│   ├── requirements.txt               # Python dependencies
│   ├── .env                           # Environment variables (API keys)
│   │
│   ├── ocr/                           # OCR processing modules
│   │   ├── google_vision_ocr.py       # Google Vision API integration
│   │   ├── lightweight_ocr.py         # Tesseract-based OCR
│   │   ├── image_processing.py        # Image preprocessing
│   │   └── confidence_scorer.py       # Accuracy calculation
│   │
│   ├── translation/                   # Translation services
│   │   ├── ai4bharat_translator.py    # Indic language translation
│   │   ├── language_detector.py       # Auto-detect language
│   │   └── transliterator.py          # Script conversion
│   │
│   ├── document/                      # Document handling
│   │   ├── pdf_generator.py           # PDF creation
│   │   ├── upload_handler.py          # File uploads
│   │   └── rag_document_processor.py  # RAG for Q&A
│   │
│   ├── common/                        # Shared utilities
│   │   ├── gemini_ai.py               # Google Gemini integration
│   │   ├── text_cleaner.py            # Text normalization
│   │   └── supabase_client.py         # Database client
│   │
│   └── routes/                        # API endpoints
│       ├── ocr_routes.py              # OCR endpoints
│       ├── translation_routes.py      # Translation endpoints
│       ├── disputed_lands_routes.py   # Disputed lands API
│       ├── rag_routes.py              # RAG/Q&A endpoints
│       └── newsletter_routes.py       # Newsletter subscription
│
├── frontend/
│   ├── src/
│   │   ├── App.tsx                    # Main application component
│   │   ├── main.jsx                   # Entry point
│   │   │
│   │   ├── pages/                     # Page components
│   │   │   ├── LandingPage.tsx
│   │   │   ├── DashboardPage.jsx      # Main dashboard (OCR, Translation)
│   │   │   ├── DisputedLandsPage.jsx  # Disputed lands with map
│   │   │   ├── FarmerRegistrationPage.jsx
│   │   │   ├── LoginPage.jsx
│   │   │   └── SignupPage.jsx
│   │   │
│   │   ├── components/                # Reusable components
│   │   │   ├── Navbar.tsx
│   │   │   ├── Footer.tsx
│   │   │   ├── ImageUpload.jsx
│   │   │   └── GuidedTour.jsx
│   │   │
│   │   └── services/                  # API service layer
│   │       └── ocrService.js          # Backend API calls
│   │
│   ├── package.json                   # Node dependencies
│   └── vite.config.js                 # Vite configuration
│
├── PPT.md                             # Comprehensive project presentation
├── PROJECT_STATUS.md                  # Current status & issues
├── OCR_ENHANCEMENT_GUIDE.md           # Implementation guide
└── README.md                          # This file
Solution:
# Make sure virtual environment is activated (you should see (venv))
pip install -r requirements.txt
Solution:
- Go to https://console.cloud.google.com/apis/library/vision.googleapis.com
- Click "ENABLE"
- Enable billing (first 1000 requests are free)
- Wait 2 minutes, then try again
Solution:
- Open backend/.env
- Add line: GOOGLE_VISION_API_KEY=your-actual-key-here
- Save file
- Restart backend server
Solution:
# Windows - Kill process on port 5000
netstat -ano | findstr :5000
taskkill /PID <PID> /F
# Mac/Linux
lsof -ti:5000 | xargs kill -9
Solution:
- Reinstall Node.js from nodejs.org
- Make sure to check "Add to PATH" during installation
- Restart PowerShell/Terminal
Solution:
- Ensure backend is running (check http://127.0.0.1:5000 in browser)
- Check CORS configuration in backend/app.py
- Try restarting both servers
Solution:
# Delete node_modules and reinstall
rm -rf node_modules package-lock.json
npm install
Solution:
# Recreate database
cd backend
python
>>> from app import app, db
>>> with app.app_context():
... db.create_all()
>>> exit()
python generate_disputed_lands_data.py
We welcome contributions! Here's how:
- Fork the repository
- Create a feature branch:
git checkout -b feature/your-feature-name
- Make your changes
- Commit with clear messages:
git commit -m "feat: Add PDF export functionality"
- Push to your fork:
git push origin feature/your-feature-name
- Create a Pull Request on GitHub
- feat: New feature
- fix: Bug fix
- docs: Documentation changes
- style: Code formatting
- refactor: Code restructuring
- test: Adding tests
- chore: Maintenance tasks
This project is licensed under the MIT License - see the LICENSE file for details.
- GitHub Issues: Report bugs or request features
- Email: your-email@example.com
- Documentation: See PPT.md for a comprehensive project overview
- Google Cloud Vision API - OCR engine
- Google Gemini AI - Document analysis
- AI4Bharat - Indic language translation
- OpenStreetMap - Map data
- React Community - UI framework
- Flask Team - Backend framework
- Lines of Code: ~15,000+
- Languages: Python, TypeScript, JavaScript
- API Endpoints: 25+
- Database Tables: 5
- Supported Languages: 4 (Urdu, Hindi, Punjabi, English)
- Map Markers: Unlimited with clustering
- OCR processing with Google Vision API
- Multi-language translation
- Disputed lands management with map
- Farmer registration & dashboard
- PDF generation
- AI-powered summarization
- Mobile app (React Native)
- Offline OCR mode
- WhatsApp bot integration
- Blockchain-based land registry
- Drone boundary mapping
- Carbon credit integration
- Multilingual voice commands
- Always activate the virtual environment before running backend
- Use Google Vision API for Urdu/Hindi documents (better accuracy)
- Generate sample data to test disputed lands features
- Check backend logs if frontend shows errors
- Keep API keys secret - never commit .env files to Git
- Use VS Code with Python & ESLint extensions for best experience
Built with ❤️ for farmers and land administrators across South Asia