udayDeveloper1/PdfEditor

PDF Editor

A full-stack PDF editor application with a React Native frontend and a Flask backend.

Features

Core PDF Operations

  • Upload PDF - Upload and store PDF files
  • View PDFs - Browse and view your PDF documents
  • Merge PDFs - Combine multiple PDF files into one
  • Split PDF - Extract specific pages from a PDF
  • Rotate Pages - Rotate pages in your PDF documents
  • Delete Documents - Remove PDF documents from the system
  • Rename Documents - Update document metadata and filenames
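Splitting by "specific pages" implies some way of naming those pages. The following is a minimal sketch, assuming pages are given as a 1-based string such as "1-3,5". The request format that /api/split actually accepts is not documented here, and `parse_page_ranges` is a hypothetical helper, not part of the backend.

```python
def parse_page_ranges(spec: str, page_count: int) -> list[int]:
    """Expand a spec like "1-3,5" into 1-based page numbers, preserving order."""
    pages: list[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        # Reject reversed ranges and pages that the document does not have.
        if not (1 <= start <= end <= page_count):
            raise ValueError(f"page range {part!r} is outside 1..{page_count}")
        pages.extend(range(start, end + 1))
    return pages
```

For a six-page document, `parse_page_ranges("1-3,5", 6)` yields pages 1, 2, 3, and 5, which could then be mapped to zero-based indices for a PDF library.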

Advanced PDF Operations

  • Add Watermark - Add text watermarks with custom position and opacity
  • Encrypt PDF - Password-protect PDF files with encryption
  • Decrypt PDF - Remove password protection from PDFs
  • Compress PDF - Reduce file size with quality control (low, medium, high)
  • PDF to Images - Convert PDF pages to PNG/JPG images
  • Images to PDF - Create PDF from multiple image files
  • OCR Text Extraction - Extract text from scanned PDFs using OCR
  • Page Reordering - Reorder, duplicate, or remove specific pages
  • PDF Thumbnails - Generate preview thumbnails for PDF pages
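Page reordering, duplication, and removal can all be described by a single list of page numbers. The sketch below illustrates that idea with plain strings standing in for PDF page objects. The function name `apply_page_order` and the 1-based numbering are assumptions for illustration, not the backend's actual interface.

```python
def apply_page_order(pages: list, order: list[int]) -> list:
    """Return pages arranged by a list of 1-based page numbers.

    Repeating a number duplicates that page; omitting it removes the page.
    """
    for number in order:
        if not 1 <= number <= len(pages):
            raise ValueError(f"page {number} is outside 1..{len(pages)}")
    return [pages[number - 1] for number in order]


original = ["cover", "intro", "body", "appendix"]
print(apply_page_order(original, [1, 3, 3, 2]))  # ['cover', 'body', 'body', 'intro']
```

Here the appendix is dropped, the body is duplicated, and the intro moves to the end.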

Document Management

  • Search & Filter - Search documents by name and filter results
  • Sort Documents - Sort by name, size, pages, or date
  • Document Statistics - View total documents, storage used, and page counts
  • Metadata Editor - View and edit PDF metadata (title, author, subject, etc.)
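The search and sort behaviour listed above can be sketched in a few lines. This is an illustration only: the `Document` fields and the `search_and_sort` function are hypothetical names, and the real backend may filter and sort in the database instead.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Document:
    name: str
    size_bytes: int
    pages: int
    uploaded: date


# One key function per sort field named in the feature list.
SORT_KEYS = {
    "name": lambda d: d.name.lower(),
    "size": lambda d: d.size_bytes,
    "pages": lambda d: d.pages,
    "date": lambda d: d.uploaded,
}


def search_and_sort(documents, query="", sort_by="name", descending=False):
    """Keep documents whose name contains `query` (case-insensitive), then sort."""
    if sort_by not in SORT_KEYS:
        raise ValueError(f"unsupported sort field: {sort_by!r}")
    needle = query.lower()
    matches = [d for d in documents if needle in d.name.lower()]
    return sorted(matches, key=SORT_KEYS[sort_by], reverse=descending)
```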

User Management

  • User Authentication - Register and log in with JWT tokens
  • User Accounts - Personal document library per user
  • Session Management - Secure session handling with Redis

Tech Stack

Backend

  • Flask - Python web framework
  • PostgreSQL - Database for storing document metadata
  • Redis - Caching, session management, and job queue
  • PyPDF2 - PDF manipulation library
  • PyMuPDF (fitz) - Advanced PDF operations and rendering
  • reportlab - PDF generation and watermarking
  • pikepdf - PDF encryption and security
  • pdf2image - PDF to image conversion
  • pytesseract - OCR text extraction
  • img2pdf - Image to PDF conversion
  • Flask-JWT-Extended - JWT authentication
  • Flask-Bcrypt - Password hashing
  • Celery - Background task processing

Frontend

  • React Native - Cross-platform mobile framework
  • Expo - Development platform
  • React Native Paper - Material Design components
  • React Navigation - Navigation library

Infrastructure

  • Docker - Containerization
  • Docker Compose - Multi-container orchestration

Prerequisites

  • Docker and Docker Compose
  • Node.js 18+ (for frontend development)
  • Python 3.11+ (for local backend development)
  • Tesseract OCR (for OCR functionality)
    • Windows: Download from GitHub
    • Mac: brew install tesseract
    • Linux: sudo apt-get install tesseract-ocr
  • Poppler (for PDF to image conversion)
    • Windows: Download from GitHub
    • Mac: brew install poppler
    • Linux: sudo apt-get install poppler-utils

Quick Start

1. Clone the repository

git clone <repository-url>
cd pdf_editor_project

2. Configure environment variables (optional)

Create a .env file in the backend directory:

# Database
DATABASE_URL=postgresql://postgres:postgres@db:5432/pdfeditor

# Redis
REDIS_HOST=redis
CELERY_BROKER_URL=redis://redis:6379/0
CELERY_RESULT_BACKEND=redis://redis:6379/0

# Security
JWT_SECRET_KEY=your-secret-key-change-in-production

# File Upload (limit in bytes; 104857600 = 100 MB)
MAX_CONTENT_LENGTH=104857600
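For reference, this is one way a Flask backend might read these values. It is a sketch under assumptions: `load_config` is a hypothetical helper, and how app.py actually loads its settings is not shown in this README. The defaults simply mirror the example values above.

```python
import os


def load_config(env=os.environ) -> dict:
    """Read settings from the environment, falling back to the example values."""
    return {
        "DATABASE_URL": env.get(
            "DATABASE_URL", "postgresql://postgres:postgres@db:5432/pdfeditor"
        ),
        "REDIS_HOST": env.get("REDIS_HOST", "redis"),
        "CELERY_BROKER_URL": env.get("CELERY_BROKER_URL", "redis://redis:6379/0"),
        # Environment values are strings, so the byte limit is converted explicitly.
        "MAX_CONTENT_LENGTH": int(env.get("MAX_CONTENT_LENGTH", 100 * 1024 * 1024)),
    }
```

The explicit `int(...)` conversion is why the upload limit should be a bare number in the .env file: any trailing text on that line would make the conversion fail.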

3. Start the backend services with Docker

docker-compose up -d

This will start:

  • PostgreSQL database on port 5432
  • Redis on port 6379
  • Flask backend on port 5000

Then verify that the backend is running:

curl http://localhost:5000/health

You should see:

{
  "status": "healthy",
  "database": "connected",
  "redis": "connected"
}
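The same check can be scripted, for example in a deployment or CI step. The sketch below treats the backend as healthy only when all three fields match the response shown above. The function names are hypothetical, and what an unhealthy response looks like is an assumption.

```python
import json
from urllib.request import urlopen


def is_healthy(body: str) -> bool:
    """True only when the status, database, and redis fields all report success."""
    payload = json.loads(body)
    return (
        payload.get("status") == "healthy"
        and payload.get("database") == "connected"
        and payload.get("redis") == "connected"
    )


def check_backend(url: str = "http://localhost:5000/health", timeout: float = 5.0) -> bool:
    """Fetch the health endpoint and evaluate its JSON body."""
    with urlopen(url, timeout=timeout) as response:
        return is_healthy(response.read().decode("utf-8"))
```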

4. Set up the frontend

cd frontend
npm install

5. Update API URL

Edit frontend/src/services/api.js and update the API_BASE_URL:

// For local development
const API_BASE_URL = 'http://localhost:5000/api';

// For Android emulator
// const API_BASE_URL = 'http://10.0.2.2:5000/api';

// For iOS simulator
// const API_BASE_URL = 'http://localhost:5000/api';

// For physical device (replace with your computer's IP)
// const API_BASE_URL = 'http://192.168.1.XXX:5000/api';

6. Start the frontend

npm start

Then:

  • Press w for web
  • Press a for Android
  • Press i for iOS
  • Scan the QR code with the Expo Go app on your phone

API Endpoints

Health Check

  • GET /health - Check backend health status

Authentication

  • POST /api/auth/register - Register a new user
  • POST /api/auth/login - Log in a user
  • GET /api/auth/me - Get current user info (requires JWT)
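A client typically logs in once and then sends the returned token with each protected request. The sketch below builds both requests with the standard library without sending them. The JSON field names `username` and `password` are assumptions, since the request schema is not documented here. The `Authorization: Bearer <token>` header is the Flask-JWT-Extended default.

```python
import json
from urllib.request import Request

API_BASE_URL = "http://localhost:5000/api"


def build_login_request(username: str, password: str) -> Request:
    """Prepare POST /api/auth/login with a JSON body (field names are assumed)."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return Request(
        f"{API_BASE_URL}/auth/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_me_request(token: str) -> Request:
    """Prepare GET /api/auth/me, passing the JWT as a bearer token."""
    return Request(
        f"{API_BASE_URL}/auth/me",
        headers={"Authorization": f"Bearer {token}"},
    )
```

Either request can be sent with `urllib.request.urlopen(request)` once the backend is running.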

Documents

  • POST /api/upload - Upload a PDF file
  • GET /api/documents - List all documents (supports search, sort, filter)
  • GET /api/documents/<id> - Get document details
  • PUT /api/documents/<id> - Update document metadata (rename)
  • DELETE /api/documents/<id> - Delete a document
  • GET /api/documents/<id>/download - Download a document
  • GET /api/documents/<id>/thumbnail?page=1 - Get page thumbnail
  • GET /api/documents/stats - Get document statistics
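As an illustration of how a client might address these endpoints, the sketch below builds the list and thumbnail URLs. The `page` parameter comes from the endpoint list above. The query parameter names `search`, `sort`, and `order` are assumptions, because the README only says that search, sort, and filter are supported.

```python
from urllib.parse import urlencode

API_BASE_URL = "http://localhost:5000/api"


def documents_url(search: str = "", sort: str = "", order: str = "") -> str:
    """Build the list URL; the parameter names search/sort/order are assumptions."""
    candidates = {"search": search, "sort": sort, "order": order}
    # Leave out empty values so the URL stays minimal.
    params = {name: value for name, value in candidates.items() if value}
    query = urlencode(params)
    return f"{API_BASE_URL}/documents" + (f"?{query}" if query else "")


def thumbnail_url(document_id: int, page: int = 1) -> str:
    """Build the thumbnail URL for one page of a document."""
    query = urlencode({"page": page})
    return f"{API_BASE_URL}/documents/{document_id}/thumbnail?{query}"
```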

Core Operations

  • POST /api/merge - Merge multiple PDFs
  • POST /api/split - Split a PDF by pages
  • POST /api/rotate - Rotate PDF pages
  • POST /api/reorder - Reorder PDF pages
  • GET /api/operations - List all operations
  • GET /api/operations/<id> - Get operation status
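Because operations can run in the background, a client may need to poll GET /api/operations/<id> until the work finishes. The following is a minimal sketch of that loop. The status values "completed" and "failed" and the shape of the response are assumptions, and `fetch_status` is a caller-supplied function that performs the actual HTTP request.

```python
import time
from typing import Callable


def wait_for_operation(
    fetch_status: Callable[[int], dict],
    operation_id: int,
    interval: float = 1.0,
    max_attempts: int = 30,
) -> dict:
    """Poll until the operation reaches a final state or the attempts run out."""
    for attempt in range(max_attempts):
        operation = fetch_status(operation_id)
        if operation.get("status") in ("completed", "failed"):
            return operation
        if attempt < max_attempts - 1:
            time.sleep(interval)  # wait before asking again
    raise TimeoutError(
        f"operation {operation_id} did not finish after {max_attempts} checks"
    )
```

Passing the fetch function in keeps the polling logic independent of any particular HTTP client and makes it easy to test.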

Advanced Operations

  • POST /api/watermark - Add watermark to PDF
  • POST /api/encrypt - Encrypt PDF with password
  • POST /api/decrypt - Decrypt PDF
  • POST /api/compress - Compress PDF file
  • POST /api/pdf-to-images - Convert PDF pages to images
  • POST /api/images-to-pdf - Convert images to PDF
  • POST /api/ocr - Extract text using OCR

Redis Usage

Redis is used for:

  1. Caching - Document metadata caching (1 hour TTL) ✅
  2. Session Management - JWT token management and user sessions ✅
  3. Job Queue - Celery broker for background task processing ✅
  4. Rate Limiting - API rate limiting (future feature)
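The metadata caching in item 1 is commonly implemented as a cache-aside lookup. The sketch below shows that pattern with the 1 hour TTL. It assumes a client object offering `get` and `setex`, as redis-py's `Redis` class does. The key format and the `load_from_db` callback are hypothetical, and the backend's real caching code may differ.

```python
import json

CACHE_TTL_SECONDS = 3600  # the 1 hour TTL mentioned above


def get_document_metadata(cache, document_id: int, load_from_db) -> dict:
    """Cache-aside lookup: try Redis first, fall back to the database, then store."""
    key = f"document:{document_id}"  # hypothetical key naming scheme
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)  # cache hit: no database access needed
    metadata = load_from_db(document_id)
    # Store the serialized metadata so it expires automatically after the TTL.
    cache.setex(key, CACHE_TTL_SECONDS, json.dumps(metadata))
    return metadata
```

With redis-py, `cache` could be created as `redis.Redis(host="redis")`, matching the REDIS_HOST value used in the Docker setup.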

Docker Commands

# Start all services
docker-compose up -d

# Stop all services
docker-compose down

# View logs
docker-compose logs -f backend

# Rebuild backend
docker-compose up -d --build backend

# Access database
docker exec -it pdf_editor_db psql -U postgres -d pdfeditor

# Access Redis CLI
docker exec -it pdf_editor_redis redis-cli

Development

Backend Development

cd backend
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
python app.py

Frontend Development

cd frontend
npm start

Project Structure

pdf_editor_project/
├── backend/
│   ├── app.py              # Flask application
│   ├── requirements.txt    # Python dependencies
│   ├── Dockerfile          # Backend Docker image
│   └── .env.example        # Environment variables template
├── frontend/
│   ├── src/
│   │   ├── screens/        # React Native screens
│   │   └── services/       # API service layer
│   ├── App.js              # Main app component
│   ├── package.json        # Node dependencies
│   └── app.json            # Expo configuration
├── docker-compose.yml      # Docker orchestration
└── README.md               # This file

Troubleshooting

Backend not connecting to database

  • Ensure PostgreSQL container is healthy: docker-compose ps
  • Check logs: docker-compose logs db

Frontend can't reach backend

  • Verify backend is running: curl http://localhost:5000/health
  • Update API_BASE_URL in frontend/src/services/api.js
  • For Android emulator, use 10.0.2.2 instead of localhost
  • For physical device, use your computer's local IP address

Redis connection issues

  • Check Redis is running: docker-compose ps redis
  • Test connection: docker exec -it pdf_editor_redis redis-cli ping

Implemented Features ✅

  • Add watermark functionality
  • User authentication and authorization
  • PDF compression
  • OCR text extraction
  • PDF encryption/decryption
  • PDF to image conversion
  • Image to PDF conversion
  • Page reordering
  • PDF preview thumbnails
  • Document search and filtering
  • Document statistics dashboard
  • Metadata editor
  • Delete and rename documents

Future Enhancements

  • PDF annotation tools (highlights, comments, drawings)
  • Cloud storage integration (Google Drive, Dropbox, OneDrive)
  • Advanced batch processing with progress tracking
  • PDF comparison tool
  • Form filling and extraction
  • Digital signatures
  • PDF/A conversion for archiving
  • Advanced OCR with language selection
  • Collaborative editing
  • API rate limiting
  • Email PDF documents
  • Schedule automated PDF operations

License

MIT

About

Build a custom PDF editor and run it using Docker.

Languages

  • JavaScript 70.2%
  • Python 28.9%
  • Other 0.9%