📸 Extract Text From Image

Free, private, AI-powered OCR tool running entirely in your browser — Extract text from images with 92-97% accuracy, zero API costs, and 100% data privacy.

✨ Features

🚀 Lightning Fast - 2-7 second processing time
🔒 100% Private - All processing happens in your browser
🤖 AI-Powered - Hybrid OCR using Tesseract.js + TrOCR transformers
📊 Smart Fallback - Automatically uses best method for each image
🎨 Image Enhancement - 8 preprocessing techniques for better accuracy
💰 Completely Free - No API costs, no hidden fees
🌓 Dark/Light Mode - Beautiful UI with theme toggle
📱 Mobile Friendly - Responsive design for all devices
✨ V3 Premium UI - Aurora gradients, glass-morphism, micro-interactions
⌨️ Keyboard Shortcuts - Quick actions with keyboard (C/D/H/T/Escape)
📚 Local History - Last 20 results saved in browser
♿ Fully Accessible - WCAG 2.1 AA compliant, keyboard navigation
🎯 SEO Optimized - Intent-specific pages with FAQ rich snippets

🎨 UI Variants

V3 Premium UI (Recommended)

Enable with VITE_UX_V2=1 environment variable for:

Aurora Background - Animated gradient blobs with grain texture
Glass Morphism - Modern glassmorphic cards with backdrop blur
Staged Progress - 3-step progress indicator (Upload → OCR → Render)
Confetti Animation - Celebratory micro-interaction on copy
History Drawer - Slide-in drawer with last 20 results
Keyboard Shortcuts:
- C - Copy result
- D - Download result
- H - View history
- T - Toggle theme
- Escape - Close drawer/clear result
Core Web Vitals - LCP < 1.8s, INP < 200ms, CLS < 0.1
5 Intent Pages - SEO-optimized routes with FAQ sections

Classic UI (Fallback)

Original simple UI without premium features.

🔍 SEO Defaults

All canonical and OG/Twitter image URLs are computed as absolute using VITE_SITE_URL (see .env.example).

Default OG image: /og/default.png (1200×630, logo/high contrast, ≤200KB recommended)
Set per-route OG via ogImage prop in SEO helpers/components.
Canonical links are always absolute.

How to set VITE_SITE_URL:

Locally: add to .env.local (e.g. VITE_SITE_URL=http://localhost:5173)
Vercel/Production: set in dashboard or vercel.json env block.

Validation: After deploy, validate with Twitter Card Validator and Facebook Sharing Debugger.

92-97% accuracy on printed text
70-85% accuracy on handwritten text
Automatic confidence scoring
Intelligent fallback for low-confidence results

🧩 Spec-Driven Development (SDD)

This repo uses Spec-Driven Development for reliability and accessibility. All features, APIs, schemas, and UX contracts are defined in /specs and enforced in CI.

How to work with specs

Product features: /specs/product/*.feature (BDD scenarios)
API contracts: /specs/api/openapi.yml (OpenAPI)
Schemas: /specs/schemas/*.json (JSON Schema)
UX tokens/contracts: /specs/ux/tokens.json, /specs/ux/components/*.md
AI goldens: /specs/ai/evals/goldens/*.json

How AI agents should use specs

Always read relevant specs before proposing changes
Update tests and contracts if specs change
Use BDD runner and schema guard helpers for validation

PR Checklist

If spec changed, update tests & ADR
BDD passing
Schema/contract diffs clean
No SEO/a11y regressions

CI Gates

npm run spec:lint — Spectral lint for OpenAPI
npm run spec:schemas — Validate JSON schemas and tokens
npm run spec:test — Playwright BDD tests
npm run spec:diff — OAS diff for breaking changes

Prerequisites

Node.js 20+
npm 10+ or yarn 4+

Quick Start (Development)

# Clone and setup
git clone https://github.com/ParthibanRajasekaran/text-from-image.git
cd text-from-image
npm install

# Start dev server
npm run dev
# Visit http://localhost:5173

# Run tests
npm run test

# Build for production
npm run build

🚀 Deployment

Deploy to Vercel (Production)

Deploy your EXACT local code to production with zero drift:

# First time setup
npm run vercel:link         # Link to your Vercel project
npm run vercel:pull:prod    # Pull production environment variables

# Deploy to production
npm run deploy:prod

After deployment, verify the footer shows the correct commit SHA matching your local git commit.

GitHub Actions (Automated Deploys)

This project uses controlled deployments to avoid accidental deploys:

Preview Deployments (Pull Requests):

Only triggers when PR has deploy:preview label
Deploys to preview URL, posted as PR comment
Cancelled automatically when new commits pushed

Production Deployments (Main Branch):

Automatically deploys on push to main branch
Skip with [skip deploy] in commit message
Only deploys if app files changed (not docs-only)

Manual Deployment:

GitHub UI: Actions → Deploy to Vercel → Run workflow

Required GitHub Secrets

Add these in Settings → Secrets and variables → Actions:

Secret	Where to get it
`VERCEL_TOKEN`	https://vercel.com/account/tokens
`VERCEL_ORG_ID`	Run `vercel link`, check `.vercel/project.json`
`VERCEL_PROJECT_ID`	Run `vercel link`, check `.vercel/project.json`

Environment Variables (Vercel)

Set these in Vercel Dashboard → Project → Settings → Environment Variables:

# Feature Flags (Production environment)
VITE_UX_V2=true                    # Enable enhanced UI
VITE_ANALYTICS_ENABLED=true        # Enable Web Vitals tracking
VITE_OCR_MIN_CONFIDENCE=60         # Tesseract confidence threshold

📖 Full deployment guide: See docs/DEPLOYMENT.md for:

Zero-drift deployment strategy
Bundle size monitoring
CI/CD setup
Troubleshooting common issues

🏗️ Architecture

Application Structure

pages/               # SEO-optimized route pages (lazy-loaded)
  ├── ImageToText.tsx
  ├── ExtractTextFromImage.tsx
  └── ...
    
components/         # Reusable React components
  ├── v3/           # Premium UI (Aurora, Glass, History)
  ├── AdSlot.tsx    # AdSense slot containers
  └── ...

hooks/              # Custom React hooks (useShortcuts, useLocalHistory, etc.)

lib/                # Business logic
  ├── consent.ts    # Consent Mode v2 management
  ├── analytics.ts  # Web Vitals tracking
  └── config/       # Feature flags & env guards

services/           # OCR engines (Tesseract, TrOCR)

__tests__/          # Test suite (routes, SEO, a11y, deprecation)

Request Flow: Image Upload → OCR → Result

User Upload
    ↓
File Validation (type, size)
    ↓
Image Preprocessing (optional)
    ↓
Tesseract OCR (fast, 2-5s)
    ├─ Confidence ≥ 60%? → Return ✓
    └─ Fallback → TrOCR AI (5-10s) → Return ✓
    ↓
Result Display + Copy/Download

Tech Stack

Layer	Technology
UI	React 19.2 + TypeScript 5.8
Build	Vite 6.2 + Rollup (code splitting)
OCR	Tesseract.js 5.1 + TrOCR (AI)
Styling	CSS + Tailwind + Custom properties
State	React Hooks (useReducer, Context)
Testing	Vitest + Testing Library
Deployment	Vercel (automated via GitHub Actions)

Feature Flags & Configuration

See .env.example for all available flags:

Flag	Purpose	Default
`VITE_UX_V2`	Enable premium UI (Aurora/Glass)	`false`
`VITE_ADSENSE_PUB_ID`	AdSense publisher ID (prod only)	unset
`VITE_SITE_URL`	Canonical domain for SEO	computed
`VITE_OCR_MIN_CONFIDENCE`	Tesseract confidence threshold	`60`

See also: Env Guards for runtime validation.

🛠️ How It Works

1. Image Upload

Supports PNG, JPG, WEBP
Max file size: 20MB
Drag & drop or click to browse

2. Processing

Step 1: Validate file (type, size, browser compatibility)
Step 2: Apply image preprocessing (optional, auto-enabled)
Step 3: Try Tesseract OCR (fast method)
Step 4: If confidence < 60%, fallback to TrOCR AI model

3. Results

Extracted text with confidence score
Copy to clipboard
Download as .txt or .doc
Shows which method was used

📊 Performance

Metric	Value
Speed	2-7 seconds average
Accuracy	92-97% (printed text)
Cost	$0 (completely free)
Privacy	100% client-side
Success Rate	95%+ with hybrid approach

vs Google Gemini (Previous)

Aspect	Gemini API	This App	Winner
Speed	1-2s	2-7s	Gemini
Accuracy	95-99%	92-97%	Gemini
Cost	$0.001-0.01/image	$0	This App 🏆
Privacy	❌ Cloud	✅ Local	This App 🏆
Offline	❌ No	✅ Yes*	This App 🏆

*After first-time model download

🔒 Privacy

✅ 100% client-side processing - No server uploads
✅ No data collection - Your images never leave your device
✅ No API keys required - No external dependencies
✅ No tracking - Privacy-first design
✅ GDPR compliant - No personal data processed

🚀 Deployment

How We Deploy

This project uses GitHub Actions for automated CI/CD with label-gated preview deploys to avoid waste:

Feature Branch → PR → [add deploy:preview label] → Preview Deploy ✓
          ↓
Main Branch → Push → Auto-Production Deploy ✓

Key principles:

✅ All code changes go through tests first
✅ Production deploys only from main branch
✅ Preview deploys opt-in (require deploy:preview label)
✅ Skip deploys with [skip deploy] in commit message

Deployment workflow: .github/workflows/deploy.yml
Environment config: docs/DEPLOYMENT.md (setup, troubleshooting, env vars)

Local Deployment (Testing)

# Setup (one-time)
npm install -g vercel
vercel link --yes

# Deploy preview locally
vercel

# Deploy production locally
vercel --prod --confirm

📦 Bundle Size & Code Splitting

Optimizations

This project uses aggressive code splitting to minimize initial load time:

✅ Dynamic Imports - Heavy libraries loaded only when needed
✅ Vendor Chunking - React, Tesseract, Transformers in separate chunks
✅ Lazy Loading - OCR engines loaded on-demand, not upfront
✅ Tree Shaking - Unused code automatically removed

Bundle Breakdown

Chunk	Size (gzipped)	When Loaded
`index.js` (main)	~145 KB	Initial page load
`react-vendor.js`	~140 KB	Initial page load
`tesseract.js`	~23 KB	When user starts OCR
`transformers.js`	~199 KB	When AI fallback needed
`jspdf.js`	~40 KB	When user exports to PDF

First-load JS: ~285 KB (gzipped) ✅
Total app (all features): ~547 KB (gzipped)

Analyze Bundle

# Generate interactive bundle visualization
npm run analyze

# Opens stats.html with treemap of all chunks

The analyzer shows:

Which dependencies contribute most to bundle size
How code splitting distributes code across chunks
Gzipped and Brotli compressed sizes
Duplicate dependencies (if any)

How Code Splitting Works

1. Tesseract.js (23 KB)

// ❌ Before: Loaded upfront
import Tesseract from 'tesseract.js';

// ✅ After: Loaded when user starts OCR
const { recognize } = await import('tesseract.js');

2. Transformers (199 KB)

// ❌ Before: Loaded upfront
import { pipeline } from '@xenova/transformers';

// ✅ After: Loaded only as fallback or when requested
const { pipeline } = await import('@xenova/transformers');

3. jsPDF (40 KB)

// ❌ Before: CDN script tag in index.html
<script src="https://cdn.../jspdf.umd.min.js"></script>

// ✅ After: Loaded only when user exports PDF
const { default: jsPDF } = await import('jspdf');

Performance Impact

Metric	Before	After	Improvement
First load JS	~850 KB	~285 KB	67% reduction
Time to Interactive	~5.2s	~1.8s	3.4s faster
Lighthouse Score	72	94	+22 points
Build warnings	"Some chunks > 500 KB"	✅ None	Fixed

Best Practices

✅ DO:

Keep initial bundle under 300 KB (gzipped)
Use dynamic imports for heavy libraries
Analyze bundle regularly with npm run analyze
Lazy-load features users might not use

❌ DON'T:

Import entire libraries when you only need a few functions
Load OCR engines upfront before user uploads image
Bundle large assets (WASM, models) - serve from CDN or /public

Related Files

vite.config.ts - Rollup manual chunks configuration
services/*Service.ts - Dynamic imports for OCR engines
utils/fileUtils.ts - Dynamic import for jsPDF
package.json - Bundle analyzer script

💸 Monetization (AdSense)

To apply for AdSense and enable monetization:

Go to AdSense “Sites” and add https://freetextfromimage.com.
Ensure the site has live policy pages (/privacy-policy, /terms, /about, /contact) and helpful guides (/image-to-text, etc.).
Add a Google-certified CMP and wire its callback to updateConsent() in src/consent/consent.ts.
- Consent Mode v2 must fire before any ad scripts load.
Set VITE_ADSENSE_PUB_ID in Vercel (Production) after approval.
Add your publisher ID to /public/ads.txt.
Use Search Console to submit your sitemap and fix Page Indexing/Core Web Vitals issues.
Placement rules:
- No ads near upload/CTA areas; avoid accidental clicks.
- All ad slots reserve space to prevent CLS.
- Ads only load after consent and in Production.

Testing:

Run npm run sitemap to generate /public/sitemap.xml.
Run npm run build && npm run test to verify build and ad readiness.

CMP Integration:

See comments in src/consent/consent.ts for wiring a certified CMP callback.
Verify Consent Mode v2 fires before ads script loads.

AdSense Approval:

Update /public/ads.txt with your real publisher ID after approval.
Use AdSense dashboard for auto ad configuration.

SEO:

All pages have unique meta tags and canonical URLs.
Guides and policy pages are crawlable and linked from Home/footer.
robots.txt and sitemap.xml are present.

Performance:

Ad slots use fixed min-height to prevent layout shift.
No OCR/UX regressions; light/dark mode consistent.

🤝 Contributing

See CONTRIBUTING.md for:

Code standards & TypeScript conventions
Accessibility (a11y) requirements
Testing & how to run the test suite
Commit message conventions
PR process & deprecation policy

Quick checklist before submitting PR:

npm run lint        # Fix linting issues
npm run test        # Run full test suite
npm run build       # Verify production build

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

Third-Party Licenses

Tesseract.js - Apache 2.0
@xenova/transformers - Apache 2.0
React - MIT
Vite - MIT

All dependencies are compatible with commercial use.

🙏 Acknowledgments

Tesseract.js - Traditional OCR engine
Transformers.js - Browser-based AI models
Microsoft TrOCR - AI OCR model

📮 Contact

Repository: github.com/ParthibanRajasekaran/text-from-image
Issues: github.com/ParthibanRajasekaran/text-from-image/issues

🌟 Show Your Support

If you find this project useful, please give it a ⭐️ on GitHub!

Made with ❤️ by Parthiban Rajasekaran

Name		Name	Last commit message	Last commit date
Latest commit History 208 Commits
.github/workflows		.github/workflows
__tests__		__tests__
_deprecated		_deprecated
app		app
components		components
docs		docs
hooks		hooks
lib		lib
pages		pages
public		public
scripts		scripts
services		services
specs		specs
src		src
types		types
utils		utils
.env.example		.env.example
.gitignore		.gitignore
.spectral.yaml		.spectral.yaml
.spectral.yml		.spectral.yml
App.tsx		App.tsx
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
MIGRATION.md		MIGRATION.md
PR_NOTES_ADSENSE_READINESS.md		PR_NOTES_ADSENSE_READINESS.md
README.md		README.md
eslint.config.mjs		eslint.config.mjs
index.css		index.css
index.html		index.html
index.tsx		index.tsx
jest.config.js		jest.config.js
jest.setup.ts		jest.setup.ts
package-lock.json		package-lock.json
package.json		package.json
postcss.config.js		postcss.config.js
router.tsx		router.tsx
stats.html		stats.html
tailwind.config.js		tailwind.config.js
tsconfig.json		tsconfig.json
validate-production.sh		validate-production.sh
vercel.json		vercel.json
vercel.json.example		vercel.json.example
vite-env.d.ts		vite-env.d.ts
vite.config.ts		vite.config.ts

License

ParthibanRajasekaran/text-from-image

Folders and files

Latest commit

History

Repository files navigation

📸 Extract Text From Image

✨ Features

🎨 UI Variants

V3 Premium UI (Recommended)

Classic UI (Fallback)

🔍 SEO Defaults

🧩 Spec-Driven Development (SDD)

How to work with specs

How AI agents should use specs

PR Checklist

CI Gates

Prerequisites

Quick Start (Development)

🚀 Deployment

Deploy to Vercel (Production)

GitHub Actions (Automated Deploys)

Required GitHub Secrets

Environment Variables (Vercel)

🏗️ Architecture

Application Structure

Request Flow: Image Upload → OCR → Result

Tech Stack

Feature Flags & Configuration

🛠️ How It Works

1. Image Upload

2. Processing

3. Results

📊 Performance

vs Google Gemini (Previous)

🔒 Privacy

🚀 Deployment

How We Deploy

Local Deployment (Testing)

📦 Bundle Size & Code Splitting

Optimizations

Bundle Breakdown

Analyze Bundle

How Code Splitting Works

Performance Impact

Best Practices

Related Files

💸 Monetization (AdSense)

🤝 Contributing

📄 License

Third-Party Licenses

🙏 Acknowledgments

📮 Contact

🌟 Show Your Support

About

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 3

Uh oh!

Languages

Packages