rlgrpe/ocr-raycast

OCR Screenshot Tool

A macOS tool that captures a screen region, runs OCR on it via the native Vision API, and copies the recognized text to the clipboard. It integrates with Raycast as a Script Command.

How It Works

  1. Press a hotkey in Raycast
  2. Select a screen area
  3. The text is recognized and copied to the clipboard
  4. A sound plays to confirm

Requirements

  • macOS 13+ (Ventura or later)
  • Python 3.10+
  • Raycast

Installation

bash install.sh

This will:

  • Create a Python virtual environment
  • Install dependencies
  • Register a LaunchAgent (auto-starts the OCR server on login)
  • Wait for the server to be ready
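
As an illustration of the LaunchAgent step, here is a minimal sketch of how such a job description could be generated with Python's standard `plistlib`. The label `com.ocr.server` comes from the Uninstall section; the exact keys used by `install.sh` are not shown in this README, so the `ProgramArguments`, `WorkingDirectory`, `RunAtLoad`, and `KeepAlive` values below are assumptions about a typical setup, and `build_launch_agent` / `render_plist` are hypothetical helper names.

```python
import plistlib
from pathlib import Path

LABEL = "com.ocr.server"  # label taken from the Uninstall section


def build_launch_agent(project_dir: Path) -> dict:
    """Return a launchd job description for the OCR server (assumed layout)."""
    return {
        "Label": LABEL,
        # Assumption: the server is started with the project's virtualenv Python.
        "ProgramArguments": [
            str(project_dir / ".venv" / "bin" / "python"),
            str(project_dir / "ocr_server.py"),
        ],
        "WorkingDirectory": str(project_dir),
        "RunAtLoad": True,  # start as soon as the agent is loaded (at login)
        "KeepAlive": True,  # ask launchd to relaunch the process if it exits
    }


def render_plist(project_dir: Path) -> bytes:
    """Serialize the job description as an XML property list."""
    return plistlib.dumps(build_launch_agent(project_dir))


if __name__ == "__main__":
    # Print the plist for the current directory instead of installing it.
    print(render_plist(Path.cwd()).decode("utf-8"))
```

The output would be saved as `~/Library/LaunchAgents/com.ocr.server.plist`, which is the file the Uninstall section removes.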

Then in Raycast:

  1. Open Settings > Extensions > Script Commands
  2. Add this project directory
  3. Assign a hotkey to "OCR Screenshot"

Architecture

OCR Server (ocr_server.py): a FastAPI server listening on 127.0.0.1:19876 that performs recognition with the macOS Vision API. It runs as a LaunchAgent and restarts automatically after a crash.

Raycast Script (ocr.sh): captures a screen region via screencapture -i, sends the image to the server, copies the result to the clipboard, and plays a sound.
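
The flow of the Raycast script can be sketched in Python as follows. This is not the contents of `ocr.sh`, only an illustration of the same four steps. The names `capture_command` and `run_pipeline`, the choice of `Glass.aiff` as the confirmation sound, and the behavior on a cancelled selection are assumptions; the upload step is passed in as a `recognize` callable so the sketch stays independent of the server's response format.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import Callable, List, Optional

# Assumption: any system sound would do; the README does not say which one is used.
SOUND = "/System/Library/Sounds/Glass.aiff"


def capture_command(target: Path) -> List[str]:
    """Build the screencapture call; -i lets the user select a region."""
    return ["screencapture", "-i", str(target)]


def run_pipeline(
    recognize: Callable[[bytes], str],
    run=subprocess.run,
) -> Optional[str]:
    """Capture a region, OCR it, copy the text, and play a sound."""
    with tempfile.TemporaryDirectory() as tmp:
        shot = Path(tmp) / "shot.png"
        run(capture_command(shot))
        # Assumed: no file is written when the selection is cancelled.
        if not shot.exists():
            return None
        text = recognize(shot.read_bytes())
    # pbcopy reads the clipboard contents from standard input.
    run(["pbcopy"], input=text.encode("utf-8"), check=True)
    run(["afplay", SOUND], check=True)
    return text
```

Injecting `run` keeps the sketch testable on machines without the macOS command-line tools; in real use the default `subprocess.run` is enough.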

Supported Languages

Ukrainian, Russian, and English. The language is detected automatically by the Vision framework.

Manual Usage

# Start the server
.venv/bin/python ocr_server.py

# OCR a file
curl -s -X POST http://127.0.0.1:19876/ocr -F "file=@screenshot.png"

# Health check
curl http://127.0.0.1:19876/health
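
For scripting, the same two requests can be made from Python using only the standard library. The sketch below mirrors the curl commands above: a multipart upload with a form field named `file` to `/ocr`, and a GET to `/health`. The helper names (`encode_multipart`, `ocr_file`, `is_healthy`, `wait_for_server`) are hypothetical, and since the README does not document the response format, `ocr_file` simply returns the raw response body as text.

```python
import time
import urllib.request
import uuid
from pathlib import Path
from typing import Callable, Tuple

BASE_URL = "http://127.0.0.1:19876"


def encode_multipart(field: str, filename: str, data: bytes) -> Tuple[bytes, str]:
    """Encode one file as multipart/form-data; return (body, content type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def ocr_file(path: Path, base_url: str = BASE_URL) -> str:
    """POST an image to /ocr and return the raw response body."""
    body, content_type = encode_multipart("file", path.name, path.read_bytes())
    request = urllib.request.Request(
        f"{base_url}/ocr",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")


def is_healthy(base_url: str = BASE_URL) -> bool:
    """Return True if GET /health answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=2) as response:
            return response.status == 200
    except OSError:  # covers refused connections and HTTP errors
        return False


def wait_for_server(
    probe: Callable[[], bool] = is_healthy,
    attempts: int = 20,
    delay: float = 0.5,
) -> bool:
    """Poll the probe until it succeeds or the attempts run out."""
    for _ in range(attempts):
        if probe():
            return True
        time.sleep(delay)
    return False
```

A typical call sequence would be `wait_for_server()` followed by `ocr_file(Path("screenshot.png"))`.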

Uninstall

launchctl bootout "gui/$(id -u)/com.ocr.server"
rm ~/Library/LaunchAgents/com.ocr.server.plist
