@mikeSGman mikeSGman commented Oct 15, 2025

Add PGS to SRT OCR Conversion Feature

Summary

This PR adds support for converting image-based PGS (Presentation Graphic Stream) subtitles to text-based SRT format using OCR (Optical Character Recognition). This feature enables users to extract Blu-ray subtitles as editable text files.

Motivation

PGS subtitles are image-based and cannot be edited or searched. Many users want to:

  • Edit subtitle timing or text content
  • Search subtitle content
  • Reduce file sizes (text SRT vs. image SUP)
  • Use subtitles with devices/players that don't support PGS

Features

User-Facing Changes

  1. Dropdown Menu for PGS Subtitles

    • PGS subtitle tracks now show a dropdown menu with two options:
      • "Extract as .sup (image - fast)" - Instant extraction of image-based subtitles
      • "Convert to .srt (OCR - 3-5 min)" - OCR conversion to text-based subtitles
  2. Settings Panel

    • New checkbox: "Enable PGS to SRT OCR conversion"
    • Dependency status display showing which OCR tools are installed
    • Link to installation instructions for missing dependencies
  3. Smart Dependency Detection

    • Auto-detects Tesseract OCR on all drives (C:, D:, E:, etc.)
    • Checks Windows registry for Tesseract installation
    • Auto-detects MKVToolNix (mkvmerge)
    • Auto-detects pgsrip Python package
    • Gracefully falls back to .sup extraction if dependencies are missing
  4. User-Friendly Error Messages

    • Clear instructions for installing missing dependencies
    • Platform-specific installation commands (Windows, Linux, macOS)
    • Links to download pages for manual installation

Technical Implementation

Files Modified

  1. fastflix/models/config.py

    • Added find_ocr_tool() function to locate Tesseract, mkvmerge, and pgsrip
    • Searches system PATH, environment variables, common install locations, and Windows registry
    • Added config fields: enable_pgs_ocr, tesseract_path, mkvmerge_path, pgsrip_path
  2. fastflix/widgets/background_tasks.py

    • Added _check_pgsrip_dependencies() method to verify all required tools
    • Added _convert_sup_to_srt() method to perform OCR conversion
    • Handles language code conversion (ISO 639-2/T 3-letter → ISO 639-1 2-letter)
    • Sets environment variables (PATH and TESSERACT_CMD) for pytesseract
    • Added use_ocr parameter to ExtractSubtitleSRT class
  3. fastflix/widgets/panels/subtitle_panel.py

    • Modified the Extract button for PGS tracks to show a dropdown menu
    • Conditionally enables OCR option based on settings and dependencies
    • Shows helpful tooltips when OCR option is disabled
  4. fastflix/widgets/settings.py

    • Added PGS OCR settings checkbox with tooltip
    • Added update_ocr_dependency_status() method to show dependency status
    • Displays checkmarks for installed dependencies
    • Shows link to wiki for installation instructions when dependencies are missing
  5. FastFlix_Windows_OneFile.spec

    • Added pgsrip and its dependencies to the PyInstaller build
    • Ensures OCR libraries are bundled in compiled Windows executable
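The language code handling listed for background_tasks.py could look roughly like the sketch below. It is not the PR's code: the helper name `to_two_letter` and the small mapping table are assumptions made for illustration, and a real implementation would more likely delegate to a library such as babelfish.

```python
# Hypothetical helper: a tiny, illustrative subset of ISO 639-2 -> ISO 639-1.
# Both the terminology (T) and bibliographic (B) 3-letter forms are listed.
ISO_639_2_TO_1 = {
    "eng": "en",
    "spa": "es",
    "fra": "fr",  # ISO 639-2/T
    "fre": "fr",  # ISO 639-2/B
    "deu": "de",  # ISO 639-2/T
    "ger": "de",  # ISO 639-2/B
    "jpn": "ja",
}


def to_two_letter(code: str) -> str:
    """Return the 2-letter code for a 2- or 3-letter input, or raise ValueError."""
    normalized = code.strip().lower()
    # Already a known 2-letter code: pass it through unchanged.
    if len(normalized) == 2 and normalized in ISO_639_2_TO_1.values():
        return normalized
    try:
        return ISO_639_2_TO_1[normalized]
    except KeyError:
        raise ValueError(f"Unknown language code: {code!r}") from None
```

Raising on unknown codes, rather than guessing, lets the caller decide whether to fall back to .sup extraction.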

Dependencies

  • Tesseract OCR - OCR engine for text recognition
  • MKVToolNix (mkvmerge) - Required by pgsrip for subtitle extraction
  • pgsrip - Python library for PGS subtitle OCR conversion
    • Automatically installs: pytesseract, opencv-python, numpy, pysrt, babelfish, cleanit

Key Design Decisions

  1. Two-Step Process: First extract .sup file using FFmpeg, then convert with pgsrip

    • Separates FFmpeg operations from OCR operations
    • Allows fallback to .sup extraction if OCR fails
    • Provides better error handling
  2. Language Code Conversion: Automatically converts ISO 639-2/T (eng) to ISO 639-1 (en)

    • pgsrip expects 2-letter language codes in filenames
    • Maintains compatibility with FastFlix's 3-letter language codes
  3. Environment Variable Management: Sets both TESSERACT_CMD and PATH

    • TESSERACT_CMD points to tesseract.exe
    • PATH includes Tesseract directory for subprocess calls
    • Fixes issue where pytesseract can't find Tesseract on Windows
  4. Automatic Cleanup: Deletes .sup file after successful .srt conversion

    • Keeps only the text-based .srt file
    • Reduces clutter in the output directory
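A minimal sketch of the two-step process with fallback and cleanup, under stated assumptions: the function names are hypothetical, the FFmpeg arguments are one common way to copy a subtitle stream and may differ from what FastFlix actually builds, and the OCR step is passed in as a callable so the sketch does not depend on pgsrip's API.

```python
from pathlib import Path
from typing import Callable, List


def build_sup_extract_command(video: Path, subtitle_index: int, sup_out: Path) -> List[str]:
    """Step 1: build an FFmpeg command that copies one subtitle stream to .sup."""
    return [
        "ffmpeg", "-y",
        "-i", str(video),
        "-map", f"0:s:{subtitle_index}",  # assumption: index counts subtitle streams only
        "-c", "copy",
        str(sup_out),
    ]


def convert_with_fallback(sup_file: Path, ocr: Callable[[Path], Path]) -> Path:
    """Step 2: run OCR on the extracted .sup and keep the .sup if OCR fails."""
    try:
        srt_file = ocr(sup_file)
    except Exception:
        # OCR failed: fall back to the image-based subtitle that already exists.
        return sup_file
    # OCR succeeded: delete the intermediate .sup (the automatic cleanup step).
    sup_file.unlink(missing_ok=True)
    return srt_file
```

Because the .sup file exists before OCR starts, a failure in the second step never leaves the user without output.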

Testing

Tested Scenarios

  • ✅ Windows with Tesseract on D: drive (non-standard location)
  • ✅ PGS subtitle extraction and OCR conversion
  • ✅ Language code conversion (eng → en)
  • ✅ Dependency detection and status display
  • ✅ Settings UI enable/disable functionality
  • ✅ Dropdown menu for PGS tracks
  • ✅ Fallback to .sup when dependencies are missing
  • ✅ Error handling and user-friendly messages

Test Results

  • Successfully converted Blade (1998) Blu-ray PGS subtitles to SRT
  • Conversion time: ~46 seconds for a feature film
  • Output: Clean .srt file with proper timing

Testing Checklist

  • Test on Windows with Tesseract in the default location (C:\Program Files)
  • Test on Linux with apt-installed dependencies
  • Test on macOS with Homebrew-installed dependencies
  • Test with Spanish, French, and other language subtitles
  • Test with missing dependencies (verify error messages)
  • Test dropdown menu behavior with OCR enabled/disabled
  • Test PyInstaller build includes pgsrip dependencies
  • Verify settings persist after restart

Installation Instructions for Users

Windows

# Install Tesseract OCR
# Download installer from: https://github.com/UB-Mannheim/tesseract/wiki

# Install MKVToolNix
# Download installer from: https://mkvtoolnix.download/downloads.html

# Install pgsrip (from FastFlix virtual environment)
pip install pgsrip

Linux

sudo apt install tesseract-ocr mkvtoolnix
pip install pgsrip

macOS

brew install tesseract mkvtoolnix
pip install pgsrip

Breaking Changes

None. This is a purely additive feature that's disabled by default.

Migration Guide

No migration needed. Existing users will see the new option after updating and installing dependencies.

Future Enhancements

Potential improvements for future PRs:

  • Support for more OCR languages via Tesseract language packs
  • Batch conversion of multiple PGS tracks
  • OCR accuracy tuning options
  • Progress bar for long OCR operations
  • Integration with online OCR services as a fallback

Related Issues

Closes #[issue-number] (if applicable)

Checklist

  • Code follows project style guidelines
  • Comments added for complex logic
  • No debug/console.log statements
  • User-facing strings use translation function t()
  • Error handling for all external tool calls
  • Graceful degradation when dependencies are missing
  • Platform-specific code tested on Windows
  • Config fields have sensible defaults
  • Feature is opt-in (disabled by default)

  - Add dropdown menu for PGS subtitle tracks with OCR option
  - Auto-detect Tesseract OCR on all drives and Windows registry
  - Add settings panel with dependency status display
  - Support for converting image-based PGS to editable SRT
  - Handles language code conversion and environment setup
  - Includes comprehensive error handling and user guidance
@cdgriffith cdgriffith changed the base branch from master to develop October 18, 2025 18:50
Owner

@cdgriffith cdgriffith left a comment

Thank you so much for this wonderful addition!

I have suggested a few tweaks. If you are up for doing them, let me know; otherwise I can merge it and work on it myself, since you have set up a great feature that I would love to add!

If you do add more, please also run the pre-commit checks so it passes linting:

pre-commit install
pre-commit run --all-files

@mikeSGman
Author

Ah, that's how you do it - thank you. Will do for future commits, and go back to see if I can do it on this PR also.

@mikeSGman mikeSGman force-pushed the feature/pgs-to-srt-ocr branch from c067e1f to cc88b50 Compare October 19, 2025 19:13
@mikeSGman
Author

mikeSGman commented Oct 19, 2025

Hi @cdgriffith - My earlier commit missed the BabelLanguage patch for dual 2- and 3-letter ISO code handling. This push includes that fix; verified locally, and all pre-commit hooks are passing.

Based on our threaded convo, I think I've addressed all issues:

    - Use environment variables for Windows tool detection instead of
      scanning all drives (LOCALAPPDATA, PROGRAMFILES, PROGRAMFILES(X86))
    - Remove pgsrip_path config field and use pgsrip Python API directly
    - Update dependency checks to use importlib for pgsrip library
    - Fix BabelLanguage to handle both 2-letter and 3-letter ISO codes
    - Update error messages and installation instructions
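For illustration, the environment-variable lookup and the importlib check described in this list might be sketched as follows. The function names and the install sub-folders (`Tesseract-OCR`, `Programs/Tesseract-OCR`) are assumptions, not taken from the PR; only the three environment variable names and the use of importlib come from the list above.

```python
import importlib.util
from pathlib import Path
from typing import Iterable, List, Mapping, Optional

# Assumed install layouts relative to each environment variable.
_TESSERACT_LAYOUTS = (
    ("LOCALAPPDATA", ("Programs", "Tesseract-OCR")),
    ("PROGRAMFILES", ("Tesseract-OCR",)),
    ("PROGRAMFILES(X86)", ("Tesseract-OCR",)),
)


def tesseract_candidates(env: Mapping[str, str]) -> List[Path]:
    """Build candidate tesseract.exe paths from environment variables only."""
    candidates = []
    for var, parts in _TESSERACT_LAYOUTS:
        base = env.get(var)
        if base:  # skip variables that are unset or empty
            candidates.append(Path(base).joinpath(*parts, "tesseract.exe"))
    return candidates


def first_existing(paths: Iterable[Path]) -> Optional[Path]:
    """Return the first candidate that is an existing file, else None."""
    return next((p for p in paths if p.is_file()), None)


def python_package_available(name: str) -> bool:
    """Check that a top-level package is importable without importing it."""
    return importlib.util.find_spec(name) is not None
```

In practice this would be called as `first_existing(tesseract_candidates(os.environ))` and `python_package_available("pgsrip")`. Taking the environment as a parameter keeps the lookup testable on any platform.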

All changes pass pre-commit linting checks.
@mikeSGman mikeSGman force-pushed the feature/pgs-to-srt-ocr branch from cc88b50 to 1c6c486 Compare October 19, 2025 19:37
The glob pattern was failing when filenames contained brackets like
[imdbid-tt0187738] because glob interprets [] as character classes.

Changed to detect newly created .srt files by comparing before/after
directory listings instead of using filename-based glob patterns.

Fixes false error for files like "Blade II (2002) [imdbid-tt0187738].mkv"
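The before/after comparison described in this commit message can be sketched as below. The helper names are hypothetical, and the conversion step is passed in as a callable rather than calling pgsrip directly.

```python
from pathlib import Path
from typing import Callable, Set


def srt_files(directory: Path) -> Set[Path]:
    """List the .srt files currently in a directory (no glob patterns involved)."""
    return {p for p in directory.iterdir() if p.suffix.lower() == ".srt"}


def run_and_find_new_srt(directory: Path, convert: Callable[[], None]) -> Set[Path]:
    """Run the conversion and return only the .srt files it created."""
    before = srt_files(directory)
    convert()
    # Set difference: anything present now that was not there before.
    return srt_files(directory) - before
```

Because no pattern is built from the video filename, brackets and parentheses in the name cannot be misread. Escaping the name with `glob.escape()` would be another way to keep a glob-based approach.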

Include package metadata for pgsrip, pytesseract, and babelfish
in the Windows builds to fix 'No package metadata was found' error
when running OCR conversion from the compiled executable.

Add collect_data_files('babelfish') to bundle ISO language code
data files needed by babelfish at runtime.

Add copy_metadata('cleanit') for pgsrip dependency.

Add collect_data_files('cleanit') to bundle YAML config files
needed by cleanit at runtime.

Add copy_metadata('trakit') for pgsrip dependency.
@mikeSGman
Author

@cdgriffith - Ok, I think we're finally there. The working screenshots I showed in #701 (comment) were based on running via Python in a Windows command prompt. After that, I realized we needed a bunch more work and tweaks to get it working OOB via the compiled binary. So, commits f5ddccc through 4f8e347 are just that. It works beautifully now. I think it's finally ready for your ACK/NACK. Sorry for all the noise, it's been a long time since I've done a PR - but I hope this brings some extra functionality and usefulness for someone out there. I, for one, use SRT subtitles alongside every MKV I stream via Jellyfin. Without them, it's a transcode every time I start a movie just to render the Blu-ray subtitles natively. HTH. YMMV.

Include pgsrip, pytesseract, babelfish, cleanit, trakit, opencv-python,
and pysrt in project dependencies to fix Windows build error where
PyInstaller's copy_metadata() could not find package metadata for
packages that weren't installed during the build process.
@mikeSGman
Author

mikeSGman commented Oct 30, 2025

Summary of Changes in aacb011

Fixed Windows build error where PyInstaller couldn't find package metadata for OCR dependencies.

Changes:

  1. Added OCR dependencies to pyproject.toml:
    • pgsrip>=0.1.0
    • pytesseract>=0.3.0
    • babelfish>=0.6.0
    • cleanit>=0.4.0
    • trakit>=0.2.0
    • opencv-python>=4.8.0
    • pysrt>=1.1.0
  2. Updated uv.lock with the new dependencies
  3. Added WINDOWS_BUILD.md documentation for building on Windows

Why this fixes the build:

The spec files use copy_metadata() for these OCR packages, but they weren't being installed during the Windows build because they weren't in pyproject.toml. The CI runs uv sync --frozen, which only installs declared dependencies. Now these packages will be installed and their metadata will be available to PyInstaller.

Include all babelfish.converters submodules (alpha2, alpha3b, alpha3t,
name, opensubtitles) in PyInstaller hidden imports to fix
'No module named babelfish.converters.alpha2' error during OCR conversion.

Add mkvtoolnix directory to PATH environment variable so pgsrip can
find mkvextract executable when performing OCR conversion. This fixes
the 'mkvextract command not found' error.

Change working directory to video folder and use relative filename
when calling pgsrip to avoid issues with special characters (parentheses,
brackets) in Windows paths that may cause mkvextract to fail.
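A minimal sketch of the working-directory change, assuming hypothetical helper names and a conversion step passed in as a callable that receives only the bare filename:

```python
import os
from contextlib import contextmanager
from pathlib import Path
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


@contextmanager
def working_directory(path: Path) -> Iterator[None]:
    """Temporarily change the process working directory, then restore it."""
    previous = Path.cwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Restore even if the conversion raises.
        os.chdir(previous)


def run_in_video_folder(video: Path, convert: Callable[[str], T]) -> T:
    """Call convert() with the relative filename while inside the video's folder."""
    with working_directory(video.parent):
        return convert(video.name)
```

Note that `os.chdir` affects the whole process, so this pattern is only safe if nothing else relies on the working directory while the conversion runs.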
@@ -1,5 +1,5 @@
 # -*- mode: python ; coding: utf-8 -*-
-from PyInstaller.utils.hooks import collect_submodules
+from PyInstaller.utils.hooks import collect_submodules, copy_metadata, collect_data_files
Owner

I did not know about those functions, handy!

Author

Thanks! Just testing some final changes. I had to deal with detection for Subtitle Edit's Tesseract installations. It works locally; testing a build now.

Check AppData/Roaming/Subtitle Edit for Tesseract installations,
parse version numbers from directory names (e.g., Tesseract550),
and automatically select the newest version. This ensures modern
Tesseract versions are detected even when multiple versions exist.
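The version selection described here might be sketched as follows. The function name is hypothetical, and the ordering rule is an assumption: the digits after "Tesseract" are compared one by one, so Tesseract550 ranks above Tesseract411, but a two-digit component (for example 5.10) would not be handled correctly.

```python
import re
from pathlib import Path
from typing import Optional, Tuple

# Matches directory names such as "Tesseract550" and captures the digits.
_VERSION_DIR = re.compile(r"^Tesseract(\d+)$", re.IGNORECASE)


def newest_tesseract_dir(base: Path) -> Optional[Path]:
    """Return the sub-directory of base with the highest parsed version."""
    if not base.is_dir():
        return None
    best: Optional[Tuple[Tuple[int, ...], Path]] = None
    for child in base.iterdir():
        match = _VERSION_DIR.match(child.name)
        if not match or not child.is_dir():
            continue  # ignore unrelated folders and files
        # "550" -> (5, 5, 0), compared element by element.
        version = tuple(int(digit) for digit in match.group(1))
        if best is None or version > best[0]:
            best = (version, child)
    return best[1] if best else None
```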

Initialize PATH environment variables for tesseract and mkvextract at
application startup before any subprocesses are spawned. This ensures
frozen PyInstaller executables can properly pass environment to
subprocesses spawned by pgsrip library.
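A sketch of this kind of startup environment setup, with a hypothetical function name and signature. It prepends the tool directories to PATH and sets TESSERACT_CMD, the variable named in the PR description. Which library actually reads TESSERACT_CMD is not verified here.

```python
import os
from typing import Iterable, MutableMapping, Optional


def prepend_tool_dirs(
    tool_dirs: Iterable[str],
    tesseract_exe: Optional[str] = None,
    env: Optional[MutableMapping[str, str]] = None,
) -> None:
    """Put tool directories at the front of PATH without creating duplicates."""
    target = os.environ if env is None else env
    current = [entry for entry in target.get("PATH", "").split(os.pathsep) if entry]
    new_entries = []
    for directory in tool_dirs:
        directory = str(directory)
        if directory not in current and directory not in new_entries:
            new_entries.append(directory)
    # New entries go first so they win over other installs found later on PATH.
    target["PATH"] = os.pathsep.join(new_entries + current)
    if tesseract_exe is not None:
        target["TESSERACT_CMD"] = str(tesseract_exe)
```

Child processes inherit `os.environ` as it is when they are spawned, which is why this has to run before any subprocess is started.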

Set TEMP and TMP environment variables to standard temp directory
to ensure pgsrip can create temporary folders correctly when running
from frozen PyInstaller executable.

Override pgsrip's temp folder creation to work correctly in frozen
PyInstaller executables. pgsrip's MediaPath.create_temp_folder()
doesn't work properly when frozen, so we create our own temp folder
if the one provided doesn't exist.

Ensure the monkey-patch is applied before importing Mkv class to
prevent pgsrip from capturing the original read_data method in its
lambda closures. This should fix PyInstaller temp folder issue.

Move the pgsrip monkey-patch to setup_ocr_environment() which runs
at application startup, before any pgsrip imports. This ensures the
patch is applied before pgsrip's lambda closures are created, fixing
temp folder creation in PyInstaller frozen executables.

Move patch_pgsrip_for_pyinstaller() to run AFTER environment variables
are set up, in case pgsrip import requires the environment to be
configured first.

Simplify code back to working state from source. PyInstaller exe issue
is a known pgsrip bug that needs to be fixed upstream. Feature works
perfectly when running from source.

Add documentation explaining that PGS to SRT OCR conversion works from
source but fails in PyInstaller builds due to pgsrip temp folder bug.
Include workaround instructions and requirements.

Implement OCR conversion for PGS (Presentation Graphic Stream) subtitles
to SRT format using pgsrip library with auto-detection of required tools.
Features:
- Auto-detect Tesseract OCR from PATH or Subtitle Edit installations
- Auto-detect MKVToolNix (mkvextract/mkvmerge) from standard locations
- Support for multiple language codes (2-letter, 3-letter, names)
- Automatic cleanup of temporary .sup files after conversion
- Works when running FastFlix from source

Known limitation:
Due to an upstream issue in pgsrip v0.1.12, this feature does not work
in PyInstaller-built executables. Users needing PGS OCR should run
FastFlix from source with: python -m fastflix

Dependencies added:
- pgsrip (OCR engine for PGS subtitles)
- pytesseract (Tesseract OCR Python wrapper)
- babelfish (language code handling)
- cleanit, trakit (metadata handling)
- opencv-python, pysrt (image/subtitle processing)
@mikeSGman mikeSGman force-pushed the feature/pgs-to-srt-ocr branch from a66fdfb to 2f89be5 Compare October 31, 2025 03:26
@mikeSGman
Author

I give up. I simply can't figure out how to get it to work in a compiled binary (it compiles cleanly, but the SRT extraction fails). It works perfectly from source, though, so that's good enough for my use case.

@cdgriffith
Owner

Hey @mikeSGman, can you re-open this? I'd like to merge it to dev and play around with it to see if I can get the build working for ya! This is a great feature, and I would love to have it as part of the standard build.

@mikeSGman
Author

I tried to, but it says:
[screenshot]

@mikeSGman
Author

It might be because I squashed my commits in my source branch, but I still have the code; if it's useful, I can give it to you.

@mikeSGman
Author

GitHub won’t let me reopen this PR because the source branch history was rewritten. I opened a new PR with the same changes here: #709.
