Add Deskew Mode Using Top Page Edge as Reference for Book Scanning #210

@hbenitezs

Feature Request: Deskew Mode Based on Top Page Edge

Hello ScanTailor Advanced team,

I am part of a team processing scanned books for a university library. We use ScanTailor Advanced for this work, but we have a problem with its deskew step. The current deskew algorithm relies on internal page content (e.g., text lines or other features) to determine the rotation angle, which often gives inconsistent results on pages with irregular content, such as illustrations, sparse text, or damage.

We would like to request a new deskew mode that uses only the top edge of the page as the reference for determining the rotation angle, ignoring internal content. This would ensure that the page is aligned based on its physical border, which is critical for our book scanning workflow, as our scans typically have a clear top edge with a dark background.

Proposed Functionality:

  • Detect the top edge of the page (e.g., by identifying the transition from the dark background to the lighter page content using edge detection or binarization).
  • Fit a line to the detected top edge (e.g., using linear regression) to calculate the skew angle.
  • Rotate the page to make the top edge horizontal, ignoring any internal content like text or lines.
  • Add an option in the Deskew settings (e.g., a checkbox or dropdown) to select between the current content-based deskew and the new top-edge-based deskew.
  • If no clear top edge is detected, fall back to the current algorithm or allow manual adjustment.
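
To make the proposal concrete, here is a minimal sketch of the detection and line-fitting steps in plain Python. It is an illustration only, not ScanTailor Advanced code: the function name `estimate_top_edge_skew` and the parameters `threshold`, `min_fraction`, and `max_rms` are hypothetical, and it assumes a grayscale image given as a list of rows with a dark background above a lighter page. It returns `None` when no clear edge is found, which is where a fallback to the current algorithm or manual adjustment would apply.

```python
import math


def estimate_top_edge_skew(gray, threshold=128, min_fraction=0.5, max_rms=2.0):
    """Return the angle of the top page edge in degrees, or None.

    gray: list of rows, each a list of 0-255 grayscale values (assumed layout).
    threshold, min_fraction, max_rms: hypothetical tuning parameters.
    """
    height = len(gray)
    width = len(gray[0]) if height else 0

    # Step 1: in each column, find the first dark-to-light transition.
    points = []
    for x in range(width):
        if gray[0][x] >= threshold:
            continue  # no dark background visible above the page here
        for y in range(1, height):
            if gray[y][x] >= threshold:
                points.append((x, y))
                break

    # Too few columns with a visible edge: report "no clear top edge".
    if width == 0 or len(points) < max(2, min_fraction * width):
        return None

    # Step 2: least-squares fit of y = slope * x + intercept.
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Reject ragged edges (e.g., torn pages) by their RMS distance to the line.
    rms = math.sqrt(
        sum((y - (slope * x + intercept)) ** 2 for x, y in points) / n
    )
    if rms > max_rms:
        return None

    # Step 3: the caller rotates the page by the opposite of this angle.
    return math.degrees(math.atan(slope))
```

The angle is measured in image coordinates (y grows downward), so the sign needed to level the edge depends on the convention of the rotation routine. A real implementation would likely also need to ignore the corners of the page and tolerate noise in the background.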

Use Case:
This feature would be particularly useful for book scanning projects where pages have consistent physical borders but varied content (e.g., books with illustrations, mixed text/image pages, or older books with irregular text alignment). For our university library contract, aligning pages by their top edge would greatly improve consistency and quality.

Kind regards
