FITS runs Tika, which runs Tesseract, which is very slow #25

@escowles

Related to #18: we found that 75% or more of the time to run FITS on our 100 MB TIFF files was spent in Tesseract, which Tika invokes. We disabled Tika by commenting out the TikaTool line in the /path/to/fits/xml/fits.xml configuration file, and saw dramatically faster FITS execution times (20 seconds per file instead of 90+).

We updated our Ansible playbook to comment out the Tika line when we install FITS: ucsdlib/ansible-role-fits#2
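
For anyone who wants to make the same change outside of Ansible, here is a minimal Python sketch of the edit described above. It is not taken from our playbook. It assumes that the Tika entry in fits.xml is a single-line, self-closing `<tool ... />` element whose text contains `TikaTool`, so check your own file first. The function name `comment_out_tika` is made up for illustration.

```python
def comment_out_tika(config_text: str) -> str:
    """Wrap each single-line <tool ... /> element that mentions TikaTool
    in an XML comment, leaving every other line untouched.

    Assumption: the Tika entry sits on one line and is self-closing.
    Lines that are already commented out do not start with "<tool",
    so running the function twice does not double-comment anything.
    """
    out = []
    for line in config_text.splitlines(keepends=True):
        stripped = line.strip()
        is_tika_tool = (
            "TikaTool" in stripped
            and stripped.startswith("<tool")
            and stripped.endswith("/>")
        )
        if is_tika_tool:
            # Preserve the original indentation and trailing newline.
            indent = line[: len(line) - len(line.lstrip())]
            newline = "\n" if line.endswith("\n") else ""
            out.append(f"{indent}<!-- {stripped} -->{newline}")
        else:
            out.append(line)
    return "".join(out)
```

To apply it, read the configuration file (for us, /path/to/fits/xml/fits.xml), pass its text through `comment_out_tika`, and write the result back, for example with `pathlib.Path.read_text` and `Path.write_text`. Keep a backup copy of the original file so Tika can be re-enabled later.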
