@mittagessen mittagessen commented Nov 3, 2025

The current API is tied closely to particular method implementations, which keeps us from easily integrating methods that are not one-to-one replacements with semantics similar to the existing ones. This pull request is a major rework of kraken that addresses a number of long-standing issues:

  • support for safetensors-based serialization to allow arbitrary network architectures
  • using ensembles of models for a particular task
  • using entry points to register models, weight loaders and writers, and tasks
  • modular implementations for individual tasks (segmentation, recognition, forced alignment)
  • configuration/parametrization with proper classes instead of unstructured dictionaries
  • distinction between checkpoints (in pytorch pickles) and weights files (in coreml or safetensors)
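
To illustrate the point about configuration classes versus unstructured dictionaries, here is a minimal sketch. `InferenceConfigSketch` is a hypothetical stand-in written for this example, not kraken's actual config class. Only the field names `batch_size` and `num_line_workers` are borrowed from the recognition example in this description, and the defaults and validation rules are assumptions.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class InferenceConfigSketch:
    """Hypothetical config class; not part of kraken."""
    batch_size: int = 1          # assumed default
    num_line_workers: int = 0    # assumed default

    def __post_init__(self):
        # Validate once at construction time instead of deep inside inference.
        if self.batch_size < 1:
            raise ValueError('batch_size must be >= 1')
        if self.num_line_workers < 0:
            raise ValueError('num_line_workers must be >= 0')


# A misspelled key in a plain dictionary is silently accepted ...
loose = {'batch_sise': 8}

# ... whereas a misspelled field on the class fails immediately.
try:
    InferenceConfigSketch(batch_sise=8)
except TypeError as exc:
    typo_error = str(exc)

cfg = InferenceConfigSketch(batch_size=8, num_line_workers=4)
```

A class like this also gives editors and type checkers something to work with, and `asdict(cfg)` still yields a plain dictionary when one is needed for serialization.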

The new API also removes certain rarely used functionality, such as tag-based recognition, that has blocked the implementation of performance improvements.

A segmentation workflow using the new API:

from PIL import Image
from kraken.tasks import SegmentationTaskModel
from kraken.configs import SegmentationInferenceConfig

segmenter = SegmentationTaskModel.load_model('/path/to/segmentation/models.safetensors')
im = Image.open('sample.jpg')
seg = segmenter.predict(im=im, config=SegmentationInferenceConfig())

and recognition:

from kraken.tasks import RecognitionTaskModel
from kraken.configs import RecognitionInferenceConfig

recognizer = RecognitionTaskModel.load_model('/path/to/recognition/model.safetensors')
for record in recognizer.predict(im=im, segmentation=seg, config=RecognitionInferenceConfig(batch_size=8, num_line_workers=4)):
    print(record)

The new recognition API boosts inference speed on CPU by roughly 80% through parallelization and allows for efficient use of GPU resources.
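
The description does not spell out how the parallelization works. As a rough sketch of one common pattern that parameters like `batch_size` and `num_line_workers` suggest, the example below prepares lines in a worker pool and then feeds them to a batched step. This is an assumption for illustration, not kraken's implementation: `preprocess_line`, `recognize_batch`, and `run` are hypothetical stand-ins that operate on strings rather than images.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def batched(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while chunk := list(islice(it, size)):
        yield chunk


def preprocess_line(line):
    # Hypothetical stand-in for per-line work such as cropping and scaling.
    return line.strip().upper()


def recognize_batch(batch):
    # Hypothetical stand-in for a single batched forward pass.
    return [f'<{item}>' for item in batch]


def run(lines, batch_size=8, num_line_workers=4):
    results = []
    with ThreadPoolExecutor(max_workers=num_line_workers) as pool:
        # Executor.map preserves input order, so results line up with inputs.
        prepared = pool.map(preprocess_line, lines)
        for batch in batched(prepared, batch_size):
            results.extend(recognize_batch(batch))
    return results
```

Calling `run(lines, batch_size=8, num_line_workers=4)` returns one result per input line in the original order. In a real pipeline the worker pool would overlap per-line preprocessing with the batched model call.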

and make greedy_decoder batch-capable/compatible with tensors