support ALTO 4.3 #30

@bertsky

New features:

  1. Add BASEDIRECTION attribute defining base direction and line orientation to TextLine and BlockType.
  2. Add support for explicit reading order definitions with "ReadingOrder" element containing "UnorderedGroup"s, "OrderedGroup"s, and "ElementRef"s.

Regarding @BASEDIRECTION the docs state:

Describes the inline base direction and line orientation of a line or of all lines inside a text block.
The meaning of these terms is defined by the W3C writing modes document.
These values should correspond to the base direction set in the BiDi algorithm to the respective elements during Unicode encoding. A value of "ttb" (top-to-bottom) implies a base direction of left-to-right, a value of "btt" (bottom-to-top) a base direction of right-to-left.

  • ltr
  • rtl
  • ttb
  • btt
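
To make the quoted rule concrete, here is a minimal sketch of how a consumer might derive the BiDi base direction from a @BASEDIRECTION value. The function and table names are hypothetical, not part of ALTO or of any library; the mapping itself only restates the documentation quoted above ("ttb" implies left-to-right, "btt" implies right-to-left).

```python
# Hypothetical helper: maps an ALTO @BASEDIRECTION value to the BiDi base
# direction it implies, following the schema documentation quoted above.
IMPLIED_BASE_DIRECTION = {
    "ltr": "ltr",
    "rtl": "rtl",
    "ttb": "ltr",  # top-to-bottom implies a left-to-right base direction
    "btt": "rtl",  # bottom-to-top implies a right-to-left base direction
}


def implied_base_direction(basedirection):
    """Return "ltr" or "rtl" for one of the four allowed attribute values."""
    try:
        return IMPLIED_BASE_DIRECTION[basedirection]
    except KeyError:
        raise ValueError(f"not a valid BASEDIRECTION value: {basedirection!r}") from None
```

Note that this only covers the base direction; it says nothing about mixed-direction (bidirectional) content, which is exactly the open question below.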

It sounds a lot like @readingDirection in PAGE, but there is no mention of bidirectionality here. @chris1010010, can you help?

As to ReadingOrder, that has been directly adopted from PAGE, with subtle differences though:

  • in ALTO, ReadingOrder can have any number of groups (with alternative semantics); in PAGE, it must have exactly one
  • the syntactic clutter is minimized, i.e. no Indexed variants and no explicit @index. OrderedGroup simply has sequence semantics and UnorderedGroup set semantics, otherwise they are the same and appear in the same places.
  • @REF is explicitly allowed to point to sub-region elements, which PAGE permits syntactically but forbids in its documentation (and I suppose PRImA's libraries won't tolerate such use)
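
As a rough illustration of the sequence vs. set semantics described above, the following sketch reads a ReadingOrder fragment into nested Python values: a tuple for each OrderedGroup, a frozenset for each UnorderedGroup. The sample XML is an assumption for illustration only: it omits the ALTO namespace and any other attributes, the IDs are made up, and it has not been validated against the 4.3 schema. The function names are hypothetical as well.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment only (no namespace, made-up IDs, not schema-validated).
SAMPLE = """
<ReadingOrder>
  <OrderedGroup>
    <ElementRef REF="block_1"/>
    <UnorderedGroup>
      <ElementRef REF="block_2"/>
      <ElementRef REF="block_3"/>
    </UnorderedGroup>
    <ElementRef REF="line_4"/>
  </OrderedGroup>
</ReadingOrder>
"""


def read_group(node):
    """Convert one ElementRef or group element into a Python value."""
    if node.tag == "ElementRef":
        return node.get("REF")
    members = [read_group(child) for child in node]
    if node.tag == "OrderedGroup":
        return tuple(members)      # sequence semantics: document order matters
    if node.tag == "UnorderedGroup":
        return frozenset(members)  # set semantics: order carries no meaning
    raise ValueError(f"unexpected element: {node.tag}")


def read_reading_order(xml_text):
    """Return one entry per top-level group (ALTO allows any number)."""
    root = ET.fromstring(xml_text)
    return [read_group(group) for group in root]
```

Because no @index is involved, the position of a child inside an OrderedGroup is the only ordering information, and the last reference in the sample points at a line rather than a block, which is the sub-region case mentioned in the final bullet.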
