Regression: Can no longer decode Paradise_Lost.pdf #26405

@nico

This file from Google Books: Paradise_Lost.pdf

We used to be able to render it. Now we instead error out with:

warning: Internal error while processing PDF file: JBIG2ImageDecoderPlugin: Segment refers to dead segment

The diagnostic is right, and I think the file is wrong. Segment 15 refers to segment 14, but segment 14 doesn't have its "Retained?" bit set:

Segment number: 14
Segment type: 0
Page association size is 32 bits: false
Page retained only by itself and extension segments: false
Retained: false
Referred-to segment count: 0
Segment page association: 1
Segment data length: 53921

Segment number: 15
Segment type: 6
Page association size is 32 bits: false
Page retained only by itself and extension segments: false
Retained: false
Referred-to segment count: 2
Referred-to segment number: 0, retained true
Referred-to segment number: 14, retained false
Segment page association: 1
Segment data length: 3946

But we should display it anyway, of course.

I think the things to do are:

  1. Add an enum to JBIG2Loader that controls how strict we are. Default it to less strict, and request the stricter mode in jbig2-from-json.
  2. Check whether jbig2enc (which Google Books uses) still gets this wrong, and if so, file an upstream issue about it.
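
A rough sketch of what item 1 could look like, to make the idea concrete. Everything here is hypothetical: `Strictness`, `SegmentHeader`, and `check_references` are illustration-only names, not the existing JBIG2Loader API, and the liveness model is simplified to each segment's own "Retained?" bit (the real decoder also has to account for the per-reference retain bits).

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical enum; the real one would live on JBIG2Loader.
enum class Strictness {
    Permissive, // Tolerate references to segments whose retain bit was unset.
    Strict,     // Reject such references (what jbig2-from-json would request).
};

// Simplified stand-in for a parsed segment header (assumption, not the real type).
struct SegmentHeader {
    uint32_t number;
    bool retained; // This segment's own "Retained?" bit.
    std::vector<uint32_t> referred_to;
};

// Returns an error message if decoding should stop, or std::nullopt otherwise.
std::optional<std::string> check_references(std::vector<SegmentHeader> const& segments, Strictness strictness)
{
    // Maps each segment number seen so far to its retain bit.
    std::map<uint32_t, bool> retained_by_number;
    for (auto const& segment : segments) {
        for (auto referred : segment.referred_to) {
            auto it = retained_by_number.find(referred);
            if (it == retained_by_number.end())
                return std::string("Segment refers to unknown segment");
            // Only strict mode treats a reference to an unretained segment as fatal.
            if (!it->second && strictness == Strictness::Strict)
                return std::string("Segment refers to dead segment");
        }
        retained_by_number[segment.number] = segment.retained;
    }
    return std::nullopt;
}
```

With the three relevant segments from this file (0 retained, 14 not retained, 15 referring to both), `Strictness::Permissive` would accept the file, while `Strictness::Strict` would still report "Segment refers to dead segment".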
