Granite Docling extract images with higher resolution #2416

@paoloaveri

Bug

Using Granite Docling, I can't extract images at a good resolution.

Steps to reproduce

I wrote a simple script that runs on a PDF containing only page 5 of https://arxiv.org/pdf/2501.17887:

from pathlib import Path

from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import VlmPipelineOptions
from docling.datamodel.vlm_model_specs import GRANITEDOCLING_TRANSFORMERS as granite_docling_vlm_conversion_options
from docling.document_converter import DocumentConverter, PdfFormatOption
from docling.pipeline.vlm_pipeline import VlmPipeline
from docling_core.types.doc import ImageRefMode

output_dir = Path("output")
output_dir.mkdir(exist_ok=True)

docling_paper = "arvix.pdf"  # Only page 5 of https://arxiv.org/pdf/2501.17887

vlm_options = granite_docling_vlm_conversion_options
# 1.0 gives low-resolution pictures; 2.0 gives pictures cropped from the wrong place
IMAGE_RESOLUTION_SCALE = 1.0

# Run the VLM pipeline and keep an image for every detected picture
vlm_pipeline_options = VlmPipelineOptions(
    force_backend_text=False,
    vlm_options=vlm_options,
    images_scale=IMAGE_RESOLUTION_SCALE,
    generate_picture_images=True,
)

converter_vlm = DocumentConverter(
    format_options={
        InputFormat.PDF: PdfFormatOption(
            pipeline_options=vlm_pipeline_options,
            pipeline_cls=VlmPipeline,
        )
    }
)

# Convert the PDF and export HTML with pictures saved as separate, referenced files
vlm_result = converter_vlm.convert(docling_paper)
vlm_doc = vlm_result.document
html_filename = output_dir / f"{docling_paper}.html"
vlm_doc.save_as_html(html_filename, image_mode=ImageRefMode.REFERENCED)

With this, the images are low resolution (on top of being badly framed):

Image
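
To put a number on "low resolution", the pixel dimensions of each extracted picture can be read straight from the exported files. The following is a minimal sketch using only the standard library. It assumes the referenced pictures end up as PNG files somewhere under the output directory, which I have not verified for every Docling version; the helper names png_size and report_png_sizes are hypothetical and not part of Docling.

```python
import struct
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(path):
    """Return (width, height) read from the IHDR chunk of a PNG file."""
    with open(path, "rb") as f:
        header = f.read(24)
    # A PNG starts with an 8-byte signature, then a 4-byte chunk length,
    # the 4-byte chunk type "IHDR", and big-endian width and height.
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError(f"{path} is not a valid PNG file")
    width, height = struct.unpack(">II", header[16:24])
    return width, height


def report_png_sizes(root):
    """Map every PNG found under root (recursively) to its pixel dimensions."""
    return {str(p): png_size(p) for p in sorted(Path(root).rglob("*.png"))}


if __name__ == "__main__":
    # "output" is the directory used by the script above (assumed location).
    for name, (w, h) in report_png_sizes("output").items():
        print(f"{name}: {w}x{h} px")
```

Running this once with IMAGE_RESOLUTION_SCALE = 1.0 and once with 2.0 would show whether the exported pictures actually change size between the two settings.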

When pushing IMAGE_RESOLUTION_SCALE to 2.0, a different bug appears: the images seem to be cropped from somewhere else in the document:

Image
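
One possible explanation, offered purely as an unconfirmed assumption, is a coordinate mismatch: the picture bounding box is expressed in one coordinate space while the page image it is cropped from was rendered at a different scale. The sketch below does not use any Docling API, and the function name scale_bbox and the sample box are hypothetical. It only illustrates that a box given in page units has to be multiplied by the same scale used to render the page image before cropping.

```python
def scale_bbox(bbox, scale):
    """Convert a (left, top, right, bottom) box from page units to pixels.

    `scale` is the factor used to render the page image, so a page rendered
    at 2.0 needs every coordinate doubled before cropping.
    """
    left, top, right, bottom = bbox
    return (
        round(left * scale),
        round(top * scale),
        round(right * scale),
        round(bottom * scale),
    )


# Hypothetical picture box in page units (illustrative values only).
bbox_points = (100.0, 200.0, 300.0, 350.0)

crop_at_1x = scale_bbox(bbox_points, 1.0)
crop_at_2x = scale_bbox(bbox_points, 2.0)

print(crop_at_1x)
print(crop_at_2x)
```

If the unscaled box were applied to a page rendered at 2.0, the crop would cover only a quarter of the intended area and sit closer to the top-left corner, which would look like an image taken from a different part of the page.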

Docling version

2025-10-09 11:24:55,555 - INFO - Loading plugin 'docling_defaults'
2025-10-09 11:24:55,560 - INFO - Registered ocr engines: ['easyocr', 'ocrmac', 'rapidocr', 'tesserocr', 'tesseract']
Docling version: 2.55.1
Docling Core version: 2.48.4
Docling IBM Models version: 3.9.1
Docling Parse version: 4.5.0
Python: cpython-311 (3.11.10)
Platform: macOS-15.7.1-x86_64-i386-64bit

Python version

Python 3.11.10
