Open
Labels
bug: Something isn't working
Description
Bug
Using Granite Docling, I can't extract images at a good resolution.
Steps to reproduce
I wrote a simple script that processes only page 5 of https://arxiv.org/pdf/2501.17887:
from pathlib import Path

from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import VlmPipelineOptions
from docling.datamodel.vlm_model_specs import GRANITEDOCLING_TRANSFORMERS as granite_docling_vlm_conversion_options
from docling.document_converter import DocumentConverter, PdfFormatOption
from docling.pipeline.vlm_pipeline import VlmPipeline
from docling_core.types.doc import ImageRefMode

output_dir = Path("output")
output_dir.mkdir(exist_ok=True)

docling_paper = "arvix.pdf"  # Only page 5 of https://arxiv.org/pdf/2501.17887

vlm_options = granite_docling_vlm_conversion_options
IMAGE_RESOLUTION_SCALE = 1.0

vlm_pipeline_options = VlmPipelineOptions(
    force_backend_text=False,
    vlm_options=vlm_options,
    images_scale=IMAGE_RESOLUTION_SCALE,
    generate_picture_images=True,
)

converter_vlm = DocumentConverter(
    format_options={
        InputFormat.PDF: PdfFormatOption(
            pipeline_options=vlm_pipeline_options,
            pipeline_cls=VlmPipeline,
        )
    }
)

vlm_result = converter_vlm.convert(docling_paper)
vlm_doc = vlm_result.document

# Save as HTML with the extracted pictures written as referenced files
html_filename = output_dir / f"{docling_paper}.html"
vlm_doc.save_as_html(html_filename, image_mode=ImageRefMode.REFERENCED)
With this, the images are low resolution (on top of being badly framed):

When pushing IMAGE_RESOLUTION_SCALE to 2.0, the images appear to be extracted from somewhere else in the document:
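One possible explanation, offered only as a hypothesis and not a confirmed diagnosis of Docling's internals: the page image is rendered at the requested scale, but the picture bounding boxes are applied without the same scaling. The sketch below is plain Python and does not call Docling. The `Box` class, the page size, and the figure coordinates are all made up for illustration. It assumes that a scale of 1.0 corresponds to 72 DPI, so one PDF point maps to one pixel.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box with a top-left origin (hypothetical helper, not a Docling type)."""

    left: float
    top: float
    right: float
    bottom: float

    def scaled(self, factor: float) -> Box:
        """Return the same box expressed in a coordinate system scaled by `factor`."""
        return Box(
            self.left * factor,
            self.top * factor,
            self.right * factor,
            self.bottom * factor,
        )

    @property
    def size(self) -> tuple[float, float]:
        """Width and height of the box."""
        return (self.right - self.left, self.bottom - self.top)


# Made-up figure location in PDF points on a 612 x 792 pt page.
figure_pt = Box(left=100, top=400, right=300, bottom=550)

# At scale 1.0 (assumed 72 DPI) the crop has one pixel per point,
# so a 200 x 150 pt figure yields only a 200 x 150 px image.
crop_1x = figure_pt.scaled(1.0)

# At scale 2.0 the page image is 1224 x 1584 px, and the crop box
# must be scaled by the same factor to land on the figure.
correct_2x = figure_pt.scaled(2.0)

# If the box were applied unscaled to the 2x page image, it would cover
# the page region below (in points), which is not where the figure is.
region_hit_by_unscaled_box_pt = figure_pt.scaled(1 / 2.0)

print("1x crop size (px):", crop_1x.size)
print("2x crop box (px):", correct_2x)
print("Region hit by an unscaled box (pt):", region_hit_by_unscaled_box_pt)
```

If this hypothesis holds, the crops at scale 2.0 would come from a region closer to the top-left of the page than the actual figure, with the same pixel size as at scale 1.0. Comparing the pixel dimensions of the extracted images at both scales would be a quick way to check.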

Docling version
2025-10-09 11:24:55,555 - INFO - Loading plugin 'docling_defaults'
2025-10-09 11:24:55,560 - INFO - Registered ocr engines: ['easyocr', 'ocrmac', 'rapidocr', 'tesserocr', 'tesseract']
Docling version: 2.55.1
Docling Core version: 2.48.4
Docling IBM Models version: 3.9.1
Docling Parse version: 4.5.0
Python: cpython-311 (3.11.10)
Platform: macOS-15.7.1-x86_64-i386-64bit
Python version
Python 3.11.10