Allow parameters in HTML backend or any DeclarativeDocumentBackend implementation #1963

### Requested feature

Some users have asked for images referenced in HTML to be handled and added to the converted `DoclingDocument`. When the backend parses the document, we would like the flexibility to ignore the images, keep only their references, or embed them. This requires a way to pass parsing options to the backend, which is currently not possible because the `__init__` method of every backend accepts only the arguments `in_doc` and `path_or_stream`.
The solution could be either specific to this backend (e.g., through `HTMLFormatOption`) or generic to all `DeclarativeDocumentBackend` implementations.
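
To make the generic variant concrete, here is a minimal, self-contained sketch of what an optional `options` argument on the backend constructor could look like. It does not import docling: `SketchDeclarativeBackend` and `SketchHTMLBackend` are stand-ins for the real classes, and `BackendOptions`, `HTMLBackendOptions`, `ImageMode`, and `handle_image` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ImageMode(str, Enum):
    """Hypothetical enum for the three image-handling choices in this issue."""

    PLACEHOLDER = "placeholder"
    REFERENCED = "referenced"
    EMBEDDED = "embedded"


@dataclass(frozen=True)
class BackendOptions:
    """Hypothetical base class for per-backend parsing options."""


@dataclass(frozen=True)
class HTMLBackendOptions(BackendOptions):
    """Hypothetical options for the HTML backend."""

    image_mode: ImageMode = ImageMode.PLACEHOLDER


class SketchDeclarativeBackend:
    """Stand-in for DeclarativeDocumentBackend, not the real docling class."""

    def __init__(self, in_doc, path_or_stream, options: Optional[BackendOptions] = None):
        # The first two arguments mirror the current signature;
        # `options` is the assumed addition and stays optional.
        self.in_doc = in_doc
        self.path_or_stream = path_or_stream
        self.options = options if options is not None else BackendOptions()


class SketchHTMLBackend(SketchDeclarativeBackend):
    """Stand-in for the HTML backend, not the real docling class."""

    def __init__(self, in_doc, path_or_stream, options: Optional[HTMLBackendOptions] = None):
        if options is None:
            options = HTMLBackendOptions()
        super().__init__(in_doc, path_or_stream, options)

    def handle_image(self, src: str) -> dict:
        """Return a simplified picture record according to the configured mode."""
        mode = self.options.image_mode
        if mode is ImageMode.PLACEHOLDER:
            return {"kind": "picture"}
        if mode is ImageMode.REFERENCED:
            return {"kind": "picture", "uri": src}
        # EMBEDDED: a real backend would load and inline the image data here.
        return {"kind": "picture", "uri": src, "embed": True}
```

Because `options` defaults to `None`, existing callers that pass only `in_doc` and `path_or_stream` would keep working in this sketch. Whether the options should be carried by a format option such as `HTMLFormatOption` or passed directly to the backend is left open here.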

The discussion in #1411 prompted this proposal.

### Alternatives

An alternative would be to create a separate backend implementation for each image-handling option (_placeholder_, _referenced_, and _embedded_). The commit https://github.com/docling-project/docling/pull/1411/commits/5d08b749af67b07595f05819951e704d9e5b8ed4 points in this direction.

However, this should not be the preferred option, since it is neither efficient nor flexible.
