Problem description
Docling requires AI models for converting documents. Which models are needed is only known at runtime, and it may differ for each API request.
In Docling Serve we took the approach of storing the common model weights in the container image, but this creates issues for users who request features provided by the extra models.
Proposed solution
To allow users to download all the required models, and to keep them "cached" in the cluster, we suggest adding the optional possibility to:
- add a persistent volume to the deployment
- add an initContainers entry which downloads the requested models into the volume
For completeness, we could give users three choices:
- Option 1: No volume. Just use what is there by default
- Option 2: Request the operator to provision a PVC and specify which models are requested
- Option 3: Provide pointers to an existing PVC and specify which models are requested
Proposed specs
```yaml
kind: DoclingServe
spec:
  artifactsVolume:
    enable: false | true  # default false
    models: ""  # a space-separated list like "layout tableformer code_formula picture_classifier smolvlm granite_vision easyocr"
    volumeClaimTemplates: # optional
      - {} # the specs for a volumeClaim (see below examples)
```
The default PVC settings should request 16GB of storage and accessMode: ReadWriteMany.
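As a sketch, the PVC the operator would provision by default could look like the following (the claim name and labels are illustrative, not part of this proposal; 16Gi is used as the Kubernetes quantity for the 16GB default):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: docling-models  # illustrative name
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 16Gi
```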
Inspired by other operators (e.g. Elasticsearch), users could provide the volumeClaim details for advanced changes, for example a different size or storage class, or pointing to an existing volume.
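The two advanced cases could be expressed through volumeClaimTemplates roughly as follows; these are sketches using the standard PersistentVolumeClaim spec fields, and the storage class, sizes, and volume names are placeholders:

A different size or storage class:

```yaml
spec:
  artifactsVolume:
    enable: true
    models: "layout tableformer easyocr"
    volumeClaimTemplates:
      - spec:
          storageClassName: fast-ssd  # placeholder storage class
          accessModes:
            - ReadWriteMany
          resources:
            requests:
              storage: 50Gi
```

Point to an existing volume:

```yaml
spec:
  artifactsVolume:
    enable: true
    models: "layout tableformer easyocr"
    volumeClaimTemplates:
      - spec:
          volumeName: my-existing-pv  # placeholder: binds the claim to a pre-provisioned PersistentVolume
          accessModes:
            - ReadWriteMany
          resources:
            requests:
              storage: 16Gi
```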
Changes in the Deployment
- Mount the volume, e.g. in /opt/docling/models (just a proposal)
- Set the ENV DOCLING_SERVER_ARTIFACTS_PATH=/opt/docling/models (the actual mount point)
- Add an initContainers entry using the same image but a different command:
```yaml
initContainers:
  - name: docling-models-download
    image: <spec.apiServer.image>
    command: ['sh', '-c', 'docling-tools models download -o <mountPoint> <spec.artifactsVolume.models>']
```
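Putting the three Deployment changes together, the resulting pod spec could look like this sketch (the volume and container names are illustrative, and the mount point is the /opt/docling/models proposal from above):

```yaml
spec:
  template:
    spec:
      volumes:
        - name: docling-models
          persistentVolumeClaim:
            claimName: docling-models  # illustrative: the provisioned or user-provided PVC
      initContainers:
        - name: docling-models-download
          image: <spec.apiServer.image>
          command: ['sh', '-c', 'docling-tools models download -o /opt/docling/models <spec.artifactsVolume.models>']
          volumeMounts:
            - name: docling-models
              mountPath: /opt/docling/models
      containers:
        - name: api-server  # illustrative container name
          env:
            - name: DOCLING_SERVER_ARTIFACTS_PATH
              value: /opt/docling/models
          volumeMounts:
            - name: docling-models
              mountPath: /opt/docling/models
```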