Allow a persistence volume for model weights #47

@dolfim-ibm

Problem description

Docling requires AI models to convert documents. Which models are needed is only known at runtime and may differ for each API request.
In Docling Serve we took the approach of storing the common model weights in the container image, but this creates issues for users who request features provided by the extra models.

Proposed solution

To allow users to download all the required models and keep them "cached" in the cluster, we suggest adding an option to:

  1. add a persistent volume to the deployment
  2. add an initContainer which downloads the requested models into the volume

For completeness, we could give the users 3 choices:

  • Option 1: No volume. Just use what is there by default
  • Option 2: Request the operator to provision a PVC and specify which models are requested
  • Option 3: Provide pointers to an existing PVC and specify which models are requested

Proposed specs

kind: DoclingServe
spec:
  artifactsVolume:
    enable: false | true  # default false
    models: ""  # a space-separated list like "layout tableformer code_formula picture_classifier smolvlm granite_vision easyocr"
    volumeClaimTemplates:  # optional
      - {}  # the specs for a volumeClaim (see below examples)
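
As an illustration of Option 2, a filled-in resource could look like the sketch below. It uses only the fields proposed above; the chosen model names are picked from the example list, and the behaviour when `volumeClaimTemplates` is omitted (falling back to operator defaults) is an assumption of this proposal, not something already implemented.

```yaml
# Hypothetical Option 2 resource: the operator provisions the PVC itself.
kind: DoclingServe
spec:
  artifactsVolume:
    enable: true
    # Space-separated model names, taken from the example list in this proposal
    models: "layout tableformer easyocr"
    # volumeClaimTemplates is omitted, so the proposed default PVC settings would apply
```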

The default PVC settings should request 16GB of storage and accessMode: ReadWriteMany.
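
Spelled out as a claim template, these defaults might be rendered roughly as follows. This is a sketch only: the name `docling-models` is borrowed from the examples in this proposal, and the size is written as `16Gi` on the assumption that the 16GB figure is meant in the usual Kubernetes binary units.

```yaml
# Sketch of the default claim the operator could generate when none is given
volumeClaimTemplates:
  - metadata:
      name: docling-models  # assumed default name
    spec:
      accessModes:
        - ReadWriteMany  # lets several pods mount the same model cache
      resources:
        requests:
          storage: 16Gi  # assumption: "16GB" expressed in Gi
```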

Inspired by other operators (e.g. Elasticsearch), users could provide the volumeClaim details for advanced changes, for example:

  • A different size or storage class:

    volumeClaimTemplates:
      - metadata:
          name: docling-models
        spec:
          accessModes:
            - ReadWriteMany
          resources:
            requests:
              storage: 20Gi
          storageClassName: standard
  • Point to an existing volume:

    volumeClaimTemplates:
      - metadata:
          name: docling-models
        spec:
          storageClassName: ""
          volumeName: foo-pv

Changes in the Deployment

  1. Mount the volume, e.g. in /opt/docling/models (just a proposal)

  2. Set the ENV DOCLING_SERVER_ARTIFACTS_PATH=/opt/docling/models (the actual mount point)

  3. Add an initContainer using the same image but a different command:

    initContainers:
      - name: docling-models-download
        image: <spec.apiServer.image>
        command: ['sh', '-c', "docling-tools models download -o <mountPoint> <spec.artifactsVolume.models>"]
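
Taken together, the three changes could produce a pod template along the lines of the sketch below. It is not the operator's actual output: the volume name, the claim name `docling-models`, and the container name `docling-serve` are assumptions for illustration, and the angle-bracket placeholders stand for values the operator would substitute. Note that the initContainer has to mount the same volume as the main container, otherwise the downloaded weights would not be visible to the server.

```yaml
# Sketch of the Deployment pod template after the proposed changes
spec:
  template:
    spec:
      volumes:
        - name: docling-models  # assumed volume name
          persistentVolumeClaim:
            claimName: docling-models  # assumed to match the provisioned or existing PVC
      initContainers:
        - name: docling-models-download
          image: <spec.apiServer.image>
          command: ['sh', '-c', "docling-tools models download -o /opt/docling/models <spec.artifactsVolume.models>"]
          volumeMounts:
            - name: docling-models
              mountPath: /opt/docling/models  # proposed mount point
      containers:
        - name: docling-serve  # assumed container name
          image: <spec.apiServer.image>
          env:
            - name: DOCLING_SERVER_ARTIFACTS_PATH
              value: /opt/docling/models  # must equal the mount point
          volumeMounts:
            - name: docling-models
              mountPath: /opt/docling/models
```

Because the initContainer runs to completion before the main container starts, the server only comes up once the requested models are present in the volume.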
