Allow a persistent volume for model weights

## Problem description

Docling requires AI models to convert documents. Which models are needed is known only at runtime, and it may differ for each API request.
In Docling Serve we took the approach of storing the common model weights in the container image, but this creates issues for users who request features that depend on the extra models.


## Proposed solution

To allow users to download all the required models and keep them "cached" in the cluster, we suggest adding the option to:

1. add a persistent volume to the deployment
2. add an `initContainers` entry that downloads the requested models into the volume

For completeness, we could give users three choices:
- Option 1: No volume. Just use the models that are in the image by default
- Option 2: Ask the operator to provision a PVC and specify which models are requested
- Option 3: Point to an existing PVC and specify which models are requested


### Proposed specs

```yaml
kind: DoclingServe
spec:
  artifactsVolume:
    enable: false | true  # default false
    models: ""  # a space-separated list, e.g. "layout tableformer code_formula picture_classifier smolvlm granite_vision easyocr"
    volumeClaimTemplates:  # optional
      - {}  # the specs for a volumeClaim (see below examples)
```

The default PVC settings should request 16GB of storage and `accessMode: ReadWriteMany`.
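As a rough illustration of those defaults, the claim created by the operator might look like the sketch below. The claim name is hypothetical (borrowed from the examples in this proposal), and the `16Gi` quantity is an assumption, since the text above only says "16GB".

```yaml
# Sketch of a default PVC the operator could create (not a final design).
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: docling-models  # hypothetical name
spec:
  accessModes:
    - ReadWriteMany     # lets several pods mount the same model cache
  resources:
    requests:
      storage: 16Gi     # assumption: "16GB" expressed as a Kubernetes quantity
```

`ReadWriteMany` requires a storage class that supports shared access, so clusters without one would need to override the default through `volumeClaimTemplates`.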

Inspired by other operators (e.g. [elasticsearch](https://www.elastic.co/docs/deploy-manage/deploy/cloud-on-k8s/volume-claim-templates)), users could provide the `volumeClaim` details for advanced changes, for example:

- A different size or storage class:

    ```yaml
    volumeClaimTemplates:
      - metadata:
          name: docling-models
        spec:
          accessModes:
            - ReadWriteMany
          resources:
            requests:
              storage: 20Gi
          storageClassName: standard
    ```

- Point to an existing volume:

    ```yaml
    volumeClaimTemplates:
      - metadata:
          name: docling-models
        spec:
          storageClassName: ""
          volumeName: foo-pv
    ```
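Putting the pieces together, a complete resource for Option 2 (operator-provisioned PVC with a custom size) might look like the following sketch. The selection of models is only an example taken from the list above, and `apiVersion` and `metadata` are omitted as in the spec outline because this proposal does not define them.

```yaml
# Sketch: Option 2, the operator provisions a PVC and downloads the listed models.
kind: DoclingServe
spec:
  artifactsVolume:
    enable: true
    models: "layout tableformer easyocr"  # example subset of the supported names
    volumeClaimTemplates:
      - metadata:
          name: docling-models
        spec:
          accessModes:
            - ReadWriteMany
          resources:
            requests:
              storage: 20Gi
          storageClassName: standard
```

Option 1 corresponds to leaving `enable: false` (the default), and Option 3 to replacing the template's `spec` with the `volumeName` form that points to an existing volume.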

### Changes in the Deployment

1. Mount the volume, e.g. at `/opt/docling/models` (just a proposal)
2. Set the environment variable `DOCLING_SERVER_ARTIFACTS_PATH=/opt/docling/models` (the actual mount point)
3. Add an `initContainers` entry that uses the same image but a different `command`:

    ```yaml
    initContainers:
      - name: docling-models-download
        image: <spec.apiServer.image>
        command: ['sh', '-c', "docling-tools models download -o <mountPoint> <spec.artifactsVolume.models>"]
    ```
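The three changes above could combine in the generated Deployment roughly as sketched here. The volume name, claim name, and main container name are hypothetical, and the mount path is the proposed `/opt/docling/models`. Note that the init container needs the same volume mount as the main container, otherwise the downloaded weights would not land in the shared volume.

```yaml
# Sketch of the relevant part of the Deployment's pod template.
spec:
  template:
    spec:
      volumes:
        - name: docling-models            # hypothetical volume name
          persistentVolumeClaim:
            claimName: docling-models     # hypothetical; the PVC created or referenced above
      initContainers:
        - name: docling-models-download
          image: <spec.apiServer.image>
          command: ['sh', '-c', "docling-tools models download -o /opt/docling/models <spec.artifactsVolume.models>"]
          volumeMounts:
            - name: docling-models
              mountPath: /opt/docling/models
      containers:
        - name: docling-serve             # hypothetical container name
          image: <spec.apiServer.image>
          env:
            - name: DOCLING_SERVER_ARTIFACTS_PATH  # variable name as given in this proposal
              value: /opt/docling/models
          volumeMounts:
            - name: docling-models
              mountPath: /opt/docling/models
```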
