fix(helm): add shared volume support for docreader artifacts #1778

Open
lordk911 wants to merge 3 commits into Tencent:main from lordk911:fix/helm-docreader-shared-volume
@lordk911

Summary

This PR adds configurable shared volume support to the Helm chart, aligning it with the docker-compose.yml configuration.

Problem

The docker-compose.yml shares a docreader-tmp volume between:

  • docreader service (read-write at /tmp/docreader) — writes extracted images during document parsing
  • app service (read-only at /tmp/docreader) — reads images for display and VLM analysis
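
The actual docker-compose.yml is not quoted in this PR, so the following is only a minimal sketch of the arrangement described above. The service and volume names come from the description; everything else (for example, omitting images and other settings) is an assumption for illustration.

```yaml
# Illustrative sketch only; not copied from the repository's docker-compose.yml.
services:
  docreader:
    volumes:
      - docreader-tmp:/tmp/docreader       # read-write: docreader writes extracted images
  app:
    volumes:
      - docreader-tmp:/tmp/docreader:ro    # read-only: app reads images for display/VLM analysis

volumes:
  docreader-tmp: {}                        # named volume shared by both services
```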

This shared volume configuration was missing from the Helm chart:

  • docreader.yaml had no volumes or volumeMounts at all
  • app.yaml only had data-files volume, no docreader-tmp mount

This caused document parsing with image extraction to fail in Kubernetes deployments using the Helm chart.

Solution

Added configurable docreader.sharedVolume support:

1. values.yaml

docreader:
  sharedVolume:
    enabled: true
    type: emptyDir  # or "pvc" for production multi-node
    sizeLimit: 5Gi  # for emptyDir
    pvcName: ""     # for PVC (auto-generated if empty)
    storageClass: ""
    size: 5Gi       # for PVC

2. docreader.yaml

Added volumeMount (/tmp/docreader, read-write) and volume definition.

3. app.yaml

Added volumeMount (/tmp/docreader, read-only) and volume definition.
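
The template changes themselves are not shown in this description. As a rough sketch of what such a pod-spec fragment might look like, assuming the values keys listed in section 1 and the `<release-name>-docreader-tmp` fallback name mentioned in the values.yaml comments (the container name and indentation are hypothetical):

```yaml
# Hypothetical excerpt of a Deployment pod spec; not the PR's actual template.
      containers:
        - name: docreader                  # assumed container name
          {{- if .Values.docreader.sharedVolume.enabled }}
          volumeMounts:
            - name: docreader-tmp
              mountPath: /tmp/docreader    # in app.yaml, add "readOnly: true" here
          {{- end }}
      {{- if .Values.docreader.sharedVolume.enabled }}
      volumes:
        - name: docreader-tmp
          {{- if eq .Values.docreader.sharedVolume.type "pvc" }}
          persistentVolumeClaim:
            # Fall back to <release-name>-docreader-tmp when pvcName is empty
            claimName: {{ .Values.docreader.sharedVolume.pvcName | default (printf "%s-docreader-tmp" .Release.Name) }}
          {{- else }}
          # emptyDir is created per pod, so it is not shared across Deployments
          emptyDir:
            sizeLimit: {{ .Values.docreader.sharedVolume.sizeLimit }}
          {{- end }}
      {{- end }}
```

The same `volumes` stanza would have to appear in both Deployments so that, with `type: pvc`, both pods reference the same claim.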

Usage Examples

Single-node / Testing (emptyDir, default)

docreader:
  sharedVolume:
    type: emptyDir
    sizeLimit: 5Gi

Production / Multi-node (PVC with ReadWriteMany)

docreader:
  sharedVolume:
    type: pvc
    pvcName: weknora-docreader-tmp  # pre-created cephfs/NFS PVC

Or let the chart create the PVC:

docreader:
  sharedVolume:
    type: pvc
    storageClass: rook-cephfs  # or nfs, etc.
    size: 10Gi

Testing

  • ✅ Templates render correctly with both emptyDir and pvc configurations
  • ✅ Backward compatible: default emptyDir works for single-node testing
  • ✅ Production-ready: supports ReadWriteMany PVCs (cephfs, NFS, etc.)

Related

  • Aligns Helm chart with docker-compose.yml behavior
  • Fixes document parsing with image extraction in Kubernetes deployments

lordk911 pushed a commit to lordk911/WeKnora that referenced this pull request Jun 24, 2026
- Section 12.8: Mark docreader shared volume gap as resolved with PR link
- Section 12.9: Update prerequisite item 5 to reflect PVC creation
- Section 12.11: Update risk section to show issue is resolved
- Add PR Tencent#1778 to references

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Comment thread helm/values.yaml Outdated
Comment on lines +228 to +231
# -- Enable shared volume for docreader artifacts
enabled: true
# -- Volume type: "emptyDir" (single-node/testing) or "pvc" (multi-node/production)
# For production with multiple replicas, use "pvc" with ReadWriteMany (e.g., cephfs, NFS)

Collaborator

The default emptyDir cannot share data between app and docreader, so shouldn't sharedVolume.enabled default to false?

Comment thread WeKnora部署讨论记录.md Outdated

Collaborator

This file should be removed.

Comment thread helm/values.yaml Outdated
# -- Size limit for emptyDir (only used when type=emptyDir)
sizeLimit: 5Gi
# -- PVC name (only used when type=pvc)
# If empty and type=pvc, a PVC will be created with name: <release-name>-docreader-tmp

Collaborator

PVC auto-creation is documented but not implemented.

shenk-b and others added 2 commits June 25, 2026 17:30
The docker-compose.yml configuration shares a 'docreader-tmp' volume
between docreader (read-write) and app (read-only) for document parsing
artifacts (extracted images, etc.). This was missing from the Helm chart.

Added configurable sharedVolume support:
- docreader.yaml: mount /tmp/docreader (read-write)
- app.yaml: mount /tmp/docreader (read-only)
- values.yaml: docreader.sharedVolume config with emptyDir (default)
  or PVC support for multi-node/production deployments

This aligns the Helm chart with docker-compose.yml behavior and enables
proper document parsing with image extraction in Kubernetes deployments.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
1. Default sharedVolume.enabled to false (backward compatible).
   emptyDir is pod-scoped and cannot share data between the app and
   docreader Deployments; enabling requires type=pvc with ReadWriteMany.

2. Implement PVC auto-creation (docreader-pvc.yaml template) that was
   documented but missing. Creates a PVC when type=pvc and pvcName is empty.

3. Add accessMode config (defaults to ReadWriteMany) for the auto-created PVC.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
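
The docreader-pvc.yaml template is not reproduced on this page. A minimal sketch of a template with the behavior described in the commit message (create a PVC only when type=pvc and pvcName is empty, with accessMode defaulting to ReadWriteMany) might look like this; the exact metadata and structure are assumptions:

```yaml
# Hypothetical sketch of helm/templates/docreader-pvc.yaml; the real template may differ.
{{- $sv := .Values.docreader.sharedVolume }}
{{- if and $sv.enabled (eq $sv.type "pvc") (not $sv.pvcName) }}
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: {{ .Release.Name }}-docreader-tmp   # matches the documented auto-generated name
spec:
  accessModes:
    - {{ $sv.accessMode | default "ReadWriteMany" }}
  {{- if $sv.storageClass }}
  storageClassName: {{ $sv.storageClass }}   # omitted to use the cluster default class
  {{- end }}
  resources:
    requests:
      storage: {{ $sv.size }}
{{- end }}
```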
@lordk911 lordk911 force-pushed the fix/helm-docreader-shared-volume branch from 075d333 to aa5e26e Compare June 25, 2026 09:31
@lordk911

Author

Thanks for the review @lyingbug! All three points addressed in the latest push:

1. Default sharedVolume.enabled changed to false (helm/values.yaml)
You're right: emptyDir is pod-scoped and cannot share data between the app and docreader Deployments, which run as separate pods. I changed the default to false for backward compatibility and updated the comments to make this limitation explicit. Users who need cross-pod sharing must set enabled: true and type: pvc with a ReadWriteMany storage class.

2. Removed WeKnora部署讨论记录.md from this PR
Rebuilt the branch to contain only the Helm chart changes. That file was a personal deployment note and shouldn't be here.

3. Implemented PVC auto-creation (helm/templates/docreader-pvc.yaml)
Added the missing template that creates a PVC when type: pvc and pvcName is empty. Also added an accessMode config (defaults to ReadWriteMany) so users can control the access mode of the auto-created PVC.

The branch is now focused solely on the Helm chart shared-volume fix (2 commits). Please take another look when you have time.
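
For reference, a values override that opts in to the shared volume after these changes could look roughly like the following. The keys are the ones named in this thread; the storage class and size are example values taken from the PR description, not recommendations.

```yaml
# Example override file (e.g. passed with -f); values are illustrative.
docreader:
  sharedVolume:
    enabled: true              # default is now false, so opt in explicitly
    type: pvc                  # emptyDir cannot be shared across the two Deployments
    pvcName: ""                # empty: the chart creates <release-name>-docreader-tmp
    storageClass: rook-cephfs  # any class that supports ReadWriteMany (cephfs, NFS, etc.)
    accessMode: ReadWriteMany
    size: 10Gi
```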

Add a Docreader section to the README with the sharedVolume parameters,
explain the emptyDir limitation (pod-scoped, cannot share across the app
and docreader Deployments), and include a production example with cephfs.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2 participants