Replies: 2 comments 4 replies
Yes, RAGFlow's high resource consumption in Kubernetes is expected, especially if you haven't set explicit resource requests or limits in your deployment. By default, the Helm chart leaves the `resources` section empty, so RAGFlow pods can consume all available node resources, which often leads to node exhaustion in shared clusters. This stems from RAGFlow's architecture: it loads heavy dependencies (such as PyTorch, NumPy, and LLM provider SDKs) and performs resource-intensive document parsing, especially with advanced parsers like "deepdoc" or "mineru" on large PDFs. High idle memory usage is also common because models and libraries are preloaded and kept in memory even when no tasks are being processed [source].

To prevent node exhaustion, set resource requests and limits for the RAGFlow pod in your Helm `values.yaml` under the top-level `ragflow` key:

```yaml
ragflow:
  deployment:
    resources:
      requests:
        cpu: "4"
        memory: "16Gi"
      limits:
        cpu: "8"
        memory: "32Gi"
```

Other ways to reduce resource usage: use the "slim" Docker image if you don't need local embedding models, and tune the relevant environment variables. If memory usage keeps growing, especially after document processing, you may be hitting a known memory leak; try the latest nightly build, and as a workaround set up periodic restarts with health checks to avoid OOM crashes [source]. For production, the maintainers recommend at least 64 GB of RAM and more than 16 GB of GPU memory for stable operation under load [source].

Let me know if you want more details on tuning or specific config examples!
I think these resource limits are too tight for RAGFlow; the CPU and memory will be insufficient.
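If the suggested values do prove too tight, one option is to size the limits toward the maintainers' production guidance of at least 64 GB of RAM. The numbers below are an illustrative sketch, not an official recommendation:

```yaml
# Illustrative, more generous sizing (values are assumptions,
# roughly aligned with the "at least 64GB RAM" production guidance).
resources:
  requests:
    cpu: "8"
    memory: "32Gi"
  limits:
    cpu: "16"
    memory: "64Gi"
```

Keeping the request below the limit still lets the scheduler pack pods, while the limit caps worst-case consumption on a shared node.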
Hello,
I'm creating this Q&A discussion just to be extra sure that this is expected. We have an application pool in our Kubernetes cluster where RAGFlow is also deployed. After it was deployed, though, we've been getting node-exhaustion errors because RAGFlow seems to be consuming almost all the resources we have available.
Is this expected? Does it look right?
Thanks.