**Merged**

### Conversation

**added 3 commits** (December 17, 2025 14:22):
Use 'or ""' instead of default param to handle explicit None values
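The commit above works around a common Python pitfall: a default value applies only when the key or argument is omitted, not when `None` is passed explicitly (e.g. a `null` in a JSON parser config). A minimal sketch of the pattern (the function name is illustrative, not the actual RAGFlow code):

```python
def get_children_delimiter(parser_config: dict) -> str:
    # dict.get's default only fires when the key is absent. If the key
    # is present but explicitly set to null/None in the config, .get()
    # returns None and a later string operation would raise.
    # 'or ""' coerces that None to an empty string.
    return parser_config.get("children_delimiter", "") or ""
```

For example, `get_children_delimiter({"children_delimiter": None})` returns `""` rather than `None`.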
…simultaneously

- Add staggered startup delay (2.0s × worker_num) to spread connection attempts
- Reduce ConnectionPool `max_size` from 32 to 4 to stay within the Infinity connection limit
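The staggered delay described in the commit above can be sketched as follows (the 2.0 s constant comes from the commit message; the function name is hypothetical):

```python
import time

STAGGER_SECONDS = 2.0  # from the commit above: 2.0s per worker

def startup_delay(worker_num: int) -> float:
    # Worker 0 connects immediately; worker N waits N * 2.0s, so the
    # pre-allocated ConnectionPool connections arrive spread out in
    # time rather than all hitting Infinity at once.
    return STAGGER_SECONDS * worker_num

# In each worker's entry point, before creating the Infinity pool:
# time.sleep(startup_delay(worker_num))
```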
When MinerU returns blocks with `type=DISCARDED`, the match/case falls through with `pass` but still tries to use the `section` variable, which may be unbound (if DISCARDED is the first block) or stale from a previous iteration. Change `pass` to `continue` to skip the entire block processing for discarded content, preventing both `UnboundLocalError` and duplicate section entries.
Force-pushed from `8017f1e` to `fcfc53c`.
**yongtenglei** approved these changes on Dec 18, 2025.
**clifftseng** pushed a commit to clifftseng/ragflow that referenced this pull request on Feb 9, 2026:
### What problem does this PR solve?

**Fixes infiniflow#8706** - `InfinityException: TOO_MANY_CONNECTIONS` when running multiple task executor workers

### Problem Description

When running RAGFlow with 8-16 task executor workers, most workers fail to start properly. Checking logs revealed that workers were stuck/hanging during Infinity connection initialization - only 1-2 workers would successfully register in Redis while the rest remained blocked.

### Root Cause

The Infinity SDK `ConnectionPool` pre-allocates all connections in `__init__`. With the default `max_size=32` and multiple workers (e.g., 16), this creates 16×32=512 connections immediately on startup, exceeding Infinity's default 128 connection limit. Workers hang while waiting for connections that can never be established.

### Changes

1. **Prevent Infinity connection storm** (`rag/utils/infinity_conn.py`, `rag/svr/task_executor.py`)
   - Reduced ConnectionPool `max_size` from 32 to 4 (sufficient since operations are synchronous)
   - Added staggered startup delay (2s per worker) to spread connection initialization
2. **Handle None children_delimiter** (`rag/app/naive.py`)
   - Use `or ""` to handle explicitly set None values from parser config
3. **MinerU parser robustness** (`deepdoc/parser/mineru_parser.py`)
   - Use `.get()` for optional output fields that may be missing
   - Fix DISCARDED block handling: change `pass` to `continue` to skip discarded blocks entirely

### Why `max_size=4` is sufficient

| Workers | Pool Size | Total Connections | Infinity Limit |
|---------|-----------|-------------------|----------------|
| 16 | 32 | 512 | 128 ❌ |
| 16 | 4 | 64 | 128 ✅ |
| 32 | 4 | 128 | 128 ✅ |

- All RAGFlow operations are synchronous: `get_conn()` → operation → `release_conn()`
- No parallel `docStoreConn` operations in the codebase
- Maximum 1-2 concurrent connections needed per worker; 4 provides a safety margin

### MinerU DISCARDED block bug

When MinerU returns blocks with `type: "discarded"` (headers, footers, watermarks, page numbers, artifacts), the previous code used `pass`, which left the `section` variable undefined, causing:

- **UnboundLocalError** if DISCARDED is the first block
- **Duplicate content** if DISCARDED follows another block (stale value from the previous iteration)

**Root cause confirmed via MinerU source code:**

From [`mineru/utils/enum_class.py`](https://github.com/opendatalab/MinerU/blob/main/mineru/utils/enum_class.py#L14):

```python
class BlockType:
    DISCARDED = 'discarded'
    # VLM 2.5+ also has: HEADER, FOOTER, PAGE_NUMBER, ASIDE_TEXT, PAGE_FOOTNOTE
```

Per [MinerU documentation](https://opendatalab.github.io/MinerU/reference/output_files/), discarded blocks contain content that should be filtered out for clean text extraction.

**Fix:** Changed `pass` to `continue` to skip discarded blocks entirely.

### Testing

- Verified all 16 workers now register successfully in Redis
- All workers heartbeating correctly
- Document parsing works as expected
- MinerU parsing with DISCARDED blocks no longer crashes

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---

Co-authored-by: user210 <user210@rt>
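The connection arithmetic behind the pool-sizing table is simple enough to check mechanically. A small sketch (the 128 limit is Infinity's default per the PR description):

```python
INFINITY_CONN_LIMIT = 128  # Infinity's default server-side connection limit

def total_connections(workers: int, pool_max_size: int) -> int:
    # ConnectionPool pre-allocates pool_max_size connections in
    # __init__, so the fleet-wide total is simply workers * pool size.
    return workers * pool_max_size

assert total_connections(16, 32) > INFINITY_CONN_LIMIT    # 512: workers hang
assert total_connections(16, 4) <= INFINITY_CONN_LIMIT    # 64: fits comfortably
assert total_connections(32, 4) <= INFINITY_CONN_LIMIT    # 128: exactly at the limit
```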