fix: add tenant ownership check to document download endpoint (CWE-862) #13834
Open
sebastiondev wants to merge 1 commit into infiniflow:main from
The /documents/<document_id> endpoint (download_doc) validated that the API token existed but never verified that the token holder owned the requested document. Any user with a valid API token could download any document in the system by providing its document_id. Add an authorization check that verifies the document belongs to a dataset owned by the authenticated tenant, consistent with the existing /datasets/<dataset_id>/documents/<document_id> download endpoint.
Codecov Report
✅ All modified and coverable lines are covered by tests.
Coverage Diff (main vs. #13834)
Metric     main     #13834   +/-
Coverage   96.72%   96.72%
Files      10       10
Lines      702      703      +1
Branches   112      112
Hits       679      680      +1
Misses     5        5
Partials   18       18
Vulnerability Summary
CWE-862: Missing Authorization — IDOR on document download endpoint
Severity: High
OWASP Classification: API1:2023 Broken Object Level Authorization (BOLA)
Affected Endpoint
GET /api/v1/documents/<document_id>: the download_doc() function in api/apps/sdk/doc.py
Data Flow
1. The caller supplies a beta API token via the Authorization header.
2. The token is looked up (APIToken.query(beta=token)): authentication ✓
3. The document is looked up by document_id (DocumentService.query(id=document_id)): existence check only
4. The file is read with STORAGE_IMPL.get() and returned.
Any user holding a valid beta token can download any document belonging to any tenant on the same RAGFlow instance by specifying an arbitrary document_id.
How beta Tokens Are Exposed
beta tokens appear in iframe src URLs as ?auth=<token> in shared/embedded chatbot contexts (see useGetDocumentUrl in web/src/hooks/use-document-request.ts). End users of shared chatbots can read the token value through browser DevTools.
How Document IDs Are Discoverable
document_id fields in chunk reference metadata
Fix Description
File changed: api/apps/sdk/doc.py (3 lines added)
Change: After authenticating the caller and retrieving the document, the fix verifies that the document's parent knowledgebase (doc[0].kb_id) belongs to the caller's tenant (tenant_id) by querying KnowledgebaseService.query(id=doc[0].kb_id, tenant_id=tenant_id). If the check fails, an error is returned before any file content is served.
Rationale
This fix aligns download_doc() with the authorization pattern already used by the sister endpoint download() at /api/v1/datasets/<dataset_id>/documents/<document_id>, which is properly protected with @token_required and KnowledgebaseService.query(id=dataset_id, tenant_id=tenant_id). KnowledgebaseService is already imported in this file and used throughout for authorization checks.
Diff
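As an illustration of the change described under Fix Description, here is a minimal, self-contained sketch of the control flow with the ownership check in place. It is not the project's code: the dataclasses, the in-memory tables, the helper names query_documents and query_knowledgebases, and the two placeholder error messages are assumptions standing in for DocumentService.query(), KnowledgebaseService.query(), and STORAGE_IMPL.get(). Only the rejection message "You do not have access to this document." is taken from the PR text.

```python
from dataclasses import dataclass


@dataclass
class Document:
    id: str
    kb_id: str  # id of the parent knowledgebase (dataset)


@dataclass
class Knowledgebase:
    id: str
    tenant_id: str  # tenant that owns this knowledgebase


# Hypothetical in-memory data standing in for the database and file storage.
TOKENS = {"beta-a": "tenant-a", "beta-b": "tenant-b"}  # beta token -> tenant_id
DOCUMENTS = [Document("doc-a", "kb-a"), Document("doc-b", "kb-b")]
KNOWLEDGEBASES = [
    Knowledgebase("kb-a", "tenant-a"),
    Knowledgebase("kb-b", "tenant-b"),
]
STORAGE = {"doc-a": b"tenant A file", "doc-b": b"tenant B file"}


def query_documents(id):
    """Stand-in for DocumentService.query(id=...): existence lookup only."""
    return [d for d in DOCUMENTS if d.id == id]


def query_knowledgebases(id, tenant_id):
    """Stand-in for KnowledgebaseService.query(id=..., tenant_id=...)."""
    return [k for k in KNOWLEDGEBASES if k.id == id and k.tenant_id == tenant_id]


def download_doc(token, document_id):
    # 1. Authentication: resolve the beta token to a tenant.
    tenant_id = TOKENS.get(token)
    if tenant_id is None:
        return {"ok": False, "message": "invalid token"}  # placeholder message

    # 2. Existence check: this was the only check before the fix.
    doc = query_documents(id=document_id)
    if not doc:
        return {"ok": False, "message": "document not found"}  # placeholder message

    # 3. Authorization: the parent knowledgebase must belong to the caller's tenant.
    if not query_knowledgebases(id=doc[0].kb_id, tenant_id=tenant_id):
        return {"ok": False, "message": "You do not have access to this document."}

    # 4. Only now is file content served.
    return {"ok": True, "content": STORAGE[doc[0].id]}
```

In this sketch, download_doc("beta-a", "doc-a") returns the file, while download_doc("beta-a", "doc-b") is rejected at step 3 because kb-b is owned by tenant-b. Removing step 3 reproduces the reported behavior: any valid token can fetch any existing document.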
Test Results Summary
- KnowledgebaseService.query() returns a truthy result for valid tenant-document relationships.
- Requests using a beta token for Tenant A that attempt to download a document belonging to Tenant B are now correctly rejected with "You do not have access to this document."
- The download() endpoint at /datasets/<dataset_id>/documents/<document_id> already has this pattern and is unaffected.
Disprove Analysis Results
We systematically attempted to disprove this finding through 9 checks:
✅ AUTH CHECK
The endpoint does have authentication (it validates the beta token). However, authentication ≠ authorization. The endpoint authenticates the caller but does not verify that the caller owns the requested document. This is a genuine missing-authorization issue.
✅ NETWORK CHECK
No localhost-only restrictions. The application is deployed via Docker with ports exposed (SVR_HTTP_PORT:9380). The nginx config serves ^/(v1|api) routes publicly. This endpoint is internet-facing.
✅ DEPLOYMENT CHECK
Docker Compose deployment with nginx reverse proxy. The /api/v1/documents/<document_id> route is accessible through nginx's location ~ ^/(v1|api) rule. No VPN or service mesh constraints found.
✅ CALLER TRACE
- GET /api/v1/documents/<document_id>: does manual beta token auth (no @token_required decorator)
- useGetDocumentUrl generates this URL when the auth query param is present (embedded/shared contexts)
- Any holder of a valid beta token can call this endpoint with any document_id
✅ VALIDATION CHECK
No input sanitization or authorization validation before the vulnerable call. The only check is DocumentService.query(id=document_id), which verifies existence, not ownership.
✅ PRIOR REPORTS
No prior reports of this specific IDOR. Related open issues: #6146 "Restricting Download Files" and #6090 "Enhanced Access Control & Security Features".
✅ SECURITY POLICY
SECURITY.md exists but only documents a prior pickle deserialization vulnerability. No mention of IDOR or authorization policies.
✅ RECENT COMMITS
Only one prior commit touched doc.py recently: e705ac6 Add logout (#13796). No prior security fix for this endpoint.
✅ FIX ADEQUACY
Three document download paths exist in the codebase:
1. GET /api/v1/datasets/<dataset_id>/documents/<document_id> (download()): already protected with @token_required + KnowledgebaseService.query()
2. GET /api/v1/documents/<document_id> (download_doc()): ⬅ this is what the fix addresses
3. GET /v1/document/get/<doc_id> (document_app.py:get()): already protected with @login_required
The fix closes the only path that allowed arbitrary document files to be downloaded via IDOR.
Exploit Sketch
Preconditions
- A valid beta token (obtainable from any shared/embedded chatbot URL)
- A target document_id (obtainable from chat references or UUID1 guessing)
Existing Mitigations (Insufficient)
- beta tokens are scoped separately from full API keys, but they still map to a tenant_id
- The sister endpoint download() is already properly protected
Related Issues (Not in Scope of This Fix)
The same missing-authorization pattern exists in other beta-authenticated endpoints (conversation_app.py:getsse(), session.py:chatbot_completions(), session.py:chatbots_inputs()), but those leak metadata rather than raw file content and are separate, lower-impact issues.
Verdict: CONFIRMED_VALID | Confidence: High
Recommendation: Merge this minimal fix; consider follow-up hardening of other beta-authenticated endpoints.
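One possible shape for that follow-up hardening is a small reusable guard, so each beta-authenticated handler does not have to repeat the ownership lookup by hand. The sketch below is only an assumption about how such a guard could look: require_owner, AccessDenied, SESSION_OWNERS, and get_session are hypothetical names that do not exist in the codebase, and the handler signature is simplified to (tenant_id, resource_id).

```python
from functools import wraps


class AccessDenied(Exception):
    """Raised when the caller's tenant does not own the requested resource."""


def require_owner(owner_of):
    """Build a decorator that checks tenant ownership before running a handler.

    owner_of is a hypothetical callable that maps a resource id to the
    tenant_id owning it, or returns None when the resource is unknown.
    """

    def decorator(handler):
        @wraps(handler)
        def wrapper(tenant_id, resource_id, *args, **kwargs):
            owner = owner_of(resource_id)
            # Deny unknown resources and resources owned by another tenant.
            if owner is None or owner != tenant_id:
                raise AccessDenied(resource_id)
            return handler(tenant_id, resource_id, *args, **kwargs)

        return wrapper

    return decorator


# Hypothetical ownership table standing in for a database lookup.
SESSION_OWNERS = {"session-a": "tenant-a", "session-b": "tenant-b"}


@require_owner(SESSION_OWNERS.get)
def get_session(tenant_id, session_id):
    # Runs only after the ownership check has passed.
    return f"session {session_id} for {tenant_id}"
```

Calling get_session("tenant-a", "session-a") succeeds, while get_session("tenant-a", "session-b") raises AccessDenied before the handler body runs. Whether a decorator or an inline check fits better depends on how each endpoint resolves its tenant, which this sketch does not attempt to model.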