feat: Universal JWT support for Docling Serve — auto-forward Authorization header in DoclingRemoteComponent #1721

@coderabbitai

Overview

OpenRAG already uses a universal JWT for all OpenSearch calls (SaaS/IBM auth mode). However, the Docling Serve Langflow component (DoclingRemoteComponent in flows/components/docling_remote.py) does not automatically forward the JWT as an Authorization: Bearer <token> header when making requests to Docling Serve.

Requested by @edwinjosechittilappilly in PR #1717 comment.


Current State

✅ Direct upload path — already handled

submit_to_docling() in src/services/langflow_file_service.py already accepts a jwt_token parameter and forwards it as auth_header to upload_to_docling_direct_async():

task_id = await self.docling_service.upload_to_docling_direct_async(
    filename, content, user_id=owner, auth_header=jwt_token
)

❌ Langflow flow component — JWT not automatically injected

DoclingRemoteComponent in flows/components/docling_remote.py currently relies on a manual api_headers TableInput for custom headers. There is no automatic injection of the JWT from the Langflow global variable system (i.e., the JWT variable passed via X-Langflow-Global-Var-JWT) when the ingestion flow runs inside Langflow.

By contrast, flows/components/opensearch_multimodal.py already supports a dedicated JWT auth mode (with jwt_token, jwt_header, and bearer_prefix inputs) and reads the JWT from the global variable passed via X-Langflow-Global-Var-JWT.
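
To illustrate the header-to-variable mapping mentioned above, here is a simplified model of how a request header such as X-Langflow-Global-Var-JWT could become a global variable named JWT. This is not Langflow's implementation: the function name global_vars_from_headers, the case-insensitive prefix match, and the upper-casing of the variable name are assumptions made for the example.

```python
# Assumed prefix, written in lower case so matching can be case-insensitive.
GLOBAL_VAR_PREFIX = "x-langflow-global-var-"


def global_vars_from_headers(request_headers):
    """Collect global variables carried in request headers.

    Hypothetical helper: a header named X-Langflow-Global-Var-<NAME>
    yields a variable called <NAME> (upper-cased by assumption).
    """
    variables = {}
    for name, value in request_headers.items():
        lowered = name.lower()
        if lowered.startswith(GLOBAL_VAR_PREFIX):
            var_name = lowered[len(GLOBAL_VAR_PREFIX):].upper()
            variables[var_name] = value
    return variables
```

Under this model, a component input bound to the JWT global variable would receive the token sent with the run request, which is the value the Docling component currently never sees.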


Problem Statement

When the ingestion flow is executed via Langflow (rather than via the direct service path), the Docling Serve component does not authenticate with the Docling Serve instance using the user's JWT. This means:

  • Docling Serve deployed behind an auth proxy (e.g., IBM SaaS) will reject unauthenticated requests.
  • The JWT available as a Langflow global variable (JWT) is never forwarded to Docling Serve.

Proposed Solution

Update DoclingRemoteComponent in flows/components/docling_remote.py to:

  1. Add a dedicated JWT input (similar to opensearch_multimodal.py) that reads from the Langflow global variable JWT:

    • Add a StrInput for jwt_token (with global_field=True or wired to the global JWT variable) — marked advanced/optional.
    • When set, automatically add it as an Authorization: Bearer <jwt_token> header on requests sent to Docling Serve.
  2. Merge with existing api_headers — the manual headers table should continue to work; the JWT should be merged in, with explicit headers taking precedence.

  3. Update the flow JSONs (flows/ingestion_flow.json, flows/openrag_url_mcp.json, flows/openrag_url_n.json) to wire the global JWT variable into the new Docling JWT input field.

Sketch

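# In DoclingRemoteComponent:
def _process_headers(self) -> dict[str, str]:
    headers: dict[str, str] = {}
    # 1. Inject the JWT if provided (accepts a bare or "Bearer "-prefixed token)
    jwt = (getattr(self, 'jwt_token', None) or '').strip()
    if jwt:
        bare = jwt.removeprefix('Bearer ').strip()
        headers['Authorization'] = f'Bearer {bare}'
    # 2. Merge manual api_headers last so they take precedence.
    #    _process_manual_headers() is a placeholder for the api_headers table parsing.
    headers.update(self._process_manual_headers())
    return headers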
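
One detail the sketch above leaves open is header-name casing. HTTP header names are case-insensitive, so after a plain dict.update() a manual row named authorization would sit next to the injected Authorization key instead of replacing it. A minimal standalone sketch of a case-insensitive merge follows, assuming the manual headers are available as a plain dict; merge_auth_headers is a hypothetical helper name, not existing code in the component.

```python
def merge_auth_headers(jwt_token, manual_headers):
    """Build request headers; manual headers override the injected JWT.

    Hypothetical helper for illustration only.
    """
    headers = {}
    token = (jwt_token or "").strip()
    if token:
        # Accept either a bare token or one already prefixed with "Bearer ".
        bare = token.removeprefix("Bearer ").strip()
        if bare:
            headers["Authorization"] = f"Bearer {bare}"
    for name, value in (manual_headers or {}).items():
        # Drop any existing key that differs only by case, so the manual
        # value replaces the injected one instead of duplicating it.
        for existing in [k for k in headers if k.lower() == name.lower()]:
            del headers[existing]
        headers[name] = value
    return headers
```

Whether the component needs this depends on how the HTTP client treats keys that differ only by case; if it already normalizes header names, the simpler update() call may be enough.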

Acceptance Criteria

  • DoclingRemoteComponent automatically sets Authorization: Bearer <jwt> when jwt_token is provided (via global var or explicit input).
  • Existing api_headers TableInput continues to work and takes precedence over the auto-injected JWT header.
  • flows/ingestion_flow.json wires the global JWT variable to the new JWT input on the Docling Serve component.
  • URL ingestion flow JSONs (openrag_url_mcp.json, openrag_url_n.json) are updated similarly.
  • Unit test added to validate that DoclingRemoteComponent._process_headers() correctly injects the Authorization header when jwt_token is set.
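
The unit-test criterion could be covered along these lines. This is a sketch only: FakeDoclingComponent is a hypothetical stand-in that copies the logic from the sketch in this issue so the test runs without Langflow installed; a real test would import DoclingRemoteComponent and set jwt_token and api_headers on it instead.

```python
import unittest


class FakeDoclingComponent:
    """Hypothetical stand-in mirroring the sketched _process_headers()."""

    def __init__(self, jwt_token=None, manual_headers=None):
        self.jwt_token = jwt_token
        self._manual = manual_headers or {}

    def _process_manual_headers(self):
        # Placeholder for parsing the api_headers table.
        return dict(self._manual)

    def _process_headers(self):
        headers = {}
        jwt = (getattr(self, "jwt_token", None) or "").strip()
        if jwt:
            bare = jwt.removeprefix("Bearer ").strip()
            headers["Authorization"] = f"Bearer {bare}"
        headers.update(self._process_manual_headers())
        return headers


class ProcessHeadersTest(unittest.TestCase):
    def test_injects_bearer_header(self):
        headers = FakeDoclingComponent(jwt_token="abc")._process_headers()
        self.assertEqual(headers, {"Authorization": "Bearer abc"})

    def test_does_not_double_prefix(self):
        headers = FakeDoclingComponent(jwt_token="Bearer abc")._process_headers()
        self.assertEqual(headers["Authorization"], "Bearer abc")

    def test_no_jwt_means_no_auth_header(self):
        headers = FakeDoclingComponent(jwt_token="  ")._process_headers()
        self.assertNotIn("Authorization", headers)

    def test_manual_header_takes_precedence(self):
        component = FakeDoclingComponent(
            jwt_token="abc", manual_headers={"Authorization": "Basic xyz"}
        )
        self.assertEqual(component._process_headers()["Authorization"], "Basic xyz")
```

Saved as a test module, the cases can be run with python -m unittest.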

References

  • flows/components/docling_remote.py: Docling Serve Langflow component (needs update)
  • flows/components/opensearch_multimodal.py: reference implementation for JWT auth in a Langflow component
  • submit_to_docling() in src/services/langflow_file_service.py: direct path already handles JWT
  • build_ibm_opensearch_vars() in src/utils/langflow_headers.py: shows how JWT global vars are built
  • PR #1717: fix: Update ingestion flow configuration to support group ACL
