
[Bug]: Ingestion of URL with query parameters is not rendered or indexed correctly in the UI / OpenSearch #1645

@mpawlow

OpenRAG Version

0.5.0

Deployment Method

Local development (make dev)

Operating System

Ubuntu 24.04.4 LTS

Python Version

3.13.13

Affected Area

Ingestion (document processing, upload, Docling)

Bug Description

Ingestion of URL with query parameters is not rendered or indexed correctly in the UI / OpenSearch

Steps to Reproduce

  1. Go to Chat
  2. Enter prompt: "Ingest this URL: https://httpbin.org/get?project=openrag&page=1"
    • URL is ingested successfully (let's call it X)
    • BUG: Knowledge page shows the ingested document as "Untitled source"
    • A Knowledge search for a string from X's content (1-6a0d2067-3f69f59a6ddd2c160e4afe22) does return the document in the search results
  3. Enter prompt: "Ingest this URL: https://httpbin.org/get?project=openrag&page=2"
    • URL is ingested successfully (let's call it Y)
    • BUG: Knowledge page shows the ingested document as "Untitled source"
    • A Knowledge search for a string from Y's content (1-6a0d2143-3cb67aab4f37500724ffcad0) does return the document in the search results
    • BUG: Knowledge UI does not show the two pages as distinct entries
      • The "Untitled source" name is used for both X and Y, but only 1 entry shows up in the table
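
To make the collision concrete, here is a minimal sketch using only the Python standard library. It is not taken from OpenRAG's code and is not a diagnosis of the root cause; `key_without_query` and `key_with_query` are hypothetical helpers that only illustrate how any identity key that ignores the query string would merge X and Y into one entry.

```python
from urllib.parse import urlsplit

url_x = "https://httpbin.org/get?project=openrag&page=1"
url_y = "https://httpbin.org/get?project=openrag&page=2"


def key_without_query(url: str) -> str:
    """Hypothetical identity key built from scheme, host, and path only."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}{parts.path}"


def key_with_query(url: str) -> str:
    """Hypothetical identity key that also keeps the query string."""
    parts = urlsplit(url)
    key = f"{parts.scheme}://{parts.netloc}{parts.path}"
    return f"{key}?{parts.query}" if parts.query else key


# X and Y differ only in their query component.
collides = key_without_query(url_x) == key_without_query(url_y)
distinct = key_with_query(url_x) != key_with_query(url_y)
```

With these two URLs, `collides` and `distinct` are both `True`: the only difference between X and Y is `page=1` versus `page=2`.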

Expected Behavior

  • Treat ingested URLs with query parameters as distinct resources
  • URL-ingested documents should get unique, descriptive names rather than "Untitled source"
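
One possible way to meet both expectations, as a rough sketch only: derive a stable ID from the full URL (query string included) and fall back to a URL-based display name when no page title is available. The function names `normalize_url`, `document_id`, and `display_name` are assumptions made for illustration and do not refer to existing OpenRAG functions.

```python
import hashlib
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit


def normalize_url(url: str) -> str:
    """Lowercase scheme and host, keep the path, and sort query parameters."""
    parts = urlsplit(url)
    query = urlencode(sorted(parse_qsl(parts.query, keep_blank_values=True)))
    base = f"{parts.scheme.lower()}://{parts.netloc.lower()}{parts.path or '/'}"
    return f"{base}?{query}" if query else base


def document_id(url: str) -> str:
    """Stable ID derived from the normalized URL, query string included."""
    return hashlib.sha256(normalize_url(url).encode("utf-8")).hexdigest()[:16]


def display_name(url: str, page_title: Optional[str] = None) -> str:
    """Prefer the page title; otherwise fall back to host + path + query."""
    if page_title and page_title.strip():
        return page_title.strip()
    parts = urlsplit(url)
    name = f"{parts.netloc}{parts.path}"
    return f"{name}?{parts.query}" if parts.query else name
```

Sorting the query parameters makes `?a=1&b=2` and `?b=2&a=1` map to the same document. That is a design choice and may be wrong for servers where parameter order matters, in which case the sort can be dropped.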

Actual Behavior

  • Ingested URLs with query parameters are rendered as only 1 resource (in the UI)
  • URL ingested documents use the same default name: Untitled source

Relevant Logs

N/A

Screenshots

[Two screenshots attached]

Additional Context

N/A

Checklist

  • I have searched existing issues to ensure this bug hasn't been reported before.
  • I have provided all the requested information.

Metadata

Labels

bug: 🔴 Something isn't working.
