pass through various id's and log them in the model server for better… #4485

Merged
merged 2 commits into from
Apr 10, 2025

Conversation

rkuo-danswer
Contributor

… tracking

Description

Fixes DAN-1785.
https://linear.app/danswer/issue/DAN-1785/create-request-ids-during-indexing-and-handle-in-model-server

How Has This Been Tested?

[Describe the tests you ran to verify your changes]

Backporting (check the box to trigger backport action)

Note: You have to check that the action passes, otherwise resolve the conflicts manually and tag the patches.

  • This PR should be backported (make sure to check that the backport attempt succeeds)
  • [Optional] Override Linear Check

vercel bot commented Apr 9, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

| Name | Status | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| internal-search | ✅ Ready (Inspect) | Visit Preview | 💬 Add feedback | Apr 9, 2025 6:23pm |

@rkuo-danswer rkuo-danswer marked this pull request as ready for review April 9, 2025 01:17
@rkuo-danswer rkuo-danswer requested a review from a team as a code owner April 9, 2025 01:17
Contributor

@greptile-apps greptile-apps bot left a comment

PR Summary

This PR introduces enhanced traceability by passing unique request and tenant IDs throughout the indexing and model server flow, with structured IDs added for better correlation across logs and metrics.

  • backend/onyx/background/indexing/run_indexing.py: Adds attempt_id, request_id, and structured_id to IndexAttemptMetadata and generates new IDs per batch.
  • backend/ee/onyx/server/middleware/tenant_tracking.py: Implements multi-method tenant ID extraction but has a logic flaw in _get_tenant_id_from_request’s finally block.
  • backend/onyx/indexing/indexing_pipeline.py: Passes tenant_id and request_id when invoking embed_chunks_with_failure_handling.
  • backend/onyx/indexing/embedder.py: Updates embed_chunks signature to accept tenant_id/request_id; note fallback title embedding lacks tenant_id.
  • Additional updates (middleware, chunker, models): Ensure consistent tenant/request ID propagation to support improved logging and observability.
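The propagation pattern summarized above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `BatchContext`, `new_batch_context`, `embed_chunks_sketch`, and the colon-joined `structured_id` format are all hypothetical names and assumptions, since the summary does not show the actual signatures or ID format.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class BatchContext:
    """Hypothetical container for the IDs that travel with one indexing batch."""

    tenant_id: str
    attempt_id: int
    batch_num: int
    request_id: str

    @property
    def structured_id(self) -> str:
        # Assumed format: the real structured_id layout is not given in the summary.
        return f"{self.tenant_id}:{self.attempt_id}:{self.batch_num}"


def new_batch_context(tenant_id: str, attempt_id: int, batch_num: int) -> BatchContext:
    """Create a context with a fresh request ID for each batch."""
    return BatchContext(
        tenant_id=tenant_id,
        attempt_id=attempt_id,
        batch_num=batch_num,
        request_id=uuid.uuid4().hex,
    )


def embed_chunks_sketch(
    chunks: list[str],
    *,
    tenant_id: str | None,
    request_id: str | None,
) -> list[str]:
    """Stand-in for an embedding call: returns log lines tagged with both IDs.

    A real implementation would forward the IDs to the model server
    (for example as request fields or headers) and log them there.
    """
    prefix = f"[tenant={tenant_id} request={request_id}]"
    return [f"{prefix} embedded {len(chunk)} chars" for chunk in chunks]
```

Passing the IDs as explicit keyword arguments, rather than reading them from global state, keeps every call site honest about which tenant and request a batch belongs to, which is the kind of gap the review flags for the fallback title embedding.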

10 file(s) reviewed, 1 comment(s)

@rkuo-danswer rkuo-danswer added this pull request to the merge queue Apr 10, 2025
Merged via the queue into main with commit 3fc8027 Apr 10, 2025
10 of 11 checks passed
@rkuo-danswer rkuo-danswer deleted the feature/tenant-to-embedding-server branch April 10, 2025 01:21
tim-dim pushed a commit to grantgpteu/grantgpt-dev that referenced this pull request Apr 10, 2025
onyx-dot-app#4485)

* pass through various id's and log them in the model server for better tracking

* fix test

---------

Co-authored-by: Richard Kuo (Onyx) <[email protected]>

2 participants