fix: braintrust masking was over truncating by rguan72 · Pull Request #5458 · onyx-dot-app/onyx

rguan72 · 2025-09-19T17:59:22Z

Description

was masking to 1000 characters (too aggressive)

How Has This Been Tested?

tested locally

Backporting (check the box to trigger backport action)

Note: You have to check that the action passes, otherwise resolve the conflicts manually and tag the patches.

This PR should be backported (make sure to check that the backport attempt succeeds)
[Optional] Override Linear Check

Summary by cubic

Fixes over-aggressive Braintrust tracing masking to preserve useful context while still limiting very large payloads. Increases the mask threshold to 20,000 characters and applies masking only when needed.

Bug Fixes
- Raised masking length to 20,000 chars with an 80/20 head/tail split.
- Added a guard to skip masking for payloads under the threshold.
- Standardized the truncation marker to include original length and target size.

vercel · 2025-09-19T17:59:27Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Preview	Comments	Updated (UTC)
internal-search	Ready	Preview	Comment	Sep 19, 2025 8:06pm

greptile-apps

Greptile Summary

This PR fixes a bug in the Braintrust tracing masking functionality where data was being over-truncated. The original implementation had hardcoded truncation parameters that truncated strings to approximately 1,000 characters (800 head + 200 tail) while claiming to truncate to 10,000 characters.

The fix introduces several improvements:

Configurable masking length: Adds a MASKING_LENGTH constant set to 20,000 characters, making the truncation limit explicit and easily adjustable
Improved truncation logic: The new _truncate_str function allocates 4/5 of the total length to the head (16,000 chars) and 1/5 to the tail (4,000 chars), providing much more context while maintaining the sandwich approach
Better code organization: Extracts the masking decision into a separate _should_mask function, improving readability and making the logic more maintainable
Enhanced transparency: The truncation message now clearly shows the original length and target length, making it easier to understand what was truncated

This change fits into Onyx's broader evaluation and tracing infrastructure, specifically the Braintrust integration used for monitoring and debugging LLM interactions. The tracing system helps track performance and behavior across different spans (root, task, score, LLM, function, generic), and proper data preservation is crucial for effective debugging.

Confidence score: 4/5

This PR is safe to merge with minimal risk as it fixes a clear bug without changing core functionality
Score reflects a straightforward bug fix with improved constants and better code structure
No files require special attention - the change is contained to a single utility function

_{1 file reviewed, 2 comments}

_{Edit Code Review Bot Settings | Greptile}

backend/onyx/evals/tracing.py

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

cubic-dev-ai

1 issue found across 1 file

Prompt for AI agents (all 1 issues)


Understand the root cause of the following 1 issues and fix them.


<file name="backend/onyx/evals/tracing.py">

<violation number="1" location="backend/onyx/evals/tracing.py:24">
Docstring claims span-aware masking that the function does not implement, which is misleading and can confuse users.</violation>
</file>

_{React with 👍 or 👎 to teach cubic. Mention @cubic-dev-ai to give feedback, ask questions, or re-run the review.}

cubic-dev-ai · 2025-09-19T18:06:04Z

backend/onyx/evals/tracing.py

-    if len(data_str) > 10_000:
-        return _truncate_str(data_str)
-    return data
+    """Mask data based on span type. Only mask generic and function spans, not root, task, score, or LLM spans."""


Docstring claims span-aware masking that the function does not implement, which is misleading and can confuse users.

Prompt for AI agents

Address the following comment on backend/onyx/evals/tracing.py at line 24: <comment>Docstring claims span-aware masking that the function does not implement, which is misleading and can confuse users.</comment> <file context> @@ -7,18 +7,24 @@ - if len(data_str) > 10_000: - return _truncate_str(data_str) - return data + """Mask data based on span type. Only mask generic and function spans, not root, task, score, or LLM spans.""" + if not _should_mask(data): + return data </file context>

Suggested change

"""Mask data based on span type. Only mask generic and function spans, not root, task, score, or LLM spans."""

"""Mask data by length; mask only when len(str(data)) exceeds MASKING_LENGTH."""

blacksmith-sh · 2025-09-19T18:44:04Z

3 Jobs Failed:

Run Integration Tests v2 / integration-tests (tests/connector, tests-connector) failed on "Run Integration Tests for tests-connector"

[...]
------------------------------ Captured log setup ------------------------------
INFO     onyx.utils.logger:reset.py:414 Resetting Postgres...
INFO     onyx.utils.logger:reset.py:260 Downgrading Postgres... (1/10)
INFO     onyx.utils.logger:reset.py:286 Upgrading Postgres...
INFO     onyx.utils.logger:reset.py:289 Setting up Postgres...
NOTICE   onyx.utils.logger:setup.py:289 Verifying default connector/credential exist.
NOTICE   onyx.utils.logger:setup.py:295 Loading input prompts and user folders
INFO     onyx.utils.logger:reset.py:416 Resetting Vespa...
NOTICE   onyx.utils.logger:swap_index.py:85 Vespa index swap (attempt 1/10)...
NOTICE   onyx.utils.logger:index.py:239 Deploying Vespa application package to http://index:19071/application/v2/tenant/default/prepareandactivate
NOTICE   onyx.utils.logger:swap_index.py:96 Vespa index swap complete.
NOTICE   onyx.utils.logger:setup.py:258 Setting up Vespa (attempt 1/10)...
NOTICE   onyx.utils.logger:index.py:239 Deploying Vespa application package to http://index:19071/application/v2/tenant/default/prepareandactivate
NOTICE   onyx.utils.logger:setup.py:274 Vespa setup complete.
INFO     onyx.utils.logger:reset.py:418 Resetting FileStore...
=========================== short test summary info ============================
FAILED tests/integration/tests/connector/test_connector_creation.py::test_connector_pause_while_indexing
=================== 1 failed, 4 passed in 300.86s (0:05:00) ====================
Error: Final attempt failed. Child_process exited with error code 1

Run Integration Tests v2 / required failed on "Run actions/github-script@v7"

[...]
  retry-exempt-status-codes: 400,401,403,404,422
env:
  PRIVATE_REGISTRY: experimental-registry.blacksmith.sh:5000
  PRIVATE_REGISTRY_USERNAME: ***
  PRIVATE_REGISTRY_PASSWORD: ***
  OPENAI_API_KEY: ***
  SLACK_BOT_TOKEN: ***
  CONFLUENCE_TEST_SPACE_URL: ***
  CONFLUENCE_USER_NAME: ***
  CONFLUENCE_ACCESS_TOKEN: ***
  JIRA_BASE_URL: ***
  JIRA_USER_EMAIL: ***
  JIRA_API_TOKEN: ***
  PERM_SYNC_SHAREPOINT_CLIENT_ID: ***
  PERM_SYNC_SHAREPOINT_PRIVATE_KEY: ***
  PERM_SYNC_SHAREPOINT_CERTIFICATE_PASSWORD: ***
  PERM_SYNC_SHAREPOINT_DIRECTORY_ID: ***
  GITHUB_REPO_NAME: ***/onyx
Error: One or more upstream jobs failed or were cancelled.

1 job failed running on non-Blacksmith runners.

Summary: 6 successful workflows, 2 failed workflows

✅ Helm - Lint and Test Charts (1 job succeeded)
✅ Python Unit Tests (1 job succeeded)
✅ External Dependency Unit Tests (5 jobs succeeded)
✅ Ensure PR references Linear (1 job succeeded)
❌ Run Integration Tests v2 (34 jobs succeeded, 2 jobs failed)
✅ Quality Checks PR (1 job succeeded)
✅ Python Checks (1 job succeeded)
❌ Connector Tests (1 job failed)

Last updated: 2025-09-19 20:40:42 UTC

backend/onyx/evals/tracing.py

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

.

5aab51e

rguan72 requested a review from a team as a code owner September 19, 2025 17:59

rguan72 requested a review from evan-onyx September 19, 2025 17:59

greptile-apps bot reviewed Sep 19, 2025

View reviewed changes

backend/onyx/evals/tracing.py Show resolved Hide resolved

backend/onyx/evals/tracing.py Outdated Show resolved Hide resolved

Update backend/onyx/evals/tracing.py

09381a4

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

vercel bot deployed to Preview September 19, 2025 18:05 View deployment

cubic-dev-ai bot reviewed Sep 19, 2025

View reviewed changes

evan-onyx approved these changes Sep 19, 2025

View reviewed changes

backend/onyx/evals/tracing.py Show resolved Hide resolved

backend/onyx/evals/tracing.py Outdated Show resolved Hide resolved

rguan72 added 2 commits September 19, 2025 13:01

.

72a1b49

.

cfc10f5

vercel bot deployed to Preview September 19, 2025 20:06 View deployment

rguan72 merged commit 6094f70 into main Sep 19, 2025
50 of 54 checks passed

rguan72 deleted the richard/fix-braintrust-tracing-data-mask branch September 19, 2025 21:10

razvanMiu pushed a commit to eea/danswer that referenced this pull request Oct 16, 2025

fix: braintrust masking was over truncating (onyx-dot-app#5458)

9f6136f

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: braintrust masking was over truncating#5458

fix: braintrust masking was over truncating#5458
rguan72 merged 4 commits intomainfrom
richard/fix-braintrust-tracing-data-mask

rguan72 commented Sep 19, 2025 •

edited

Loading

Uh oh!

vercel bot commented Sep 19, 2025 •

edited

Loading

Uh oh!

greptile-apps bot left a comment

Uh oh!

Uh oh!

Uh oh!

cubic-dev-ai bot left a comment

Uh oh!

cubic-dev-ai bot Sep 19, 2025 •

edited

Loading

Uh oh!

blacksmith-sh bot commented Sep 19, 2025 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

	"""Mask data based on span type. Only mask generic and function spans, not root, task, score, or LLM spans."""
	"""Mask data by length; mask only when len(str(data)) exceeds MASKING_LENGTH."""

Conversation

rguan72 commented Sep 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

How Has This Been Tested?

Backporting (check the box to trigger backport action)

Summary by cubic

Uh oh!

vercel bot commented Sep 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

greptile-apps bot left a comment

Choose a reason for hiding this comment

Greptile Summary

Confidence score: 4/5

Uh oh!

Uh oh!

Uh oh!

cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

cubic-dev-ai bot Sep 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

blacksmith-sh bot commented Sep 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

rguan72 commented Sep 19, 2025 •

edited

Loading

vercel bot commented Sep 19, 2025 •

edited

Loading

cubic-dev-ai bot Sep 19, 2025 •

edited

Loading

blacksmith-sh bot commented Sep 19, 2025 •

edited

Loading