fix: braintrust masking was over truncating#5458
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
There was a problem hiding this comment.
Greptile Summary
This PR fixes a bug in the Braintrust tracing masking functionality where data was being over-truncated. The original implementation had hardcoded truncation parameters that truncated strings to approximately 1,000 characters (800 head + 200 tail) while claiming to truncate to 10,000 characters.
The fix introduces several improvements:
-
Configurable masking length: Adds a
MASKING_LENGTHconstant set to 20,000 characters, making the truncation limit explicit and easily adjustable -
Improved truncation logic: The new
_truncate_strfunction allocates 4/5 of the total length to the head (16,000 chars) and 1/5 to the tail (4,000 chars), providing much more context while maintaining the sandwich approach -
Better code organization: Extracts the masking decision into a separate
_should_maskfunction, improving readability and making the logic more maintainable -
Enhanced transparency: The truncation message now clearly shows the original length and target length, making it easier to understand what was truncated
This change fits into Onyx's broader evaluation and tracing infrastructure, specifically the Braintrust integration used for monitoring and debugging LLM interactions. The tracing system helps track performance and behavior across different spans (root, task, score, LLM, function, generic), and proper data preservation is crucial for effective debugging.
Confidence score: 4/5
- This PR is safe to merge with minimal risk as it fixes a clear bug without changing core functionality
- Score reflects a straightforward bug fix with improved constants and better code structure
- No files require special attention - the change is contained to a single utility function
1 file reviewed, 2 comments
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
There was a problem hiding this comment.
1 issue found across 1 file
Prompt for AI agents (all 1 issues)
Understand the root cause of the following 1 issues and fix them.
<file name="backend/onyx/evals/tracing.py">
<violation number="1" location="backend/onyx/evals/tracing.py:24">
Docstring claims span-aware masking that the function does not implement, which is misleading and can confuse users.</violation>
</file>
React with 👍 or 👎 to teach cubic. Mention @cubic-dev-ai to give feedback, ask questions, or re-run the review.
backend/onyx/evals/tracing.py
Outdated
| if len(data_str) > 10_000: | ||
| return _truncate_str(data_str) | ||
| return data | ||
| """Mask data based on span type. Only mask generic and function spans, not root, task, score, or LLM spans.""" |
There was a problem hiding this comment.
Docstring claims span-aware masking that the function does not implement, which is misleading and can confuse users.
Prompt for AI agents
Address the following comment on backend/onyx/evals/tracing.py at line 24:
<comment>Docstring claims span-aware masking that the function does not implement, which is misleading and can confuse users.</comment>
<file context>
@@ -7,18 +7,24 @@
- if len(data_str) > 10_000:
- return _truncate_str(data_str)
- return data
+ """Mask data based on span type. Only mask generic and function spans, not root, task, score, or LLM spans."""
+ if not _should_mask(data):
+ return data
</file context>
| """Mask data based on span type. Only mask generic and function spans, not root, task, score, or LLM spans.""" | |
| """Mask data by length; mask only when len(str(data)) exceeds MASKING_LENGTH.""" |
|
3 Jobs Failed: Run Integration Tests v2 / integration-tests (tests/connector, tests-connector) failed on "Run Integration Tests for tests-connector"Run Integration Tests v2 / required failed on "Run actions/github-script@v7"1 job failed running on non-Blacksmith runners. Summary: 6 successful workflows, 2 failed workflows
Last updated: 2025-09-19 20:40:42 UTC |
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Description
was masking to 1000 characters (too aggressive)
How Has This Been Tested?
tested locally
Backporting (check the box to trigger backport action)
Note: You have to check that the action passes, otherwise resolve the conflicts manually and tag the patches.
Summary by cubic
Fixes over-aggressive Braintrust tracing masking to preserve useful context while still limiting very large payloads. Increases the mask threshold to 20,000 characters and applies masking only when needed.