Skip to content

feat: token-aware output truncation for large tool responses#26

Merged
siddhant3030 merged 3 commits into
mainfrom
feat/token-aware-truncation
Jun 8, 2026
Merged

feat: token-aware output truncation for large tool responses#26
siddhant3030 merged 3 commits into
mainfrom
feat/token-aware-truncation

Conversation

@siddhant3030

Copy link
Copy Markdown
Collaborator

Summary

  • Adds src/dalgo_mcp/truncate.py with truncate_log_text() and truncate_list()
  • Applies log truncation to dalgo_get_flow_run_logs (head+tail, 100 lines default)
  • Truncated responses include _meta with total/shown/omitted counts and a note

Why

Pipeline logs can be megabytes. Without truncation, a single dalgo_get_flow_run_logs call can consume most of the context window, leaving no room for the model to reason about the result.

Test plan

  • dalgo_get_flow_run_logs on a large log returns max 100 lines
  • _meta.truncated=true and _meta.omitted count present when truncated
  • Short logs (< 100 lines) returned in full with _meta.truncated=false
  • truncate_log_text("") handles empty input gracefully

Closes #19
🤖 Generated with Claude Code

Adds truncate.py with truncate_log_text() and truncate_list() helpers.
Applies log truncation to dalgo_get_flow_run_logs. Closes #19.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@siddhant3030

Copy link
Copy Markdown
Collaborator Author

@claude can we review this PR?

@claude

claude Bot commented May 27, 2026

Copy link
Copy Markdown

Claude finished @siddhant3030's task in 1m 43s —— View job


Claude PR Review

  • Read changed files (truncate.py, pipelines.py)
  • Inspect surrounding context
  • Analyze correctness, edge cases, naming, and consistency
  • Post review findings

Nice, well-scoped change. The head+tail strategy is the right call for logs, and emitting _meta so the model can reason about omitted content is a good shape. A few correctness issues and some inconsistencies are worth tightening before merge.


🐛 Bugs / correctness

1. max_lines=1 (or any case where half == 0) doesn't truncate the tail. src/dalgo_mcp/truncate.py:25

half = max_lines // 2          # 0 when max_lines == 1
tail = lines[-half:]           # lines[-0:] == lines[0:] == ALL lines

In Python, -0 == 0, so lines[-0:] returns the entire list. With max_lines=1 and a 1000-line input, tail becomes all 1000 lines — opposite of the intent. Unlikely in production (default is 100), but it's a footgun if a caller ever passes a small max_lines. Guard with tail = lines[-half:] if half else [].

2. Off-by-one when max_lines is odd. src/dalgo_mcp/truncate.py:23-39

For max_lines=99, half=49, so actual shown lines = 2 * 49 = 98, but _meta reports "shown": 99 and "omitted": len(lines) - 99. The note string ("first 49 and last 49") is correct, but the numeric fields disagree with it. Fix: compute shown = 2 * half and omitted = len(lines) - shown, or force max_lines to be even up front.

3. _meta shape varies between code paths. Empty input returns {"truncated": False} with no total_lines; the short-but-non-empty path returns {"truncated": False, "total_lines": N}. Inconsistent — set total_lines: 0 in the empty branch so callers can rely on the key always being present. src/dalgo_mcp/truncate.py:17

4. Broad except Exception: pass silently defeats the feature. src/dalgo_mcp/tools/pipelines.py:127-129

except Exception:
    pass
return format_response(resp)

If anything goes wrong inside the truncation block (e.g., MemoryError building truncated_text from a multi-MB string, an unexpected JSON shape, etc.), we fall through to format_response(resp) and return the raw, untruncated payload — exactly the context-flooding scenario this PR is supposed to prevent. At minimum, log the exception so failures are debuggable; better, narrow the except to expected types (ValueError, KeyError, TypeError).

5. dict-branch can clobber an upstream _meta field and silently double-stringifies lists. src/dalgo_mcp/tools/pipelines.py:122-126

elif isinstance(data, dict) and "logs" in data:
    result = truncate_log_text(str(data["logs"]))
    data["logs"] = result["content"]
    data["_meta"] = result["_meta"]
  • If data already has a _meta key from the upstream API, you overwrite it. Use a different namespace (e.g., _truncation) or merge.
  • str(data["logs"]) produces Python repr() if data["logs"] is a list, e.g., "['line1', 'line2']" — not the joined log text. The list branch above does "\n".join(...) for exactly this reason; the dict branch should do the same when data["logs"] is a list.

🤔 Design / framing

6. "Token-aware" vs. line-aware. The PR title and module docstring say "token-aware," but the implementation counts lines, not tokens. A 100-line log of stack traces can still be tens of thousands of tokens. Two options:

  • Rename to "output truncation" / "line-based truncation" to match what it actually does, or
  • Add a per-line character cap (or a total-bytes cap) so a pathological single huge line can't blow the budget. A simple max_chars parameter alongside max_lines would cover most real cases without pulling in tiktoken.

7. truncate_list is unused. Dead code at merge time. Either:

  • Apply it to a list-returning tool in this PR (e.g., dalgo_get_pipeline_run_history, dalgo_list_pipelines), or
  • Hold it back for the follow-up PR that actually uses it. Adding unused helpers makes the surface area harder to evolve.

8. Inconsistent response shape across input types. dalgo_get_flow_run_logs returns three different JSON shapes depending on whether the upstream payload is a str, list, or dict:

  • str/list path → {"content": ..., "_meta": ...}
  • dict path → original dict with logs overwritten and _meta added

Models calling this tool will have to handle both. Consider normalizing to one shape (e.g., always {"logs": ..., "_meta": ..., ...rest}).


🧹 Nits

  • src/dalgo_mcp/tools/pipelines.py:105-106import json and from dalgo_mcp.truncate import truncate_log_text are inside the function body. The rest of the file uses top-of-file imports; move these up for consistency.
  • The note text "Full logs available in Dalgo UI." is helpful but assumes the model/user knows what "Dalgo UI" means and how to get there. If there's a deep-link pattern (e.g., <base>/runs/<id>), surfacing the URL would be more actionable.
  • Test plan checkboxes in the PR body are unchecked, and the repo has no test suite yet (per CLAUDE.md). This module is unusually test-friendly (pure functions, no I/O) — a small tests/test_truncate.py would lock in the edge cases above and would be the natural first test file for the repo.

✅ What's good

  • Head+tail strategy preserves the most useful parts of a log (start of run + error tail).
  • _meta includes total_lines, shown, omitted, and a human-readable note — exactly what a model needs to decide whether to fetch more.
  • Tool docstring at pipelines.py:99-100 calls out the truncation behavior so it's discoverable.
  • Only applied to the one tool that demonstrably needs it — no over-application.

If you want, I can push fixes for items 1–5 (the bugs) and item 7 (pull truncate_list out until it's used) in a follow-up commit on this branch — just say the word.
• Branch: feat/token-aware-truncation

Resolves conflict in src/dalgo_mcp/tools/pipelines.py by combining:
- adapt_context() pattern from PR #33 (replaces register(app, get_client) signature)
- Token-aware log truncation in dalgo_get_flow_run_logs from this PR

Both sets of changes are preserved: all tool files now use adapt_context()
and dalgo_get_flow_run_logs still truncates large log responses using
truncate_log_text() from src/dalgo_mcp/truncate.py.

Also applies ruff formatting fixes to context.py, docs.py, and truncate.py.

https://claude.ai/code/session_011MeXGRNe15ksjbwsDM8ULN
@siddhant3030 siddhant3030 force-pushed the feat/token-aware-truncation branch from 74ebf36 to 034d838 Compare May 28, 2026 10:44
…cation

# Conflicts:
#	src/dalgo_mcp/server.py
#	src/dalgo_mcp/tools/pipelines.py
@siddhant3030 siddhant3030 merged commit fbbdaff into main Jun 8, 2026
0 of 3 checks passed
@siddhant3030 siddhant3030 deleted the feat/token-aware-truncation branch June 8, 2026 08:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Token-aware output truncation for large tool responses

2 participants