DX-118394: Add configurable row and byte limits to RunSqlQuery by aniket-s-kulkarni · Pull Request #97 · dremio/dremio-mcp

aniket-s-kulkarni · 2026-04-08T15:02:29Z

Summary

Adds max_result_rows (default 500) and max_result_bytes (default 200 KB) fields to the Dremio settings model, configurable via YAML and env vars (DREMIOAI_DREMIO__MAX_RESULT_ROWS, DREMIOAI_DREMIO__MAX_RESULT_BYTES).
Adds run_query_capped() to sql.py, which fetches at most max_result_rows rows from Dremio (0 = unlimited). All existing callers of run_query/get_results are unaffected.
Updates RunSqlQuery.invoke() to use run_query_capped(), enforce a byte cap via per-row JSON counting, and return structured truncation metadata when limits fire. The response is unchanged when no truncation occurs.
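For readers without the diff at hand, the behaviour summarized above could look roughly like the sketch below. It is not the code from this PR: load_limits, cap_result, the "rows" key, and the truncation_reason strings are hypothetical names chosen for illustration, "200 KB" is assumed to mean 200 * 1024 bytes, and total_rows is assumed to count the rows handed to the function. Only the env var names and the metadata field names come from the PR text.

```python
import itertools
import json
import os

DEFAULT_MAX_ROWS = 500
DEFAULT_MAX_BYTES = 200 * 1024  # assumption: "200 KB" read as 204,800 bytes


def load_limits(env=os.environ):
    """Read both limits from the env vars named in the summary (hypothetical helper)."""
    max_rows = int(env.get("DREMIOAI_DREMIO__MAX_RESULT_ROWS", DEFAULT_MAX_ROWS))
    max_bytes = int(env.get("DREMIOAI_DREMIO__MAX_RESULT_BYTES", DEFAULT_MAX_BYTES))
    return max_rows, max_bytes


def cap_result(rows, max_rows, max_bytes):
    """Apply the row cap, then the byte cap; a limit of 0 disables that cap."""
    rows = list(rows)
    total = len(rows)

    # Row cap: keep at most max_rows rows.
    kept = rows if max_rows == 0 else list(itertools.islice(rows, max_rows))
    reason = "max_result_rows" if len(kept) < total else None

    # Byte cap: count each row's JSON-encoded size and stop before overflowing.
    if max_bytes:
        used, fitted = 0, []
        for row in kept:
            size = len(json.dumps(row, default=str).encode("utf-8"))
            if used + size > max_bytes:
                reason = "max_result_bytes"
                break
            used += size
            fitted.append(row)
        kept = fitted

    # Metadata is attached only when a limit fired, so the untruncated
    # response keeps its original shape.
    response = {"rows": kept}
    if reason is not None:
        response.update(
            truncated=True,
            total_rows=total,
            returned_rows=len(kept),
            truncation_reason=reason,
        )
    return response
```

With both limits set to 0 the function returns only the rows, which mirrors the "response is unchanged when no truncation occurs" guarantee.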

Test plan

JIRA

https://dremio.atlassian.net/browse/DX-118394

Reviewer Verdict

APPROVE (after trivial fixes: itertools.chain bug in run_query_capped, MCP return type annotation, and updated mock targets in 2 existing tests)

🤖 Generated with Claude Code

Add max_result_rows (default 500) and max_result_bytes (default 200 KB) settings to Dremio model. Add run_query_capped() to sql.py that fetches at most max_result_rows rows. Update RunSqlQuery.invoke() to use run_query_capped(), enforce byte cap, and return structured truncation metadata (truncated, total_rows, returned_rows, truncation_reason) when limits are hit. Both limits can be disabled by setting to 0.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

ssaumitra

This functionality already exists in the Dremio query engine and REST API. I'd ask that we use that instead.

If we use the limit parameter in the Job Results API (https://docs.dremio.com/current/reference/api/job/job-results/), the limits will be applied at the query plan level. The query planner will optimize the query to read fewer Parquet files, so the Dremio query execution cost will be lower, and the dollar and compute cost for the end user will be lower as well. If we instead apply the limit in the MCP server layer, the Dremio query engine will run the full-cost query and only trim the output afterwards, so the user pays the full dollar cost of the entire query and receives truncated results.

Implementing the same functionality twice in two different services will also end up confusing end users.

We already call the Job Results API over here, so the MCP integration will also be straightforward.
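As a rough illustration of the suggestion, a request to the Job Results API with a limit could be formed as in the sketch below. The helper job_results_url and the example host are hypothetical, and the /api/v3/job/{id}/results path with offset and limit query parameters is my reading of the linked documentation, so it should be checked against the Dremio version in use.

```python
from urllib.parse import quote, urlencode


def job_results_url(base_url, job_id, limit, offset=0):
    """Build a Job Results URL that asks for at most `limit` rows (hypothetical helper)."""
    # Assumed parameter names: offset and limit, per the linked Job Results docs.
    query = urlencode({"offset": offset, "limit": limit})
    return f"{base_url.rstrip('/')}/api/v3/job/{quote(job_id, safe='')}/results?{query}"


# Example with a made-up host and job id.
url = job_results_url("https://dremio.example.com/", "abc-123", limit=100)
```

The sketch only builds the URL; authentication headers and the HTTP call itself are left to whatever client the MCP server already uses for this endpoint.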

ssaumitra requested changes Jun 10, 2026

aniket-s-kulkarni wants to merge 1 commit into main from DX-118394-runsqlquery-result-limits
