Skip to content

fix(core): short-circuit Refine on query_satisfied when structured_answer_filtering is enabled#21398

Open
gautamvarmadatla wants to merge 1 commit into
run-llama:mainfrom
gautamvarmadatla:fix/refine-structured-answer-filtering-early-exit
Open

fix(core): short-circuit Refine on query_satisfied when structured_answer_filtering is enabled#21398
gautamvarmadatla wants to merge 1 commit into
run-llama:mainfrom
gautamvarmadatla:fix/refine-structured-answer-filtering-early-exit

Conversation

@gautamvarmadatla

Copy link
Copy Markdown
Contributor

Description

Added early-exit breaks to both the outer chunk loops and the inner repacked-subchunk loops in get_response, aget_response, and their helper methods. Also updated the helper methods to return (response, query_satisfied) so the outer loops can act on the satisfaction signal and stop processing additional chunks.

Fixes #21397

New Package?

Did I fill in the tool.llamahub section in the pyproject.toml and provide a detailed README.md for my new integration or package?

  • Yes
  • No

Version Bump?

Did I bump the version in the pyproject.toml file of the package I am updating? (Except for the llama-index-core package)

  • Yes
  • No

Type of Change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested?

Your pull-request will likely not be merged unless it is covered by some form of impactful unit testing.

  • I added new unit tests to cover this change
  • I believe this change is already covered by existing unit tests

Suggested Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added Google Colab support for the newly added notebooks.
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I ran uv run make format; uv run make lint to appease the lint gods

@dosubot dosubot Bot added the size:M This PR changes 30-99 lines, ignoring generated files. label Apr 16, 2026
@gautamvarmadatla

Copy link
Copy Markdown
Contributor Author

Ci failing due to intergration tests not core

@VANDRANKI VANDRANKI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fix is correct and complete. Both sync and async paths are updated identically.

Outer loop short-circuit (get_response / aget_response): adding if self._structured_answer_filtering and query_satisfied: break after each chunk's response call is the key fix for issue #21397. Without it, the loop always processes every text chunk even after the query is satisfied.

Inner loop short-circuit (_give_response_single, _refine_response_single): adding if self._structured_answer_filtering: break inside the if query_satisfied: block is also correct - it stops processing additional repacked text sub-chunks within the same outer chunk.

Return type change to Tuple[RESPONSE_TEXT_TYPE, bool]: threading query_satisfied through the return values is the right design. No caller modification needed outside get_response/aget_response since the internal methods are private.

query_satisfied = False moved before the inner for loop: correct improvement. In the original, it was reset to False at the start of each inner iteration, which masked the variable's state between iterations but was functionally harmless since the break fired before the next reset. Moving it before the loop is cleaner and makes the intent explicit.

return response, False for the early-exit path (avail_chunk_size < 0) in _refine_response_single: correct. If we couldn't fit the refine context, we return the unchanged response and query_satisfied=False so the outer loop keeps processing.

No test added: a test that creates a SummaryIndex with multiple documents (one that answers the query, others that should be skipped), enables structured_answer_filtering=True, and asserts that only one LLM call is made would directly verify the short-circuit and prevent regression. The issue's reproducer is self-contained and would work as a parametrized test. Not a blocker.

LGTM.

@VANDRANKI VANDRANKI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thorough fix. All four code paths are covered.

Approach: changing the return type of the four helper methods (, , , ) from to and propagating back to the outer loop is the right design. The outer loop now breaks as soon as a chunk satisfies the query.

Inner loop break: adding inside and when also prevents unnecessary sub-chunk processing within a single text chunk. Good.

**Deleted block in **: the removed block was unreachable dead code in the original: it updated a local variable after the structured response was already returned, with no subsequent iteration that could use the updated value. Removing it is correct and cleans up the logic.

Tests: five tests cover both sync and async paths, the inner sub-chunk loop short-circuit, the outer chunk loop short-circuit, and the negative case where must not short-circuit. Good coverage.

One gap not blocking merge: there is no test for the case where the refine path (not the first response) produces and causes the outer loop to break on a non-first chunk. The existing tests only verify the first-chunk short-circuit. A test with two chunks where the first returns and the second returns would give complete end-to-end coverage of the outer loop break on the refine path.

LGTM. Ready to merge.

@gautamvarmadatla gautamvarmadatla force-pushed the fix/refine-structured-answer-filtering-early-exit branch from 7260ed9 to 547ee1d Compare May 21, 2026 13:31
@dosubot dosubot Bot added size:XS This PR changes 0-9 lines, ignoring generated files. and removed size:M This PR changes 30-99 lines, ignoring generated files. labels May 21, 2026
@gautamvarmadatla

Copy link
Copy Markdown
Contributor Author

same reasons for pipeline failure as mentioned in #21445 comments

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size:XS This PR changes 0-9 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Refine ignores query_satisfied=True when structured_answer_filtering=True

2 participants