Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature/enhance-rag-agent-response-api #1972

Open
wants to merge 15 commits into
base: main
Choose a base branch
from
Open

feature/enhance-rag-agent-response-api #1972

wants to merge 15 commits into from

Conversation

emrgnt-cmplxty
Copy link
Contributor

@emrgnt-cmplxty emrgnt-cmplxty commented Feb 12, 2025

Important

Enhances RAG agent response API with streaming support, dynamic citation handling, and improved search strategies, alongside comprehensive testing.

  • Streaming Enhancements:
    • Introduces mock_rag_sse_generator_with_relabeling() and mock_rag_sse_generator() for simulating SSE streams with citation relabeling.
    • Adds CitationRelabeler class for dynamic citation index reassignment.
    • Implements SSE event models in responses.py for SearchResultsEvent, MessageEvent, CitationEvent, FinalAnswerEvent, ToolCallEvent, and ToolResultEvent.
  • Search and Citation Handling:
    • Updates extract_citations() and reassign_citations_in_order() in base_utils.py to handle multiple references within brackets.
    • Enhances map_citations_to_collector() to map citations to search results accurately.
    • Adds num_sub_queries to SearchSettings for HyDE and RAG Fusion strategies.
  • Testing:
    • Adds integration tests in test_agent.py and test_retrieval.py for agent responses and search functionalities.
    • Includes unit tests in test_citations.py and test_rag_stream.py to validate citation extraction, relabeling, and SSE streaming logic.

This description was created by Ellipsis for 74625bb. It will automatically update as commits are pushed.

@emrgnt-cmplxty emrgnt-cmplxty changed the title Dev feature/enhance-rag-agent-response-api Feb 13, 2025
@emrgnt-cmplxty emrgnt-cmplxty marked this pull request as ready for review February 13, 2025 22:02
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❌ Changes requested. Incremental review on 74625bb in 2 minutes and 5 seconds

More details
  • Looked at 71 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 5 drafted comments based on config settings.
1. py/tests/integration/test_agent.py:10
  • Draft comment:
    Ensure citations list is non-empty before accessing index [0].
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50%
    This is a test file where we're explicitly testing that the API returns citations in a specific order. If citations were empty, we would want the test to fail. Adding a null check would mask the real issue if the API stopped returning citations. The assertion error message "Expected first citation to first doc" makes it clear that citations are expected.
    The comment raises a valid point about defensive programming. In production code, this kind of null check would be good practice.
    However, this is test code that is specifically designed to verify the presence and order of citations. Adding null checks would make the test less effective at catching real issues.
    Delete the comment because it suggests adding defensive programming to a test that is explicitly designed to fail if citations are missing.
2. py/tests/integration/test_agent.py:20
  • Draft comment:
    Ensure citations list is non-empty before accessing index [0].
  • Reason this comment was not posted:
    Marked as duplicate.
3. py/tests/integration/test_agent.py:4
  • Draft comment:
    The function name 'test_agentic_citations_0' might be a typo. Consider renaming if 'agentic' was unintentional.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50%
    None
4. py/tests/integration/test_agent.py:10
  • Draft comment:
    Directly indexing citations[0] can raise an IndexError if the list is empty. Consider asserting the list's length before indexing.
  • Reason this comment was not posted:
    Marked as duplicate.
5. py/tests/integration/test_agent.py:58
  • Draft comment:
    Remove or replace the print statement; consider using a logger to avoid cluttering test output.
  • Reason this comment was not posted:
    Marked as duplicate.

Workflow ID: wflow_DKwIEXfEO3q70sXc


Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

assistant_msg = response.results.messages[-1]
assert "Socrates" in assistant_msg.content

print("response.results.messages = ", response.results.messages)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove debug print; tests should not rely on console output.

Suggested change
print("response.results.messages = ", response.results.messages)

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❌ Changes requested. Reviewed everything up to 3c63139 in 3 minutes and 53 seconds

More details
  • Looked at 5351 lines of code in 24 files
  • Skipped 1 files when reviewing.
  • Skipped posting 6 drafted comments based on config settings.
1. py/tests/unit/test_rag_stream.py:155
  • Draft comment:
    The approach of scanning the entire partial_text_buffer for bracket references may lead to overlapping detections. Consider maintaining a pointer to avoid reprocessing parts of the buffer.
  • Reason this comment was not posted:
    Comment looked like it was already resolved.
2. py/tests/unit/test_rag_stream.py:668
  • Draft comment:
    The expected order of new bracket labels in test_sse_relabeling_multiple_distinct_brackets is tightly coupled to implementation details. Consider clarifying the intended mapping strategy.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 0% vs. threshold = 50%
    The comment raises a valid point about test expectations being coupled to implementation. However, the test docstring already acknowledges this explicitly and the test verifies the actual behavior rather than enforcing a specific mapping strategy. The comment doesn't suggest a clear actionable improvement beyond what's already addressed in the docstring.
    The comment identifies a real issue about test/implementation coupling. Maybe there should be more explicit documentation about the intended mapping strategy.
    The test already handles this well by explicitly acknowledging the implementation dependency in the docstring and testing the actual behavior rather than enforcing a specific strategy. The docstring serves as the clarification the comment is asking for.
    Delete the comment since the test docstring already provides the clarification being requested and the test appropriately verifies actual behavior rather than enforcing a specific mapping strategy.
3. py/shared/api/models/retrieval/responses.py:137
  • Draft comment:
    The example for 'generated_answer' is wrapped in curly braces (as a set) instead of a plain string. It should be a simple string like "The capital of France is Paris."
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
4. py/shared/utils/base_utils.py:450
  • Draft comment:
    Use 'if label is not None' rather than 'if label != None' for an idiomatic Python null check.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
5. py/tests/unit/test_rag_stream.py:137
  • Draft comment:
    Duplicate definition of helper function _format_sse_event found (see also line 223). Consider consolidating into a single definition.
  • Reason this comment was not posted:
    Marked as duplicate.
6. py/r2r/r2r.toml:27
  • Draft comment:
    The tools configuration was changed to only include 'content'. Please confirm if the removal of other tools (like 'local_search') is intentional.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 0% vs. threshold = 50%
    This comment violates one of the key rules - it's asking the author to confirm their intention. The change is clear and deliberate, with the old configuration even being preserved in comments. The author clearly meant to make this change, and asking for confirmation adds no value.
    Maybe the removal of local_search could have serious implications for the system's functionality that should be discussed?
    If there were serious implications, that would be a different comment pointing out specific issues. Simply asking for confirmation doesn't add value.
    Delete this comment as it merely asks for confirmation of an intentional change, which violates our commenting rules.

Workflow ID: wflow_PPYAzygkr45P5Rz2


Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

"object": "rag.final_answer",
"generated_answer": partial_buffer,
"citations": [], # would be your mapped citations
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function _format_sse_event is defined twice. Consolidate its definition to avoid inconsistencies.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants