**Bug Description**
- When evaluating query responses using TruLlama with feedback functions (e.g., groundedness, answer relevance, and context relevance), printing or processing the evaluation feedback objects (such as those obtained via `get_feedback_result`) raises a `RecursionError`. The underlying pydantic models used in TruLlama appear to contain circular references, which cause the default `__repr__` to recurse indefinitely (a minimal illustration of this failure mode follows).
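For illustration, here is a minimal, self-contained sketch of the suspected failure mode using plain pydantic v2 (not TruLens code): a self-referential model whose default `__repr__` recurses until the interpreter's limit is hit.

```python
# Minimal sketch of the suspected failure mode (hypothetical example,
# not TruLens code): a cyclic pydantic model breaks the default __repr__.
from typing import Optional

from pydantic import BaseModel


class Node(BaseModel):
    name: str
    parent: Optional["Node"] = None


a = Node(name="a")
b = Node(name="b", parent=a)
a.parent = b  # assignment is not validated by default, so the cycle is allowed

repr(a)  # raises RecursionError: maximum recursion depth exceeded
```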
**To Reproduce**
- Set up a RetrieverQueryEngine and initialize TruLlama with feedback functions.
- Run a sample query and capture the evaluation feedback by calling `get_feedback_result` on the last feedback record.
- Attempt to print the feedback object, which triggers the error.
A minimal reproducible example:
```python
from trulens.core import TruSession, Feedback
from trulens.apps.llamaindex import TruLlama
from trulens.providers.openai import OpenAI
from trulens.dashboard.display import get_feedback_result
import numpy as np

# Assume query_engine is already initialized and configured
session = TruSession()
session.reset_database()

provider = OpenAI(api_key="YOUR_OPENAI_API_KEY")
context = TruLlama.select_context(query_engine)

f_groundedness = (
    Feedback(provider.groundedness_measure_with_cot_reasons, name="Groundedness")
    .on(context.collect())
    .on_output()
)
f_answer_relevance = Feedback(
    provider.relevance_with_cot_reasons, name="Answer Relevance"
).on_input_output()
f_context_relevance = (
    Feedback(provider.context_relevance_with_cot_reasons, name="Context Relevance")
    .on_input()
    .on(context)
    .aggregate(np.mean)
)

tru_query_engine_recorder = TruLlama(
    query_engine,
    app_name="Example_App",
    app_version="base",
    feedbacks=[f_groundedness, f_answer_relevance, f_context_relevance],
)

with tru_query_engine_recorder as recording:
    response = query_engine.query("example query")

last_record = recording.records[-1]

# This line triggers the RecursionError when printing the feedback object
get_feedback_result(last_record, "Context Relevance")
```
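Until the underlying `__repr__` is fixed, a guard like the following keeps a script alive when printing such objects; `safe_print` is a hypothetical helper written for this report, not part of the TruLens API.

```python
# Hypothetical defensive helper: fall back to a placeholder when an
# object's __repr__ exhausts the recursion limit.
def safe_print(label: str, obj: object) -> None:
    try:
        print(f"{label}: {obj!r}")
    except RecursionError:
        print(f"{label}: <{type(obj).__name__} with recursive __repr__, omitted>")
```

For example: `safe_print("feedback", get_feedback_result(last_record, "Context Relevance"))`.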
- Below is a minimal snippet containing the functions and code blocks that seem to trigger the recursion error. It includes the key parts of the `query_model` function, specifically the statements that print pydantic objects using their default `repr`, which appears to be the source of the infinite recursion:
```python
import reprlib  # used in the attempted workaround described below
import numpy as np

from trulens.core import TruSession, Feedback
from trulens.apps.llamaindex import TruLlama
from trulens.providers.openai import OpenAI
from trulens.dashboard.display import get_feedback_result

# Assume query_engine is already defined and initialized elsewhere

def query_model(user_query):
    max_attempts = 3
    attempt = 0
    while attempt < max_attempts:
        # First query execution
        response = query_engine.query(user_query)
        print("\n" + "=" * 50)
        print("DEBUG: result of query_engine.query()")
        print(f"response type: {type(response)}")
        # The direct print below triggers __repr__ on a pydantic model
        # and can cause a RecursionError.
        print(f"response content: {response}")
        print("=" * 50 + "\n")

        # Initialize a new TruLens session and collect feedback
        session = TruSession()
        session.reset_database()
        provider = OpenAI(api_key="YOUR_OPENAI_API_KEY")
        context = TruLlama.select_context(query_engine)

        f_groundedness = (
            Feedback(provider.groundedness_measure_with_cot_reasons, name="Groundedness")
            .on(context.collect())
            .on_output()
        )
        f_answer_relevance = Feedback(
            provider.relevance_with_cot_reasons, name="Answer Relevance"
        ).on_input_output()
        f_context_relevance = (
            Feedback(provider.context_relevance_with_cot_reasons, name="Context Relevance")
            .on_input()
            .on(context)
            .aggregate(np.mean)
        )

        tru_query_engine_recorder = TruLlama(
            query_engine,
            app_name="Example_App",
            app_version="base",
            feedbacks=[f_groundedness, f_answer_relevance, f_context_relevance],
        )

        with tru_query_engine_recorder as recording:
            # Re-run the query within the feedback context
            response = query_engine.query(user_query)
            print("\n" + "=" * 50)
            print("DEBUG: result of re-running query_engine.query()")
            print(f"response type: {type(response)}")
            # Again, printing the full response here may trigger recursion in __repr__
            print(f"response content: {response}")
            print("=" * 50 + "\n")

        last_record = recording.records[-1]
        get_feedback_result(last_record, "Context Relevance")

        # Printing the leaderboard directly may also trigger recursion
        leaderboard_df = session.get_leaderboard()
        print("Leaderboard:", leaderboard_df)

        # ... additional logic for checking feedback values and refining the query ...
        attempt += 1

    return "⚠️ Sorry, no answer could be found."
```
Note: the culprit lines above are the print statements that directly print the response and the leaderboard. They invoke `__repr__` on pydantic objects, which, due to circular references, leads to a `RecursionError`.
**Expected behavior**
- The evaluation feedback should be processed and printed (or otherwise used) without raising a `RecursionError`. The `__repr__` implementation for these feedback objects should handle circular references safely, or such objects should be omitted from debug output (one possible shape for such a fix is sketched below).
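One possible shape for such a fix, sketched as a generic mixin under the assumption that cycle detection during rendering is acceptable (`SafeReprMixin` is illustrative, not an existing TruLens class):

```python
# Illustrative cycle-aware __repr__ (not TruLens code): objects currently
# being rendered are tracked by id, and back-references collapse to a
# placeholder instead of recursing. Not thread-safe as written.
class SafeReprMixin:
    _repr_in_progress: set[int] = set()

    def __repr__(self) -> str:
        cls = type(self).__name__
        if id(self) in SafeReprMixin._repr_in_progress:
            return f"{cls}(...)"  # cycle detected: stop descending
        SafeReprMixin._repr_in_progress.add(id(self))
        try:
            fields = ", ".join(f"{k}={v!r}" for k, v in vars(self).items())
            return f"{cls}({fields})"
        finally:
            SafeReprMixin._repr_in_progress.discard(id(self))
```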
**Relevant Logs/Tracebacks**
File "/home/jovyan/.venv/torch2.2.2-py3.11-cuda12.1/lib/python3.11/site-packages/pydantic/_internal/_repr.py", line 61, in __repr_str__
return join_str.join(repr(v) if a is None else f'{a}={v!r}' for a, v in self.__repr_args__())
RecursionError: maximum recursion depth exceeded

**Environment:**
- OS: [e.g., Ubuntu Linux, Windows, etc.]
- Python Version: 3.11.0rc1
- TruLens version: 1.4.4
- Other relevant libraries:
  - pydantic version: 2.10.6
  - llama_index version: 0.12.22
**Additional context**
- I attempted to work around this issue by summarizing debug output with `reprlib.repr()`, but that only hides the problem rather than resolving it (as the short example below shows). A proper fix might involve modifying the `__repr__` methods of the involved pydantic models, or providing a safe debug-output option for feedback objects that avoids infinite recursion. Any help or guidance on a proper resolution would be greatly appreciated.
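For reference, here is a small runnable sketch (reusing the hypothetical `Node` model from above, not TruLens code) of why `reprlib.repr()` merely masks the bug: it catches the exception raised inside the object's own `__repr__` and substitutes an opaque placeholder, so the cycle remains and all field detail is lost.

```python
import reprlib
from typing import Optional

from pydantic import BaseModel


class Node(BaseModel):
    name: str
    parent: Optional["Node"] = None


a = Node(name="a")
a.parent = a  # self-cycle, as in the sketch under "Bug Description"

# reprlib catches the RecursionError raised inside __repr__ and falls
# back to an opaque placeholder, so the bug is hidden, not fixed:
print(reprlib.repr(a))  # e.g. "<Node instance at 0x7f...>"
```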