Enhanced Langfuse Integration: Trace Text Extraction & User Feedback #19

MrOrz wants to merge 2 commits into
Conversation
…r feedback

- Implement TextExtractionSpanProcessor in Python ADK for cleaner trace readability
- Synchronize Trace ID between backend and frontend via custom metadata
- Integrate Langfuse Web SDK in frontend
- Add upvote/downvote scoring associated with trace ID

Co-authored-by: MrOrz <108608+MrOrz@users.noreply.github.com>
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me.

New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task.
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the observability and user feedback mechanisms for the Cofacts AI system by deepening its integration with Langfuse. It introduces a backend component to simplify trace data by extracting human-readable text from complex JSON structures and ensures trace IDs are consistently propagated to the frontend. On the frontend, users can now directly rate agent responses, with this feedback being captured and associated with specific traces in Langfuse, providing a comprehensive loop for monitoring and improving AI performance.
Code Review
This pull request effectively enhances the Langfuse integration by adding trace text extraction on the backend and user feedback capabilities on the frontend. The backend now includes the `trace_id` in responses, which the frontend correctly uses to link user feedback to specific traces. The code is generally well-structured, but I've identified a couple of opportunities for improvement in the Python backend to enhance code clarity, efficiency, and maintainability.
```python
if not llm_response.custom_metadata:
    llm_response.custom_metadata = {}
```
The condition `if not llm_response.custom_metadata:` evaluates to `True` for both `None` and an empty dictionary. This means that if `custom_metadata` is already an empty dictionary, it will be needlessly reassigned. Using `is None` provides a more precise check for your intent to initialize the dictionary only when it's not yet set.
```diff
-if not llm_response.custom_metadata:
-    llm_response.custom_metadata = {}
+if llm_response.custom_metadata is None:
+    llm_response.custom_metadata = {}
```
```python
modified = False

# Extract input text
input_value = attributes.get("input.value")
if input_value and isinstance(input_value, str):
    try:
        data = json.loads(input_value)
        text = self._extract_text_from_data(data)
        if text:
            span.set_attribute("input.value", text)
            modified = True
    except json.JSONDecodeError:
        pass

# Extract output text
output_value = attributes.get("output.value")
if output_value and isinstance(output_value, str):
    try:
        data = json.loads(output_value)
        text = self._extract_text_from_data(data)
        if text:
            span.set_attribute("output.value", text)
            modified = True
    except json.JSONDecodeError:
        pass
```
The logic for processing `input.value` and `output.value` is duplicated. Additionally, the `modified` variable is assigned but never used. You can improve this by refactoring the logic into a local helper function inside `on_end`. This will remove the code duplication and the unused variable, making the code cleaner and more maintainable.
```diff
-modified = False
-
-# Extract input text
-input_value = attributes.get("input.value")
-if input_value and isinstance(input_value, str):
-    try:
-        data = json.loads(input_value)
-        text = self._extract_text_from_data(data)
-        if text:
-            span.set_attribute("input.value", text)
-            modified = True
-    except json.JSONDecodeError:
-        pass
-
-# Extract output text
-output_value = attributes.get("output.value")
-if output_value and isinstance(output_value, str):
-    try:
-        data = json.loads(output_value)
-        text = self._extract_text_from_data(data)
-        if text:
-            span.set_attribute("output.value", text)
-            modified = True
-    except json.JSONDecodeError:
-        pass
+def _process_attribute(key: str) -> None:
+    value = attributes.get(key)
+    if not (value and isinstance(value, str)):
+        return
+    try:
+        data = json.loads(value)
+        text = self._extract_text_from_data(data)
+        if text:
+            span.set_attribute(key, text)
+    except json.JSONDecodeError:
+        pass
+
+_process_attribute("input.value")
+_process_attribute("output.value")
```
… user feedback

- Implement TextExtractionSpanProcessor in Python ADK for cleaner trace logs.
- Synchronize OpenTelemetry trace IDs from backend to frontend via custom metadata.
- Integrate Langfuse Web SDK in the frontend to capture user feedback (upvotes/downvotes).
- Update service.template.yaml with required frontend environment variables.

Co-authored-by: MrOrz <108608+MrOrz@users.noreply.github.com>
Let's isolate FE and BE changes. I will close this and restart a task that does only the feedback buttons. It seems that feedback would map to individual messages (ADK events), and there can be multiple agent messages in one turn. We may need to provide feedback on specific spans rather than traces.
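For reference, a minimal sketch of what span-level feedback could look like with the Langfuse Web SDK, assuming the backend also exposed each message's span/observation ID to the frontend; the `observationId` plumbing and the function names below are hypothetical, not part of this PR:

```typescript
import { LangfuseWeb } from "langfuse";

// Browser-side client; the public key placeholder would come from frontend config.
const langfuseWeb = new LangfuseWeb({ publicKey: "pk-lf-..." });

// Score one agent message (one ADK event) rather than the whole turn by
// attaching the score to its span via observationId.
function scoreMessageSpan(traceId: string, observationId: string, vote: 1 | -1) {
  langfuseWeb.score({
    traceId,
    observationId, // ties the score to a specific span instead of the whole trace
    name: "user_feedback",
    value: vote,
  });
}
```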
This submission enhances the existing Langfuse integration to improve observability and user engagement.
Key changes:
- A new `TextExtractionSpanProcessor` in the Python ADK. It intercepts OpenTelemetry spans and extracts plain text from the complex JSON structures typically sent as input/output, making the Langfuse session view much easier to browse.
- The backend captures the OpenTelemetry `trace_id` and includes it in the `custom_metadata` of its responses. The frontend extracts this ID and stores it within each chat message, creating a reliable link between the UI and backend traces.
- Upvote/downvote buttons in the `AgentMessage` component. Users can now upvote or downvote agent responses, and these ratings are sent to Langfuse as `user_feedback` scores (1 or -1) associated with the specific trace ID of that message.
- The `langfuse` JS SDK added to the frontend dependencies.

These changes provide a comprehensive feedback loop and significantly better debugging capabilities for the Cofacts AI system.
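A minimal sketch of how the frontend side of this loop might look, assuming the Langfuse Web SDK's `score` call and a message object that carries the trace ID extracted from the backend's `custom_metadata`; the handler and field names are illustrative, not the exact code in this PR:

```typescript
import { LangfuseWeb } from "langfuse";

// Browser-side Langfuse client; only the public key is exposed to the frontend.
const langfuseWeb = new LangfuseWeb({ publicKey: "pk-lf-..." });

// Shape of a chat message after the frontend has copied the trace ID
// out of the backend response's custom_metadata (field name is illustrative).
interface AgentMessage {
  id: string;
  text: string;
  traceId?: string;
}

// Called by the upvote/downvote buttons in the AgentMessage component.
function handleFeedback(message: AgentMessage, vote: "up" | "down") {
  if (!message.traceId) return; // no trace to attach the score to
  langfuseWeb.score({
    traceId: message.traceId,
    name: "user_feedback",
    value: vote === "up" ? 1 : -1, // matches the 1 / -1 scores described above
  });
}
```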
Fixes #14
PR created automatically by Jules for task 15597546857061770299 started by @MrOrz