Improvement: Enhance ask_line tool by adding PR review comment threads as context #1687

benedict-lee · 2025-04-09T15:09:51Z

User description

Background.
When using the ask_line tool, conversations from existing author and reviewers are not included in the context.

Changes

Implemented functionality to collect entire conversation history from code review threads
- Applied hierarchical thread search algorithm that includes sibling comments and conversation threads
- Maintained natural conversation flow through chronological sorting
Enhanced context awareness by including conversation history in AI prompts
- Enabled continuous conversations with preserved question-answer flow

PR Review thread example

Prompt example

PR Type

Enhancement, Tests

Description

Added functionality to retrieve and process review thread comments.
Enhanced AI prompts with conversation history for better context.
Updated configuration to enable conversation history usage.
Improved PR line question handling with review thread integration.

Changes walkthrough 📝

Relevant files

Enhancement

git_provider.py `Add methods for review comment and thread retrieval` pr_agent/git_providers/git_provider.py Added abstract methods for retrieving review comments and threads. Introduced methods to fetch review comment by ID and thread comments. Enhanced support for handling review threads in Git provider interface.	+9/-0
github_provider.py `Implement GitHub-specific review thread comment handling` pr_agent/git_providers/github_provider.py Implemented methods to fetch review comments and threads using PyGitHub. Added logic to retrieve ancestors, descendants, and sibling comments. Enabled chronological sorting of thread comments. Included error handling for comment retrieval failures.	+124/-0
pr_line_questions.py `Integrate conversation history into PR line questions` pr_agent/tools/pr_line_questions.py Added logic to load conversation history from review threads. Integrated conversation history into AI prompt variables. Enhanced error handling for conversation history loading. Introduced settings for enabling conversation history usage.	+75/-0
pr_line_questions_prompts.toml `Update AI prompts to include conversation history` pr_agent/settings/pr_line_questions_prompts.toml Updated AI prompt template to include conversation history. Added conditional block for displaying previous discussions. Enhanced prompt clarity with context from review threads.	+9/-0

Configuration changes

configuration.toml `Add configuration for conversation history usage` pr_agent/settings/configuration.toml Added `use_conversation_history` setting under `pr_questions`. Enabled configuration for conversation history usage.	+1/-0

Need help?
Type /help how to ... in the comments thread for any questions about Qodo Merge usage.
Check out the documentation for more information.

qodo-merge-pro-for-open-source · 2025-04-09T15:10:21Z

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review Performance Concern The implementation fetches all PR comments multiple times when processing thread comments. Consider caching the result of `self.pr.get_comments()` to avoid redundant API calls, especially for PRs with many comments. all_comments = list(self.pr.get_comments()) for comment in all_comments: if comment.id == comment_id: return comment return None except Exception as e: get_logger().warning(f"Failed to get review comment {comment_id}, error: {e}") return None def get_review_id_by_comment_id(self, comment_id: int): """ Finds the review ID that a comment belongs to based on its comment ID. Args: comment_id: Review comment ID Returns: Review ID or None (if not found) """ try: comment = self.get_review_comment_by_id(comment_id) if comment: return getattr(comment, 'pull_request_review_id', None) return None except Exception as e: get_logger().warning(f"Failed to get review ID for comment {comment_id}, error: {e}") return None def get_review_thread_comments(self, comment_id: int): """ Retrieves all comments in the thread that a specific comment belongs to. Args: comment_id: Review comment ID Returns: List of comments in the same thread """ try: # Get comment information comment = self.get_review_comment_by_id(comment_id) if not comment: return [] # get all comments all_comments = list(self.pr.get_comments()) Error Handling The conversation history loading function has nested try-except blocks that may mask specific errors. Consider refining the error handling to provide more granular error messages and better debugging information. try: comment_id = get_settings().get('comment_id', '') file_path = get_settings().get('file_name', '') line_number = get_settings().get('line_end', '') # return if no comment id or file path and line number if not (comment_id or (file_path and line_number)): return # initialize conversation history conversation_history = [] if hasattr(self.git_provider, 'get_review_thread_comments') and comment_id: try: # get review thread comments thread_comments = self.git_provider.get_review_thread_comments(comment_id) # current question id (this question is excluded from the context) current_question_id = comment_id # generate conversation history from the comments for comment in thread_comments: # skip empty comments body = getattr(comment, 'body', '') if not body or not body.strip(): continue # except for current question if current_question_id and str(comment.id) == str(current_question_id): continue # remove the AI command (/ask etc) from the beginning of the comment (optional) clean_body = body if clean_body.startswith('/'): clean_body = clean_body.split('\n', 1)[-1] if '\n' in clean_body else '' if not clean_body.strip(): continue # author info user = comment.user author = user.login if hasattr(user, 'login') else 'Unknown' # confirm if the author is the current user (AI vs user) is_ai = 'bot' in author.lower() or '[bot]' in author.lower() role = 'AI' if is_ai else 'User' # append to the conversation history conversation_history.append(f"{role} ({author}): {clean_body}") # transform the conversation history to a string if conversation_history: self.vars["conversation_history"] = "\n\n".join(conversation_history) get_logger().info(f"Loaded {len(conversation_history)} comments from the code review thread") else: self.vars["conversation_history"] = "" except Exception as e: get_logger().warning(f"Failed to get review thread comments: {e}") self.vars["conversation_history"] = "" except Exception as e: get_logger().error(f"Error loading conversation history: {e}") self.vars["conversation_history"] = ""

qodo-merge-pro-for-open-source · 2025-04-09T15:10:56Z

PR Code Suggestions ✨

Latest suggestions up to 9c06b6b

Category	Suggestion	Impact
Possible issue	Validate line number format The code checks if parameters are missing by using `all()`, but doesn't validate that `line_number` is a valid integer. Since `line_number` comes from `get_settings().get('line_end', '')`, it could be an empty string which would pass the check but cause errors when used for comparison. pr_agent/tools/pr_line_questions.py [111-113] # early return if any required parameter is missing -if not all([comment_id, file_path, line_number]): +if not comment_id or not file_path or not line_number: return +# ensure line_number is an integer +try: + line_number = int(line_number) +except (ValueError, TypeError): + get_logger().warning(f"Invalid line number: {line_number}") + return + Apply this suggestion Suggestion importance[1-10]: 9 __ Why: This suggestion identifies a critical issue where line_number might be an empty string that passes the existence check but would cause errors when used for comparison or conversion. The improved code properly validates and converts line_number to an integer, preventing runtime errors.	High
Possible issue	Handle missing line number The code doesn't handle the case where `line_number` is `None`, which could happen if the comment doesn't have a "line" or "original_line" attribute. This would cause the filter to fail silently and potentially return incorrect results. pr_agent/git_providers/github_provider.py [452-459] # Get all comments all_comments = list(self.pr.get_comments()) # Filter comments on the same line of the same file -thread_comments = [ - c for c in all_comments - if c.path == file_path and (c.raw_data.get("line") == line_number or c.raw_data.get("original_line") == line_number) -] +thread_comments = [] +if file_path and line_number is not None: + thread_comments = [ + c for c in all_comments + if c.path == file_path and (c.raw_data.get("line") == line_number or c.raw_data.get("original_line") == line_number) + ] Apply this suggestion Suggestion importance[1-10]: 8 __ Why: The suggestion addresses a potential bug where line_number could be None, causing silent failures in the filtering logic. This is a valid concern as the code extracts line_number from comment.raw_data which might not contain the expected keys.	Medium
Possible issue	Avoid circular import dependency The code imports `GithubProvider` directly from github_provider module, but this creates a circular import issue since pr_line_questions.py is already imported in other modules. Use the more generic approach of checking provider type with isinstance() without the explicit import. pr_agent/tools/pr_line_questions.py [61-63] -# currently only supports GitHub provider -if get_settings().pr_questions.use_conversation_history and isinstance(self.git_provider, GithubProvider): +# check if conversation history is enabled and provider supports it +if get_settings().pr_questions.use_conversation_history and hasattr(self.git_provider, 'get_review_thread_comments'): self._load_conversation_history() Apply this suggestion Suggestion importance[1-10]: 8 __ Why: The suggestion correctly identifies a potential circular import issue. Replacing the explicit GithubProvider import with a more generic check using hasattr() is a better approach that avoids dependency problems while maintaining the same functionality.	Medium
Possible issue	Fix incorrect return type The method signature indicates it returns `list[dict]` but it actually returns a list of comment objects, not dictionaries. This mismatch between the declared return type and actual return value could cause type errors when using the function. pr_agent/git_providers/github_provider.py [432-441] -def get_review_thread_comments(self, comment_id: int) -> list[dict]: +def get_review_thread_comments(self, comment_id: int) -> list: """ Retrieves all comments in the same line as the given comment. Args: comment_id: Review comment ID Returns: - List of comments on the same line + List of comment objects on the same line """ Apply this suggestion Suggestion importance[1-10]: 7 __ Why: The suggestion correctly identifies a type annotation mismatch. The method signature declares it returns list[dict] but actually returns a list of comment objects. This fix improves type safety and prevents potential confusion or errors for users of this API.	Medium
Update

Previous suggestions

✅ Suggestions up to commit b53d277

Category	Suggestion	Impact
Possible issue	Prevent infinite recursion The recursive `find_descendants` function can cause a stack overflow for deeply nested comment threads. It's missing a check for circular references in the reply chain, which could lead to infinite recursion. pr_agent/git_providers/github_provider.py [503-511] # Recursively find all descendant comments (collect all replies to the comment) -def find_descendants(cid): +def find_descendants(cid, visited=None): + if visited is None: + visited = set() + if cid in visited: + return [] # Prevent infinite recursion + visited.add(cid) descendants = [] for c in all_comments: if getattr(c, 'in_reply_to_id', None) == cid: descendants.append(c.id) - descendants.extend(find_descendants(c.id)) + descendants.extend(find_descendants(c.id, visited)) return descendants Suggestion importance[1-10]: 9 __ Why: This suggestion addresses a critical issue that could cause stack overflow errors in production. Adding a visited set to track already processed comments prevents infinite recursion in case of circular references in the comment chain, which is a significant improvement for stability.	High
	Fix AI author identification The current method to identify AI comments is too simplistic and may misclassify users with 'bot' in their username as AI. This could lead to incorrect conversation history formatting and confuse the AI model about who said what. pr_agent/tools/pr_line_questions.py [149-151] # confirm if the author is the current user (AI vs user) -is_ai = 'bot' in author.lower() or '[bot]' in author.lower() +# Get the bot username from settings or use a more robust identification method +bot_username = getattr(get_settings(), 'bot_username', None) +is_ai = (bot_username and author == bot_username) or '[bot]' in author role = 'AI' if is_ai else 'User' Suggestion importance[1-10]: 7 __ Why: The suggestion improves the accuracy of identifying AI comments by using a more robust method that relies on configuration rather than simple string matching. This prevents misclassification of users with 'bot' in their username and improves the quality of conversation history.	Medium
	✅ ~~Fix ID comparison logic~~ Suggestion Impact: The commit addressed the same issue as the suggestion but implemented a different solution. Instead of converting both IDs to strings, the commit directly compares the IDs without any type conversion, assuming they are already of the same type. code diff: - if current_question_id and str(comment.id) == str(current_question_id): - continue + # except for current question + if current_question_id and comment.id == current_question_id: The code compares string and integer IDs without proper type conversion, which could lead to missed matches. The comparison `str(comment.id) ==` `str(current_question_id)` assumes both values can be converted to strings, but if `current_question_id` is None or not a valid ID format, this could cause errors. pr_agent/tools/pr_line_questions.py [123-135] # current question id (this question is excluded from the context) current_question_id = comment_id # generate conversation history from the comments for comment in thread_comments: # skip empty comments body = getattr(comment, 'body', '') if not body or not body.strip(): continue # except for current question - if current_question_id and str(comment.id) == str(current_question_id): + if current_question_id and current_question_id.isdigit() and int(current_question_id) == comment.id: continue `[Suggestion has been applied]` Suggestion importance[1-10]: 6 __ Why: The suggestion improves the robustness of the ID comparison by properly handling type conversion and validation. This prevents potential errors when comparing comment IDs of different types and ensures correct filtering of the current question from conversation history.	Low
Learned best practice	Add null safety checks when traversing parent-child relationships to prevent potential runtime errors The current implementation assumes that `in_reply_to_map` always contains valid entries and doesn't check if `parent_id` is None before using it as the new `current` value. This could lead to runtime errors if the data structure is inconsistent. Add a null check to ensure `parent_id` is not None before continuing the loop. pr_agent/git_providers/github_provider.py [502-510] # Recursively find all ancestor comments (collect comment's ancestors) def find_ancestors(cid): ancestors = [] current = cid while current in in_reply_to_map: parent_id = in_reply_to_map[current] + if parent_id is None: + break ancestors.append(parent_id) current = parent_id return ancestors Suggestion importance[1-10]: 6	Low

Co-authored-by: Prateek <[email protected]>

ofir-frd

@benedict-lee ,

Thank you for this significant contribution.
I've made a couple of review comments on the code changes.
While reviewing, I noticed that the current version contains an error that prevents it from running. Please take a look at it.

pr_agent/git_providers/git_provider.py

pr_agent/git_providers/github_provider.py

pr_agent/tools/pr_line_questions.py

Co-authored-by: ofir-frd <[email protected]>

benedict-lee · 2025-04-10T13:06:55Z

@ofir-frd
I apologize for sending a PR with embarrassing code at midnight yesterday.
In truth, I quickly implemented it using vibecoding and sent the PR without sufficient in-depth review. I've realized once again that even when using vibecoding, we must thoroughly examine AI-generated code.
There was absolutely no other intention behind it, so I hope there's no misunderstanding.

Since I haven't been working with Python for long (PR agent was my first introduction to Python), my refactored code might still be lacking.
I'm sorry, but please check it again and provide feedback.

ofir-frd

@benedict-lee
Thank you for your contribution.

The code logic was improved significantly and it now works.
I do have a couple more comments to reduce API calls, code complexity and readability.
Please have a look.

ofir-frd · 2025-04-20T05:29:43Z

pr_agent/settings/pr_line_questions_prompts.toml

@@ -49,10 +49,9 @@ Previous discussion on this code:
 ======
 {{ conversation_history|trim }}
 ======
-Consider both the previous review comments from authors and reviewers, as well as any previous questions and answers about this code. The "Previous Question" and "Previous AI Answer" show earlier interactions about the same code. Use this context to provide a more informed and consistent answer.
+Use this prior discussion context to provide a consistent and informed answer.


Please make this more informative by describing the conversation structure and how the model should handle it.

ofir-frd · 2025-04-20T05:31:15Z

pr_agent/tools/pr_line_questions.py

@@ -16,7 +16,7 @@
 from pr_agent.git_providers.git_provider import get_main_pr_language
 from pr_agent.log import get_logger
 from pr_agent.servers.help import HelpMessage
-
+from pr_agent.git_providers.github_provider import GithubProvider


Please place this import in order with the others, next to the from pr_agent.git_providers... imports.

ofir-frd · 2025-04-20T05:38:10Z

pr_agent/tools/pr_line_questions.py

@@ -103,70 +101,41 @@ async def run(self):

    def _load_conversation_history(self):


This function currently introduces unnecessary side effects, making the code harder to read and follow. Let's build the conversation string inside the function and return it, so the run function can handle its output.

Suggested change

def _load_conversation_history(self):

def _load_conversation_history(self) -> str:

ofir-frd · 2025-04-20T05:39:23Z

pr_agent/tools/pr_line_questions.py

        except Exception as e:
-            get_logger().error(f"Error loading conversation history: {e}")
-            self.vars["conversation_history"] = ""
+            get_logger().error(f"Error processing conversation history, error: {e}")


Let's return an empty string in case the code crashes.

ofir-frd · 2025-04-20T05:44:11Z

pr_agent/git_providers/github_provider.py

-                sibling_ids = find_all_descendants_of_ancestor(ancestor_id)
-                thread_ids.update(sibling_ids)
+            # Get all comments
+            all_comments = list(self.pr.get_comments())


Here, you're collecting all PR comments. You can expect to find the given comment_id within this list, making the comment = self.pr.get_comment(comment_id) code line redundant and reducing the number of API calls by half, from 2 to 1.

ofir-frd · 2025-04-20T06:07:48Z

pr_agent/tools/pr_line_questions.py

+        if not all([comment_id, file_path, line_number]):
+            return


Lets log this

ofir-frd · 2025-04-20T06:10:59Z

pr_agent/git_providers/github_provider.py

+            thread_comments = [
+                c for c in all_comments 
+                if c.path == file_path and (c.raw_data.get("line") == line_number or c.raw_data.get("original_line") == line_number)
+            ]


This is too general; let's work with comment IDs instead of non-unique attributes.
Hint: look for in_reply_to_id

ofir-frd · 2025-04-20T06:22:00Z

pr_agent/tools/pr_line_questions.py

-                    self.vars["conversation_history"] = ""
+            # transform and save conversation history
+            if conversation_history:
+                self.vars["conversation_history"] = "\n\n".join(conversation_history)


Please improve the sense of order in this string, for example, count the comments.

…ine numbers

benedict-lee · 2025-04-21T08:21:26Z

@ofir-frd Thank you for your kind review! I've implemented all the refactoring changes you suggested.
Throughout the process of improving the code logic, I learned a lot about minimizing API calls, eliminating side effects, and applying consistent code patterns.
Thanks to your review, the code quality and readability have been significantly enhanced, and I'll continue to apply these patterns in the future. Really appreciate your valuable feedback!

ofir-frd

@benedict-lee

Looking good!
Last short round of comments.

pr_agent/git_providers/github_provider.py

Co-authored-by: ofir-frd <[email protected]>

ofir-frd

Thanks @benedict-lee !
Code approved!

improve ask_line tool(add conversation history context)

b53d277

qodo-merge-pro-for-open-source bot added the Review effort 3/5 label Apr 9, 2025

mrT23 requested a review from ofir-frd April 9, 2025 18:14

Update pr_agent/tools/pr_line_questions.py

8952459

Co-authored-by: Prateek <[email protected]>

ofir-frd requested changes Apr 10, 2025

View reviewed changes

benedict-lee and others added 3 commits April 10, 2025 19:41

refactor: Add GitHub provider check for conversation history

6bf093a

Co-authored-by: ofir-frd <[email protected]>

refactor: Validate all required parameters before proceeding

c5165d9

Co-authored-by: ofir-frd <[email protected]>

Apply PR review feedback: Code style and functionality improvements

9c06b6b

benedict-lee requested a review from ofir-frd April 14, 2025 04:04

ofir-frd requested changes Apr 20, 2025

View reviewed changes

benedict-lee added 4 commits April 21, 2025 16:28

Improve comment thread retrieval by using in_reply_to_id instead of l…

a434d0a

…ine numbers

Reorganize imports according to Python conventions

e11c2e1

Improve conversation history handling and prompts for line questions

8b4bf49

Improve conversation history formatting with numbered comments

9906ec3

ofir-frd requested changes Apr 23, 2025

View reviewed changes

pr_agent/git_providers/github_provider.py Outdated Show resolved Hide resolved

pr_agent/git_providers/github_provider.py Outdated Show resolved Hide resolved

pr_agent/git_providers/github_provider.py Outdated Show resolved Hide resolved

benedict-lee and others added 3 commits April 24, 2025 11:21

Improve get_review_thread_comments method implementation

29d4fe5

Co-authored-by: ofir-frd <[email protected]>

mprove get_review_thread_comments method implementation

ddb94ec

Co-authored-by: ofir-frd <[email protected]>

mprove get_review_thread_comments method implementation

c35942c

Co-authored-by: ofir-frd <[email protected]>

qodo-ai deleted a comment from qodo-merge-pro-for-open-source bot Apr 24, 2025

ofir-frd approved these changes Apr 24, 2025

View reviewed changes

ofir-frd merged commit 5d5b572 into qodo-ai:main Apr 24, 2025
2 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improvement: Enhance ask_line tool by adding PR review comment threads as context #1687

Improvement: Enhance ask_line tool by adding PR review comment threads as context #1687

benedict-lee commented Apr 9, 2025 •

edited by qodo-merge-pro-for-open-source bot

Loading

qodo-merge-pro-for-open-source bot commented Apr 9, 2025

qodo-merge-pro-for-open-source bot commented Apr 9, 2025 •

edited

Loading

ofir-frd left a comment

benedict-lee commented Apr 10, 2025

ofir-frd left a comment

ofir-frd Apr 20, 2025

ofir-frd Apr 20, 2025

ofir-frd Apr 20, 2025

ofir-frd Apr 20, 2025

ofir-frd Apr 20, 2025

ofir-frd Apr 20, 2025

ofir-frd Apr 20, 2025 •

edited

Loading

ofir-frd Apr 20, 2025

benedict-lee commented Apr 21, 2025

ofir-frd left a comment

ofir-frd left a comment

		@@ -103,70 +101,41 @@ async def run(self):

		def _load_conversation_history(self):

	def _load_conversation_history(self):
	def _load_conversation_history(self) -> str:

Improvement: Enhance ask_line tool by adding PR review comment threads as context #1687

Improvement: Enhance ask_line tool by adding PR review comment threads as context #1687

Conversation

benedict-lee commented Apr 9, 2025 • edited by qodo-merge-pro-for-open-source bot Loading

User description

PR Type

Description

Changes walkthrough 📝

qodo-merge-pro-for-open-source bot commented Apr 9, 2025

PR Reviewer Guide 🔍

qodo-merge-pro-for-open-source bot commented Apr 9, 2025 • edited Loading

PR Code Suggestions ✨

Previous suggestions

ofir-frd left a comment

Choose a reason for hiding this comment

benedict-lee commented Apr 10, 2025

ofir-frd left a comment

Choose a reason for hiding this comment

ofir-frd Apr 20, 2025

Choose a reason for hiding this comment

ofir-frd Apr 20, 2025

Choose a reason for hiding this comment

ofir-frd Apr 20, 2025

Choose a reason for hiding this comment

ofir-frd Apr 20, 2025

Choose a reason for hiding this comment

ofir-frd Apr 20, 2025

Choose a reason for hiding this comment

ofir-frd Apr 20, 2025

Choose a reason for hiding this comment

ofir-frd Apr 20, 2025 • edited Loading

Choose a reason for hiding this comment

ofir-frd Apr 20, 2025

Choose a reason for hiding this comment

benedict-lee commented Apr 21, 2025

ofir-frd left a comment

Choose a reason for hiding this comment

ofir-frd left a comment

Choose a reason for hiding this comment

benedict-lee commented Apr 9, 2025 •

edited by qodo-merge-pro-for-open-source bot

Loading

qodo-merge-pro-for-open-source bot commented Apr 9, 2025 •

edited

Loading

ofir-frd Apr 20, 2025 •

edited

Loading