Skip to content

adding a tag on screenshots for mouse_click coordinate actions #240

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Apr 24, 2025

Conversation

TLSDC
Copy link
Collaborator

@TLSDC TLSDC commented Apr 24, 2025

Description by Korbit AI

What change is being made?

Add functionality to tag screenshots with action coordinates when the action is a mouse click, by drawing a dot on the screenshot at the specified coordinates.

Why are these changes being made?

The changes enhance the interpretability of screenshots by visually indicating where mouse click actions occurred, thereby aiding in better analysis and debugging during experiment result inspections. This approach provides a clear visual representation without modifying the core logic of action recording or screenshot capturing.

Is this description stale? Ask me to generate a new description by commenting /korbit-generate-pr-description

@TLSDC TLSDC requested review from recursix and Copilot April 24, 2025 17:34
Copy link

korbit-ai bot commented Apr 24, 2025

Based on your review schedule, I'll hold off on reviewing this PR until it's marked as ready for review. If you'd like me to take a look now, comment /korbit-review.

Your admin can change your review schedule in the Korbit Console

@TLSDC TLSDC marked this pull request as ready for review April 24, 2025 17:34
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds functionality to tag screenshots with a red dot when a "mouse_click" action is detected, giving a visual cue for coordinate-based actions.

  • Introduces the new function tag_screenshot_with_action to parse and render mouse_click coordinates on screenshots.
  • Updates the update_screenshot and update_screenshot_pair functions to apply the tagging functionality.

Copy link

@korbit-ai korbit-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review by Korbit AI

Korbit automatically attempts to detect when you fix issues in new commits.
Category Issue Status
Error Handling Missing Coordinate Parsing Error Handling ▹ view ✅ Fix detected
Functionality Inconsistent Screenshot Pair Annotation ▹ view
Files scanned
File Path Reviewed
src/agentlab/analyze/agent_xray.py

Explore our documentation to understand the languages and file types we support and the files we ignore.

Check out our docs on how you can make Korbit work best for you and your team.

Loving Korbit!? Share us on LinkedIn Reddit and X

Comment on lines 540 to 544
coords = [c.strip() for c in coords]
if coords[0].startswith("x="):
coords[0] = coords[0][2:]
if coords[1].startswith("y="):
coords[1] = coords[1][2:]

This comment was marked as resolved.

Comment on lines +573 to +574
if s1 is not None:
s1 = tag_screenshot_with_action(s1, info.exp_result.steps_info[info.step].action)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent Screenshot Pair Annotation category Functionality

Tell me more
What is the issue?

The screenshot annotation logic in update_screenshot_pair() only annotates the first screenshot (s1) but not the second one (s2).

Why this matters

This inconsistency means users won't see click markers on the second screenshot, making it harder to track the progression of actions across screenshot pairs.

Suggested change ∙ Feature Preview
def update_screenshot_pair(som_or_not: str):
    global info
    s1 = get_screenshot(info, info.step, som_or_not)
    s2 = get_screenshot(info, info.step + 1, som_or_not)

    if s1 is not None:
        s1 = tag_screenshot_with_action(s1, info.exp_result.steps_info[info.step].action)
    if s2 is not None and info.step + 1 < len(info.exp_result.steps_info):
        s2 = tag_screenshot_with_action(s2, info.exp_result.steps_info[info.step + 1].action)
    return s1, s2
Provide feedback to improve future suggestions

Nice Catch Incorrect Not in Scope Not in coding standard Other

💬 Looking for more details? Reply to this comment to chat with Korbit.

@TLSDC TLSDC requested a review from Copilot April 24, 2025 17:41
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds a new function to annotate screenshots with a red marker for mouse click coordinate actions, improving the visual context for debugging. The key changes include:

  • Introducing tag_screenshot_with_action() to parse mouse click actions and draw a marker on screenshots.
  • Updating update_screenshot() and update_screenshot_pair() to use the new tagging function.

Comment on lines 559 to 561
draw.ellipse((x - radius, y - radius, x + radius, y + radius), fill="red", outline="red")
except (ValueError, IndexError) as e:
warning(f"Failed to parse action '{action}': {e}")
Copy link
Preview

Copilot AI Apr 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function tag_screenshot_with_action does not return the modified screenshot after drawing the marker. Consider adding a return statement at the end of the function (both after a successful draw and outside the if block) so that the updated image is returned.

Suggested change
draw.ellipse((x - radius, y - radius, x + radius, y + radius), fill="red", outline="red")
except (ValueError, IndexError) as e:
warning(f"Failed to parse action '{action}': {e}")
draw.ellipse((x - radius, y - radius, x + radius, y + radius), fill="red", outline="red")
return screenshot
except (ValueError, IndexError) as e:
warning(f"Failed to parse action '{action}': {e}")
return screenshot

Copilot uses AI. Check for mistakes.

Copy link
Collaborator

@recursix recursix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beautiful!!! Thanks

@TLSDC TLSDC merged commit 8666268 into main Apr 24, 2025
3 checks passed
@TLSDC TLSDC deleted the tlsdc/xray_image_tagging branch April 24, 2025 18:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants