-
Notifications
You must be signed in to change notification settings - Fork 943
Fixed open action for Windows desktop.
#130
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Fixed grammatical errors in the original prompt and added a warning for Windows systems.
WalkthroughDocstring update in gui_agents/s2_5/agents/grounding.py: The open method’s docstring clarifies platform intent (Linux/Darwin) and warns against Windows usage. No functional or control-flow changes. Changes
Estimated code review effort🎯 1 (Trivial) | ⏱️ ~2 minutes Poem
✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🔭 Outside diff range comments (2)
gui_agents/s2_5/agents/grounding.py (2)
401-411: Avoid returning None on Windows and unknown platforms; add explicit guard.Right now, open() returns a string only for Linux/macOS. On Windows (and any other/unknown platform), it implicitly returns None, which is exactly the failure mode described in the PR (downstream code indexing into
code[0]). The docstring warning is helpful, but it’s not a code-level safeguard.Recommend: raise a clear runtime error on Windows and on unknown platforms to prevent silent None and make failures explicit. Also fix the Linux branch to import time (see next comment).
Apply this diff:
@agent_action def open(self, app_or_filename: str): - """Open any application or file with name app_or_filename. This action should be used on Linux/Darwin systems instead of opening the file manually. Do not use on Windows. + """Open any application or file by name. + Supported on Linux and macOS. Not supported on Windows: this action will raise at runtime + to avoid returning None and breaking downstream code. Args: app_or_filename:str, the name of the application or filename to open """ if self.platform == "linux": - return f"import pyautogui; pyautogui.hotkey('win'); time.sleep(0.5); pyautogui.write({repr(app_or_filename)}); time.sleep(1.0); pyautogui.hotkey('enter'); time.sleep(0.5)" + return f"import pyautogui; import time; pyautogui.hotkey('win'); time.sleep(0.5); pyautogui.write({repr(app_or_filename)}); time.sleep(1.0); pyautogui.hotkey('enter'); time.sleep(0.5)" elif self.platform == "darwin": return f"import pyautogui; import time; pyautogui.hotkey('command', 'space', interval=0.5); pyautogui.typewrite({repr(app_or_filename)}); pyautogui.press('enter'); time.sleep(1.0)" + elif self.platform == "windows": + raise RuntimeError("agent.open is not supported on Windows; use switch_applications() or an app-specific action instead.") + else: + raise RuntimeError(f"agent.open is not supported on platform: {self.platform!r}")
407-410: Linux branch uses time.sleep without importing time in the returned code.This will raise NameError at execution time. The macOS branch includes import time; Linux should too.
Apply this minimal fix (also included in the previous diff):
- return f"import pyautogui; pyautogui.hotkey('win'); time.sleep(0.5); pyautogui.write({repr(app_or_filename)}); time.sleep(1.0); pyautogui.hotkey('enter'); time.sleep(0.5)" + return f"import pyautogui; import time; pyautogui.hotkey('win'); time.sleep(0.5); pyautogui.write({repr(app_or_filename)}); time.sleep(1.0); pyautogui.hotkey('enter'); time.sleep(0.5)"
🧹 Nitpick comments (1)
gui_agents/s2_5/agents/grounding.py (1)
403-406: Clarify the docstring and align behavior with the warning.Minor grammar/style tweak and explicitly state the runtime behavior (raising on Windows) to match the code-level guard.
Apply this docstring tweak (also reflected in the first diff):
- """Open any application or file with name app_or_filename. This action should be used on Linux/Darwin systems instead of opening the file manually. Do not use on Windows. + """Open any application or file by name. + Supported on Linux and macOS. Not supported on Windows: this action will raise at runtime + to avoid returning None and breaking downstream code.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
gui_agents/s2_5/agents/grounding.py(1 hunks)
🔇 Additional comments (1)
gui_agents/s2_5/agents/grounding.py (1)
401-411: No remaining Windows open() calls detected—ready to merge.
- Agent-action open() is defined in s2_5, s2, and s1 (WindowsOSACI/MacOSACI) but no code invokes
agent.open()on Windows.- A repo-wide search for
\.open\(only returned Pillow’sImage.openusages—not calls to the agent’sopen.- No
OSWorldACI(platform="windows")instantiations were found.All existing call sites are either strictly image operations or on Linux/macOS platforms.
Fixed grammatical errors in the original prompt and added a warning for Windows systems.
The original one may break the whole system, because the execution results from the
openaction are None, without throwing any exceptions. This results in an unhandled error:exec() arg 1 must be a string, bytes or code objectatAgent-S/gui_agents/s2_5/cli_app.py
Line 128 in 52c0260
This is because
code[0]here will beNone. The new prompt fixes the issue.See more details here: #129 (comment)
Summary by CodeRabbit