[Enhancement] OpenApp ScreenShot feature #333
Open
AlirezaShamsoshoara wants to merge 6 commits into meta-pytorch:main from AlirezaShamsoshoara:ali/bug_fix/openapp_env_01
Greptile Overview
Greptile Summary
This PR adds screenshot capture functionality to the OpenApp environment and fixes critical dependency issues.
Key Changes:
Alignment with OpenEnv Principles:
Technical Quality:
Confidence Score: 5/5
Important Files Changed
Sequence Diagram
sequenceDiagram
participant Agent
participant OpenAppEnv
participant BrowserGym
participant Playwright
participant OpenApps
Note over Agent,OpenApps: Screenshot Capture Flow
Agent->>OpenAppEnv: reset()
OpenAppEnv->>BrowserGym: reset()
BrowserGym->>Playwright: navigate to OpenApps URL
Playwright->>OpenApps: GET /
OpenApps-->>Playwright: HTML response
Playwright-->>BrowserGym: screenshot (numpy array)
BrowserGym-->>OpenAppEnv: obs with screenshot
OpenAppEnv->>OpenAppEnv: _extract_screenshot(obs)
Note over OpenAppEnv: Convert numpy array → PIL Image → PNG → base64
OpenAppEnv-->>Agent: OpenAppObservation with screenshot
Agent->>OpenAppEnv: step(action)
alt BrowserGym Action (bid-based)
OpenAppEnv->>BrowserGym: step(action_string)
BrowserGym->>Playwright: execute action
Playwright->>OpenApps: interact with page
OpenApps-->>Playwright: updated page
Playwright-->>BrowserGym: screenshot (numpy array)
BrowserGym-->>OpenAppEnv: obs with screenshot
OpenAppEnv->>OpenAppEnv: _extract_screenshot(obs)
else Playwright Direct Action (CSS selector)
OpenAppEnv->>Playwright: click/fill(selector)
Playwright->>OpenApps: interact with page
OpenApps-->>Playwright: updated page
OpenAppEnv->>Playwright: screenshot()
Playwright-->>OpenAppEnv: screenshot (bytes)
OpenAppEnv->>OpenAppEnv: base64 encode
end
OpenAppEnv-->>Agent: OpenAppObservation with screenshot
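The two encoding paths in the diagram above can be sketched as follows. This is a minimal illustration, not the PR's actual `_extract_screenshot()` code: the function names `array_to_base64_png`, `png_bytes_to_base64`, and `save_screenshot` are hypothetical, and the sketch assumes the BrowserGym screenshot is an H x W x 3 `uint8` array while the Playwright screenshot is already PNG bytes.

```python
import base64
import io
from pathlib import Path


def png_bytes_to_base64(png_bytes: bytes) -> str:
    """Playwright path: the screenshot is already PNG bytes, so only base64 is needed."""
    return base64.b64encode(png_bytes).decode("ascii")


def array_to_base64_png(pixels) -> str:
    """BrowserGym path: numpy array -> PIL Image -> PNG bytes -> base64 string.

    Assumes `pixels` is an H x W x 3 array of uint8 values (an assumption,
    not something the PR text confirms).
    """
    # Imported lazily so the bytes-only helpers work without NumPy/Pillow installed.
    import numpy as np
    from PIL import Image

    image = Image.fromarray(np.asarray(pixels, dtype=np.uint8))
    buffer = io.BytesIO()
    image.save(buffer, format="PNG")
    return png_bytes_to_base64(buffer.getvalue())


def save_screenshot(encoded: str, path: Path) -> Path:
    """Agent side: decode the base64 string and write the PNG to disk."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(base64.b64decode(encoded))
    return path
```

An agent receiving an observation could then call something like `save_screenshot(obs_screenshot, Path("examples/screenshot_output/local_reset.png"))`, where `obs_screenshot` stands in for whatever field carries the base64 string.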
Labels
- bug: Something isn't working
- CLA Signed: This label is managed by the Meta Open Source bot.
- documentation: Improvements or additions to documentation
- enhancement: New feature or request
Summary
This PR adds screenshot capture functionality to the OpenApp environment and fixes dependency issues that were causing import errors.
Key Changes
- Screenshot capture in the observations returned by `reset()` and `step()`
- Fixed `beartype` version conflicts and added the missing `fastmcp` dependency to the Dockerfile
Type of Change
Changes
Screenshot Implementation (`envs/openapp_env/server/openapp_environment.py`)
- Added `_current_screenshot` instance variable to track screenshot state
- Added `_extract_screenshot()` helper method to convert BrowserGym numpy arrays to base64 PNG
- Updated `reset()` and `step()` to extract and return screenshots
- Updated `_execute_*` methods (click, fill, goto, scroll, send_keys) to capture screenshots
- Updated `_update_observation_from_page()` to capture screenshots from Playwright
Dockerfile Fixes (`envs/openapp_env/server/Dockerfile`)
- Added `fastmcp` dependency (required by openenv-core for MCP support)
- Pinned `beartype` to `>=0.15,<0.18` for py-key-value-aio compatibility
Dependencies (`envs/openapp_env/pyproject.toml`)
- Added `Pillow>=10.0.0` for screenshot image processing
Example Script (`examples/openapp_example.py`)
- Added `--test-screenshots` flag to verify screenshot functionality
- Screenshots are saved to `examples/screenshot_output/` with descriptive names
Documentation (`envs/openapp_env/README.md`)
- Documented `beartype_this_package` import errors
- Documented the `PYTHONNOUSERSITE` fix for conda/virtualenv users
Other
- Updated `pyproject.toml` for local development
Alignment Checklist
Before submitting, verify:
- I have read `.claude/docs/PRINCIPLES.md` and this PR aligns with our principles
- I have checked `.claude/docs/INVARIANTS.md` and no invariants are violated
- I have run `/pre-submit-pr` (or `bash .claude/hooks/lint.sh` and tests) and addressed all issues
RFC Status
Test Plan
Local Mode
Docker Mode
Expected Output
- Screenshots saved as `examples/screenshot_output/local_reset.png`, `local_step_1_goto.png`, etc.
Claude Code Review
Alignment Review Report from Claude Code
Automated Checks
Open RFCs Context
None of these RFCs directly conflict with the screenshot feature addition.
Tier 1: Fixes Required
None identified. All mechanical issues are clean:
Tier 2: Alignment Discussion
Principle Conflicts
None identified. The changes align with OpenEnv principles:
RFC Conflicts
None identified. The changes don't conflict with any open RFCs:
Invariant Check
All invariants preserved:
Summary
Recommendation: This PR is ready for merge. The changes are additive (populating an existing field), fix real dependency issues, and include good documentation.