Commit 21b0710

cephalonaut and oz-agent committed
Add verify-ui-change-in-cloud and test-warp-ui bundled skills (dogfood-gated)
Add two new bundled skills gated to dogfood (Local/Dev) builds:
- verify-ui-change-in-cloud: spawns a cloud agent with computer use to verify user-facing client changes after they are made
- test-warp-ui: guides a cloud agent through launching Warp and testing UI behavior using the computer_use tool

Co-Authored-By: Oz <oz-agent@warp.dev>
1 parent 4e600af commit 21b0710

2 files changed

Lines changed: 139 additions & 0 deletions

  • resources/channel-gated-skills/dogfood
Lines changed: 57 additions & 0 deletions
@@ -0,0 +1,57 @@
---
name: test-warp-ui
description: >
  Guides testing Warp UI features and changes using the computer use tool.
  Use this skill only when the computer_use tool is available to the agent.
  Covers launching Warp and verifying UI behavior.
user-invocable: false
---

# Computer Use for Warp UI Testing

Use the `computer_use` tool to visually test that Warp looks and behaves as intended after UI changes.

## Running Warp

Launch Warp from the repository root with:

```bash
cargo run -- --api-key $STAGING_USER_WARP_API_KEY
```

The `--api-key` flag authenticates using the API key from the `STAGING_USER_WARP_API_KEY` environment variable, so the app starts directly without interactive login prompts.

Initial builds may take several minutes; subsequent incremental builds are faster.
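
Because the launch command depends on `STAGING_USER_WARP_API_KEY`, it can help to check that the variable is set before starting a build that takes minutes. The following is a minimal sketch, not part of the skill itself; `require_api_key` and `launch_warp` are hypothetical helper names.

```bash
#!/usr/bin/env bash
# Sketch: fail fast when the staging API key is missing, then launch Warp.
# `require_api_key` and `launch_warp` are hypothetical helpers (assumptions).

require_api_key() {
  # Succeeds only when the variable is set and non-empty.
  if [ -z "${STAGING_USER_WARP_API_KEY:-}" ]; then
    echo "STAGING_USER_WARP_API_KEY is not set" >&2
    return 1
  fi
}

launch_warp() {
  # Skip the build entirely when the key is missing.
  require_api_key || return 1
  cargo run -- --api-key "$STAGING_USER_WARP_API_KEY"
}

# Usage (from the repository root):
#   launch_warp
```

Quoting the variable keeps the key intact as a single argument even if it contains characters the shell would otherwise split on.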

## Testing Workflow

### 1. Hardcode or Mock Data (When Needed)

If you just need to verify that a specific UI looks correct, it can be useful to hardcode or mock data so the UI state is immediately reachable without navigating a full flow. This is optional; skip this step when testing end-to-end flows that should work naturally.

Examples of when to hardcode:

- **Conditional UI**: The feature only appears under certain conditions (e.g., a specific setting, a non-empty data set, an active subscription). Hardcode the condition so the UI always appears.
- **Feature flags**: The feature is behind a flag that isn't enabled yet. Enable it directly.
- **Error states**: You want to test error handling UI. Hardcode error responses or failure conditions.

Keep mocked changes minimal and focused: only change what's necessary to reach the UI state under test.

### 2. Invoke Computer Use

Call the `computer_use` tool with a task description that includes:

- The command to build and launch Warp (typically `cargo run -- --api-key $STAGING_USER_WARP_API_KEY` from the repo root)
- Step-by-step instructions for navigating to the UI being tested
- **Specific observations to report**: describe exactly what elements, text, colors, layout, or states the tool should observe and describe back

Do **not** include expected values in the task. The tool should report what it sees, not judge correctness.
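
One way to keep task descriptions consistent is to assemble them from the parts listed above. This is only a sketch under the assumption that the task is passed as plain text; `build_task_description` is a hypothetical helper, not part of Warp or the `computer_use` tool.

```bash
#!/usr/bin/env bash
# Sketch: assemble a computer_use task description from its parts.
# `build_task_description` is a hypothetical helper (an assumption).

build_task_description() {
  local steps="$1"         # how to navigate to the UI under test
  local observations="$2"  # what to describe back, with no expected values
  cat <<EOF
Build and launch Warp from the repo root:
  cargo run -- --api-key \$STAGING_USER_WARP_API_KEY

Steps:
$steps

Report the following:
$observations
EOF
}

build_task_description \
  "1. Open Settings. 2. Click the General tab." \
  "Describe the text and layout of the first section."
```

The escaped `\$` keeps the variable name literal in the output, so the key itself is never written into the task text.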

### 3. Verify Results

Compare the observations returned by `computer_use` against your expectations. If the UI doesn't match, investigate and adjust the code or mocks accordingly.

## Tips

- **Be specific in task descriptions**: Instead of "check if the dialog looks right," say "open Settings, click the General tab, and describe the text and layout of the first section."
- **Test one thing at a time**: Focused tests are easier to debug when observations don't match expectations.
- **Build before invoking**: Always confirm the build succeeds before calling `computer_use`. The tool cannot fix build errors.
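
The "build before invoking" tip can be sketched as a small gate that runs a build command and stops on failure. `build_gate` is a hypothetical name; the sketch assumes `cargo build` is an acceptable stand-in for the build step that `cargo run` would otherwise perform.

```bash
#!/usr/bin/env bash
# Sketch: run a build command and refuse to continue if it fails.
# `build_gate` is a hypothetical helper (an assumption).

build_gate() {
  # Runs the command passed as arguments and reports the outcome.
  if "$@"; then
    echo "build succeeded; safe to invoke computer_use"
  else
    echo "build failed; fix the errors before invoking computer_use" >&2
    return 1
  fi
}

# Usage (from the repository root):
#   build_gate cargo build
```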
Lines changed: 82 additions & 0 deletions
@@ -0,0 +1,82 @@
---
name: verify-ui-change-in-cloud
description: >
  Invoke this automatically after completing any user-facing client change, ONLY in non-sandboxed environments and local environments.
  Verifies changes by spawning a cloud agent with computer use to test Warp.
  Triggers a cloud agent that runs the test-warp-ui skill.
user-invocable: true
---

# Verify UI Change in Cloud

After making a user-facing client change, use this workflow to verify it by spawning a cloud agent with computer use capabilities. This applies to any change that affects what the user sees or experiences in the running app: not just visual/UI changes, but also startup behavior, config handling, migration flows, and other client-side logic.

## Workflow

### 1. Push Your Changes

The cloud agent runs in a fresh environment that clones the repo. Your changes must be pushed to a branch so the cloud agent can access them.

### 2. Detect the Repository

Before spawning the cloud agent, detect which repository you are running in. Check the Git remote URL to determine the repo:

```bash
git remote get-url origin
```

Use the list below to select the correct environment ID:

- **warp** (remote URL contains `warpdotdev/warp`):
  - Environment: `SVhg783GBFQHk1OfdPfFU9` (the warp Dev Environment)

If the remote URL does not match, warn the user that this skill only supports the warp repository and stop.
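
The lookup in this step can be sketched as a shell function that maps a remote URL to the environment ID given above. `environment_id_for_remote` is a hypothetical helper name; the substring match mirrors the "contains" wording in this step.

```bash
#!/usr/bin/env bash
# Sketch: map a Git remote URL to the environment ID from this step.
# `environment_id_for_remote` is a hypothetical helper (an assumption).

environment_id_for_remote() {
  local url="$1"
  case "$url" in
    *warpdotdev/warp*)
      echo "SVhg783GBFQHk1OfdPfFU9"   # the warp Dev Environment
      ;;
    *)
      echo "This skill only supports the warp repository: $url" >&2
      return 1
      ;;
  esac
}

# Usage:
#   environment_id_for_remote "$(git remote get-url origin)"
```

A plain substring match would also accept any repository whose path merely begins with `warpdotdev/warp`, so a stricter pattern may be preferable if such repositories exist.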

### 3. Spawn the Cloud Agent

Use the `run_agents` tool to spawn a remote cloud agent. A single-child batch (one entry in `agent_run_configs`) is valid.

- `summary`: a brief declarative explanation, e.g. `"Spawning a cloud agent with computer use to verify the UI change."`
- `base_prompt`: include an instruction to read and follow the `test-warp-ui` skill, followed by the verification task (see the next section)
- `remote.environment_id`: the environment ID selected in step 2
- `remote.computer_use_enabled`: `true`
- `agent_run_configs`: a single entry with `name` set to a short display name such as `"verify-ui-change"`. The per-agent `prompt` can be empty since `base_prompt` covers the task.

The `test-warp-ui` skill is bundled, so the cloud agent has it automatically. Tell the agent to invoke it by name in the `base_prompt` (e.g. "Read and follow the test-warp-ui skill.").
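
To visualize how these fields fit together, the sketch below prints one possible argument payload as JSON. The nested-object shape (a `remote` object, an `agent_run_configs` array) is an assumption inferred from the dotted field names above, not a confirmed schema for `run_agents`; `run_agents_args` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Sketch: print a possible run_agents argument payload as JSON.
# ASSUMPTION: the JSON nesting is inferred from the dotted field names;
# the real tool schema may differ. `run_agents_args` is hypothetical.

run_agents_args() {
  local environment_id="$1"
  local task="$2"   # no JSON escaping is done; avoid quotes and newlines
  cat <<EOF
{
  "summary": "Spawning a cloud agent with computer use to verify the UI change.",
  "base_prompt": "Read and follow the test-warp-ui skill. $task",
  "remote": {
    "environment_id": "$environment_id",
    "computer_use_enabled": true
  },
  "agent_run_configs": [
    { "name": "verify-ui-change", "prompt": "" }
  ]
}
EOF
}

# Usage:
#   run_agents_args "SVhg783GBFQHk1OfdPfFU9" "Describe the settings dialog header."
```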

### 4. Write an Effective Prompt

The prompt should tell the cloud agent:
- Which element, flow, or behavior to test
- What hardcoding or mocking is needed (see below and the test-warp-ui skill for details on sandbox constraints)
- What filesystem or app state to pre-seed before launching (e.g., creating directories, writing config files)
- What specific observations to report back

**Example prompts:**

```
I changed the settings dialog header to use a larger font and blue color.
Hardcode the settings dialog to open on launch, then describe the header text,
font size relative to other text, and color.
```

```
I added a migration that symlinks config from ~/.warp into ~/.warp-preview on first launch.
The migration is gated on Channel::Preview. Before building, hardcode the migration to run
regardless of channel by removing the channel check. Also create a fake ~/.warp directory
with test files. After launching Warp, verify the symlinks were created in ~/.warp-preview.
```

### Hardcoding to reach the code path under test

The cloud agent builds Warp with `cargo run`, which may not match the exact runtime conditions of your change (e.g., different channel, missing feature flags, absent preconditions). When this happens, instruct the cloud agent to temporarily hardcode the code so the build exercises the path you need to test. Common examples:

- **Gated code paths**: If the change is behind a channel check, feature flag, or experiment, tell the agent to remove or bypass the gate before building.
- **Pre-existing state**: If the change depends on filesystem state that wouldn't exist in a clean environment (e.g., a config directory from a prior install), tell the agent to create it before launching.
- **Startup behavior**: If the change affects something that only happens on first launch or migration, make sure the agent sets up the preconditions that trigger it.

Be explicit in the prompt about what to hardcode and why; the cloud agent won't infer this on its own.
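
For the pre-existing state case, the setup and the later check can be sketched as two small shell helpers, following the migration example prompt above. The helper names and the file name `example.yaml` are hypothetical; the directories to pass in depend on the change under test.

```bash
#!/usr/bin/env bash
# Sketch: pre-seed a fake config directory, then list symlinks afterwards.
# Helper names and the file name `example.yaml` are hypothetical.

seed_fake_config() {
  # Creates the directory and writes one placeholder file into it.
  local dir="$1"
  mkdir -p "$dir"
  printf 'placeholder: true\n' > "$dir/example.yaml"
}

list_symlinks() {
  # Prints every symlink directly inside the directory, one per line.
  find "$1" -maxdepth 1 -type l
}

# Usage, before and after launching Warp:
#   seed_fake_config "$HOME/.warp"
#   list_symlinks "$HOME/.warp-preview"
```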

### 5. Surface the Cloud Agent Link

No extra surfacing step is needed; the Warp client displays the cloud agent run automatically.
