Skip to content

Commit b425213

Browse files
henrypark133claude
andauthored
feat(e2e): extensions tab tests, CI parallelization, and 3 production bug fixes (#584)
* feat(e2e): extensions tab tests, CI parallelization, and 3 bug fixes ## E2E test coverage - Add tests/e2e/scenarios/test_extensions.py with 57 tests covering all extensions tab flows: installed WASM tool/MCP/channel cards, configure modal (open, fields, cancel, save, OAuth, error), auth card (token, OAuth, submit, cancel, error, multi-extension coexistence), activate flow, install/remove flows, WASM channel stepper states, and tab reload behaviour. All network calls intercepted via page.route() — no real binaries or external registries needed. - Expand tests/e2e/helpers.py with 50+ new CSS selectors for the extensions tab UI. - Add tests/e2e/README.md documentation on the page.route() mocking pattern, LIFO handler ordering, and page.evaluate() injection. ## CI parallelization - Split .github/workflows/e2e.yml into a build job (compile once, upload artifact) and a 3-way parallel test matrix (core / features / extensions), matching the pattern in test.yml. Reduces wall-clock time from ~15–20 min serial to ~10–12 min. Adds an e2e roll-up job for branch protection. ## Bug fixes in app.js (found via test-driven code review) - Fix null crash: renderExtensionCard() called ext.tools.length without a null guard; add ext.tools && check (regression: test_ext_tools_null). - Fix modal UX: submitConfigureModal() closed the overlay before checking success, making failures unrecoverable without reopening; close only on success, re-enable buttons and keep modal open on failure (regression: test_configure_modal_stays_open_on_save_failure). - Fix URL injection: all window.open() calls for server-supplied auth_url now go through openOAuthUrl() which rejects non-HTTPS schemes (regression: test_oauth_url_injection_blocked). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor(e2e): prune extensions tests 57→46 by merging redundant setups Merge 11 tests that shared identical fixture+navigation overhead: - Group A: 3 empty-state tests → test_extensions_empty_tab_layout - Group B: card_renders absorbs ext_tools_list_shown (same _WASM_TOOL fixture) - Group B: auth_dot_unauthed + unauthed_shows_configure_btn → test_installed_wasm_tool_unauthed_state - Group D: installed + configured states → test_wasm_channel_setup_states (identical UI) - Group D: failed_state + stepper_failed_circle → test_wasm_channel_failed_renders - Group G: 5 field badge tests → test_configure_modal_field_variants (4 fields, one pass) - Group H: submit_success + enter_key_submits → test_auth_card_submit_success Coverage preserved: all assertions kept, no unique behaviors removed. Extensions CI job estimated to drop from ~7 min to ~5 min. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(e2e): fix configure_input selector scoping in merged field variants test modal.locator(".configure-modal input[type='password']") scoped the absolute selector inside .configure-modal, effectively searching for a nested .configure-modal which never exists → count() == 0. Use page.locator() instead, consistent with all other tests in the file. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(e2e): address PR review comments — replace fixed sleeps with deterministic waits - Remove unnecessary wait_for_timeout(1000) from test_remove_cancelled_keeps_card (window.confirm = () => false is synchronous; DOM is unchanged when click() returns) - Replace wait_for_timeout(800) with wait_for_function() for window._lastOpenedUrl checks in configure_modal_save_oauth and activate_with_auth_url_opens_popup - Replace wait_for_timeout(300) with nth(1).wait_for(visible) in test_auth_card_multiple_extensions_coexist - Remove wait_for_timeout(800/300) in test_auth_card_submit_empty_noop and test_auth_completed_sse_dismisses_card (both check synchronous JS side-effects) - Add comment in test_oauth_url_injection_blocked explaining why timeout is kept (negative assertion — cannot use wait_for_function for absence of event) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(e2e): address remaining PR review comments - Remove unused `import pytest` from test_extensions.py - Fix unawaited coroutine bug: convert lambda route handlers to async def in test_extensions_tab_reloads_on_revisit and test_auth_completed_sse_triggers_extensions_reload (lambda r: r.fulfill(...) returns an unawaited coroutine; requests silently fell through to real server) - Fix README.md example to use async def handler (same bug in docs) - Harden openOAuthUrl() in app.js: use URL constructor instead of .startsWith() so non-string server-supplied values (objects, null, etc.) are safely rejected rather than throwing TypeError Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(e2e): address second round of PR review comments - Add timeout-minutes to CI build job to prevent hung workflows - Use parsed.href instead of raw url in openOAuthUrl for safety - Remove unused MessageEvent variable in auth_completed test - Replace wait_for_timeout(800) with expect_response in activate test - Replace wait_for_timeout(300) with tab panel wait_for in reload test [skip-regression-check] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent 37bba72 commit b425213

5 files changed

Lines changed: 1340 additions & 14 deletions

File tree

.github/workflows/e2e.yml

Lines changed: 55 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -9,8 +9,9 @@ on:
99
- "tests/e2e/**"
1010

1111
jobs:
12-
e2e:
13-
name: Browser E2E
12+
# ── Step 1: compile once ──────────────────────────────────────────────────
13+
build:
14+
name: Build ironclaw (libsql)
1415
runs-on: ubuntu-latest
1516
timeout-minutes: 30
1617
steps:
@@ -25,9 +26,44 @@ jobs:
2526
~/.cargo/registry
2627
key: e2e-${{ runner.os }}-${{ hashFiles('Cargo.lock') }}
2728

28-
- name: Build ironclaw (libsql)
29+
- name: Build
2930
run: cargo build --no-default-features --features libsql
3031

32+
- name: Upload binary
33+
uses: actions/upload-artifact@v4
34+
with:
35+
name: ironclaw-e2e-binary
36+
path: target/debug/ironclaw
37+
retention-days: 1
38+
39+
# ── Step 2: run test slices in parallel ───────────────────────────────────
40+
test:
41+
name: E2E (${{ matrix.group }})
42+
needs: build
43+
runs-on: ubuntu-latest
44+
timeout-minutes: 30
45+
strategy:
46+
fail-fast: false
47+
matrix:
48+
include:
49+
- group: core
50+
files: "tests/e2e/scenarios/test_connection.py tests/e2e/scenarios/test_chat.py tests/e2e/scenarios/test_sse_reconnect.py tests/e2e/scenarios/test_html_injection.py"
51+
- group: features
52+
files: "tests/e2e/scenarios/test_skills.py tests/e2e/scenarios/test_tool_approval.py"
53+
- group: extensions
54+
files: "tests/e2e/scenarios/test_extensions.py"
55+
steps:
56+
- uses: actions/checkout@v6
57+
58+
- name: Download binary
59+
uses: actions/download-artifact@v4
60+
with:
61+
name: ironclaw-e2e-binary
62+
path: target/debug/
63+
64+
- name: Make binary executable
65+
run: chmod +x target/debug/ironclaw
66+
3167
- uses: actions/setup-python@v5
3268
with:
3369
python-version: "3.12"
@@ -38,13 +74,26 @@ jobs:
3874
pip install -e .
3975
playwright install --with-deps chromium
4076
41-
- name: Run E2E tests
42-
run: pytest tests/e2e/ -v -x --timeout=120
77+
- name: Run E2E tests (${{ matrix.group }})
78+
run: pytest ${{ matrix.files }} -v --timeout=120
4379

4480
- name: Upload screenshots on failure
4581
if: failure()
4682
uses: actions/upload-artifact@v4
4783
with:
48-
name: e2e-screenshots
84+
name: e2e-screenshots-${{ matrix.group }}
4985
path: tests/e2e/screenshots/
5086
if-no-files-found: ignore
87+
88+
# ── Roll-up for branch protection ────────────────────────────────────────
89+
e2e:
90+
name: E2E Tests
91+
runs-on: ubuntu-latest
92+
if: always()
93+
needs: [test]
94+
steps:
95+
- run: |
96+
if [[ "${{ needs.test.result }}" != "success" ]]; then
97+
echo "One or more E2E jobs failed"
98+
exit 1
99+
fi

src/channels/web/static/app.js

Lines changed: 28 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -1003,7 +1003,7 @@ function showAuthCard(data) {
10031003
oauthBtn.className = 'auth-oauth';
10041004
oauthBtn.textContent = 'Authenticate with ' + data.extension_name;
10051005
oauthBtn.addEventListener('click', () => {
1006-
window.open(data.auth_url, '_blank', 'width=600,height=700');
1006+
openOAuthUrl(data.auth_url);
10071007
});
10081008
links.appendChild(oauthBtn);
10091009
}
@@ -1921,7 +1921,7 @@ function renderAvailableExtensionCard(entry) {
19211921
// OAuth popup if auth started during install (builtin creds)
19221922
if (res.auth_url) {
19231923
showToast('Opening authentication for ' + entry.display_name, 'info');
1924-
window.open(res.auth_url, '_blank', 'width=600,height=700');
1924+
openOAuthUrl(res.auth_url);
19251925
}
19261926
loadExtensions();
19271927
// Auto-open configure for WASM channels
@@ -2079,7 +2079,7 @@ function renderExtensionCard(ext) {
20792079
card.appendChild(url);
20802080
}
20812081

2082-
if (ext.tools.length > 0) {
2082+
if (ext.tools && ext.tools.length > 0) {
20832083
const tools = document.createElement('div');
20842084
tools.className = 'ext-tools';
20852085
tools.textContent = 'Tools: ' + ext.tools.join(', ');
@@ -2179,15 +2179,15 @@ function activateExtension(name) {
21792179
// Even on success, the tool may need OAuth (e.g., WASM loaded but no token yet)
21802180
if (res.auth_url) {
21812181
showToast('Opening authentication for ' + name, 'info');
2182-
window.open(res.auth_url, '_blank', 'width=600,height=700');
2182+
openOAuthUrl(res.auth_url);
21832183
}
21842184
loadExtensions();
21852185
return;
21862186
}
21872187

21882188
if (res.auth_url) {
21892189
showToast('Opening authentication for ' + name, 'info');
2190-
window.open(res.auth_url, '_blank');
2190+
openOAuthUrl(res.auth_url);
21912191
} else if (res.awaiting_token) {
21922192
showConfigureModal(name);
21932193
} else {
@@ -2329,20 +2329,21 @@ function submitConfigureModal(name, fields) {
23292329
body: { secrets },
23302330
})
23312331
.then((res) => {
2332-
closeConfigureModal();
23332332
if (res.success) {
2333+
closeConfigureModal();
23342334
if (res.auth_url) {
23352335
// OAuth flow started — open consent popup. The auth_completed SSE will
23362336
// not arrive immediately (it fires after OAuth callback), so show a toast now.
23372337
showToast('Opening OAuth authorization for ' + name, 'info');
2338-
window.open(res.auth_url, '_blank', 'width=600,height=700');
2338+
openOAuthUrl(res.auth_url);
23392339
loadExtensions();
23402340
}
23412341
// For non-OAuth success: the server always broadcasts auth_completed SSE,
23422342
// which will show the toast and refresh extensions — no need to do it here too.
23432343
} else {
2344+
// Keep modal open so the user can correct their input and retry.
2345+
btns.forEach(function(b) { b.disabled = false; });
23442346
showToast(res.message || 'Configuration failed', 'error');
2345-
loadExtensions();
23462347
}
23472348
})
23482349
.catch((err) => {
@@ -2356,6 +2357,25 @@ function closeConfigureModal() {
23562357
if (existing) existing.remove();
23572358
}
23582359

2360+
// Validate that a server-supplied OAuth URL is HTTPS before opening a popup.
2361+
// Rejects javascript:, data:, and other non-HTTPS schemes to prevent URL-injection.
2362+
// Uses the URL constructor to safely parse and validate the scheme, which also
2363+
// handles non-string values (objects, null, etc.) that would throw on .startsWith().
2364+
function openOAuthUrl(url) {
2365+
let parsed;
2366+
try {
2367+
parsed = new URL(url);
2368+
if (parsed.protocol !== 'https:') {
2369+
throw new Error('non-HTTPS protocol: ' + parsed.protocol);
2370+
}
2371+
} catch (e) {
2372+
console.warn('Blocked invalid/non-HTTPS OAuth URL:', url, e.message);
2373+
showToast('Invalid OAuth URL returned by server', 'error');
2374+
return;
2375+
}
2376+
window.open(parsed.href, '_blank', 'width=600,height=700');
2377+
}
2378+
23592379
// --- Pairing ---
23602380

23612381
function loadPairingRequests(channel, container) {

tests/e2e/README.md

Lines changed: 107 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -52,10 +52,117 @@ Then Playwright drives a headless Chromium browser against the gateway, making D
5252
| `test_connection.py` | Auth, tab navigation, connection status |
5353
| `test_chat.py` | Send message, SSE streaming, response rendering |
5454
| `test_skills.py` | ClawHub search, skill install/remove |
55+
| `test_tool_approval.py` | Tool approval overlay (approve, deny, always, params toggle) |
56+
| `test_sse_reconnect.py` | SSE reconnection handling |
57+
| `test_html_injection.py` | HTML injection security |
58+
| `test_extensions.py` | Extensions tab: install, remove, configure, OAuth, auth card, activate |
5559

5660
## Adding new scenarios
5761

5862
1. Create `tests/e2e/scenarios/test_<name>.py`
5963
2. Use the `page` fixture for a fresh browser page
6064
3. Use selectors from `helpers.py` (update `SEL` dict if new elements are needed)
6165
4. Keep tests deterministic -- use the mock LLM, not real providers
66+
67+
## Mocking API responses with `page.route()`
68+
69+
For tabs that depend on external data (extensions, jobs, memory, routines), use
70+
Playwright's `page.route()` to intercept the browser's HTTP requests to the
71+
ironclaw gateway and return deterministic fixture JSON. This avoids needing
72+
real installed binaries, live external services, or complex database setup.
73+
74+
### Basic pattern
75+
76+
```python
77+
import json
78+
79+
async def test_something(page):
80+
# 1. Set up route intercepts BEFORE navigation triggers the fetch
81+
# Always use async def handlers — route.fulfill() is a coroutine and must be awaited.
82+
async def handle_tools(route):
83+
await route.fulfill(
84+
status=200,
85+
content_type="application/json",
86+
body=json.dumps({"tools": [{"name": "echo", "description": "Echo"}]}),
87+
)
88+
89+
await page.route("**/api/extensions/tools", handle_tools)
90+
91+
# 2. Navigate / interact to trigger the fetch
92+
await page.locator('.tab-bar button[data-tab="extensions"]').click()
93+
94+
# 3. Assert on the rendered DOM
95+
rows = page.locator("#tools-tbody tr")
96+
assert await rows.count() == 1
97+
```
98+
99+
### Matching only the exact path
100+
101+
`**/api/extensions` matches `http://host/api/extensions` but NOT sub-paths
102+
like `http://host/api/extensions/install`. For the bare list endpoint, add
103+
a check inside the handler:
104+
105+
```python
106+
async def handle_ext_list(route):
107+
path = route.request.url.split("?")[0]
108+
if path.endswith("/api/extensions"):
109+
await route.fulfill(json={"extensions": []})
110+
else:
111+
await route.continue_() # Let sub-paths through to the real server
112+
113+
await page.route("**/api/extensions*", handle_ext_list)
114+
```
115+
116+
### Mocking method-specific behaviour (GET vs POST)
117+
118+
```python
119+
async def handle_setup(route):
120+
if route.request.method == "GET":
121+
await route.fulfill(json={"secrets": [...]})
122+
else: # POST
123+
await route.fulfill(json={"success": True})
124+
125+
await page.route("**/api/extensions/my-ext/setup", handle_setup)
126+
```
127+
128+
### Counting calls (for reload tests)
129+
130+
```python
131+
calls = []
132+
133+
async def counting_handler(route):
134+
calls.append(1)
135+
await route.fulfill(json={"extensions": []})
136+
137+
await page.route("**/api/extensions", counting_handler)
138+
# ... interact ...
139+
assert len(calls) == 2 # called twice (initial + after some action)
140+
```
141+
142+
### Applying the pattern to other tabs
143+
144+
| Tab | Key API endpoints to mock |
145+
|-----|--------------------------|
146+
| **Jobs** | `/api/jobs`, `/api/jobs/{id}`, `/api/jobs/{id}/events` |
147+
| **Memory** | `/api/memory/search`, `/api/memory/tree`, `/api/memory/read` |
148+
| **Routines** | `/api/routines`, `/api/routines/{id}/runs` |
149+
150+
### Injecting state directly via `page.evaluate()`
151+
152+
For purely client-side UI (components rendered entirely in JS without API calls),
153+
call the JavaScript function directly to skip the network layer entirely:
154+
155+
```python
156+
# Show an approval card without needing a real tool execution
157+
await page.evaluate("""
158+
showApproval({
159+
request_id: 'test-001',
160+
thread_id: currentThreadId,
161+
tool_name: 'shell',
162+
description: 'Run something',
163+
})
164+
""")
165+
```
166+
167+
This is the pattern used in `test_tool_approval.py` and parts of
168+
`test_extensions.py` (auth card, configure modal).

tests/e2e/helpers.py

Lines changed: 52 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -43,6 +43,58 @@
4343
"approval_always_btn": ".approval-actions button.always",
4444
"approval_deny_btn": ".approval-actions button.deny",
4545
"approval_resolved": ".approval-resolved",
46+
# Extensions tab – sections
47+
"extensions_list": "#extensions-list",
48+
"available_wasm_list": "#available-wasm-list",
49+
"mcp_servers_list": "#mcp-servers-list",
50+
"tools_tbody": "#tools-tbody",
51+
"tools_empty": "#tools-empty",
52+
# Extensions tab – cards
53+
"ext_card_installed": "#extensions-list .ext-card",
54+
"ext_card_available": "#available-wasm-list .ext-card.ext-available",
55+
"ext_card_mcp": "#mcp-servers-list .ext-card",
56+
"ext_name": ".ext-name",
57+
"ext_kind": ".ext-kind",
58+
"ext_auth_dot": ".ext-auth-dot",
59+
"ext_auth_dot_authed": ".ext-auth-dot.authed",
60+
"ext_auth_dot_unauthed": ".ext-auth-dot.unauthed",
61+
"ext_active_label": ".ext-active-label",
62+
"ext_pairing_label": ".ext-pairing-label",
63+
"ext_error": ".ext-error",
64+
"ext_tools": ".ext-tools",
65+
# Extensions tab – action buttons
66+
"ext_install_btn": ".btn-ext.install",
67+
"ext_remove_btn": ".btn-ext.remove",
68+
"ext_activate_btn": ".btn-ext.activate",
69+
"ext_configure_btn": ".btn-ext.configure",
70+
# Configure modal
71+
"configure_overlay": ".configure-overlay",
72+
"configure_modal": ".configure-modal",
73+
"configure_field": ".configure-field",
74+
"configure_input": ".configure-modal input[type='password']",
75+
"configure_save_btn": ".configure-actions button.btn-ext.activate",
76+
"configure_cancel_btn": ".configure-actions button.btn-ext.remove",
77+
"field_provided": ".field-provided",
78+
"field_autogen": ".field-autogen",
79+
"field_optional": ".field-optional",
80+
# Auth card (SSE-triggered, injected into chat-messages)
81+
"auth_card": ".auth-card",
82+
"auth_header": ".auth-header",
83+
"auth_instructions": ".auth-instructions",
84+
"auth_oauth_btn": ".auth-oauth",
85+
"auth_token_input": ".auth-token-input input",
86+
"auth_submit_btn": ".auth-submit",
87+
"auth_cancel_btn": ".auth-cancel",
88+
"auth_error": ".auth-error",
89+
# WASM channel progress stepper
90+
"ext_stepper": ".ext-stepper",
91+
"stepper_step": ".stepper-step",
92+
"stepper_circle": ".stepper-circle",
93+
# Toast notifications
94+
"toast": ".toast",
95+
"toast_success": ".toast.toast-success",
96+
"toast_error": ".toast.toast-error",
97+
"toast_info": ".toast.toast-info",
4698
}
4799

48100
TABS = ["chat", "memory", "jobs", "routines", "extensions", "skills"]

0 commit comments

Comments
 (0)