| 1 | +# WebUI Smoke CLI — installation verification e2e runner |
| 2 | + |
| 3 | +> **Epic**: FR-2871 ([link](https://lablup.atlassian.net/browse/FR-2871)) |
| 4 | +> **Spec Task**: FR-2872 ([link](https://lablup.atlassian.net/browse/FR-2872)) |
| 5 | +> **Source plan**: `.specs/FR-2871-webui-smoke-cli/webui-smoke-cli.md` |
| 6 | + |
| 7 | +## Overview |
| 8 | + |
| 9 | +A new workspace package, `backend.ai-webui-smoke-cli`, that lets Field-Ops engineers verify Backend.AI WebUI functionality immediately after an on-prem or air-gapped install. The CLI re-uses the existing `e2e/` Playwright assets — selecting only specs tagged `@smoke` — runs them against a customer-provided endpoint, and emits a self-contained HTML + JSON report directory that can be handed off as the install acceptance artifact. |
| 10 | + |
| 11 | +The CLI takes its endpoint and credentials via command-line arguments / environment variables only (no `.env` file editing), auto-detects the logged-in account's role, and operates without contacting the public internet at test-execution time. |
| 12 | + |
| 13 | +## Problem Statement |
| 14 | + |
| 15 | +Field-Ops engineers regularly perform on-prem and air-gapped Backend.AI installs. The current `e2e/` Playwright suite is excellent for CI and developer environments, but it assumes a full developer toolchain: pnpm, the monorepo checkout, Relay compilation, hand-edited `e2e/envs/.env.playwright`, an internet-connected `npx playwright install`, and multiple cross-user accounts (admin + user + user2 + monitor) provisioned for cross-RBAC scenarios. |
| 16 | + |
| 17 | +Customer sites typically provide: |
| 18 | + |
| 19 | +- One endpoint URL (often with a self-signed certificate) |
| 20 | +- One service account (often only admin **or** only user — not both) |
| 21 | +- No outbound internet for the installation host |
| 22 | +- No developer tooling, sometimes not even Node.js |
| 23 | + |
| 24 | +The result today is that post-install verification is done by hand-clicking through the UI, which is slow, inconsistent across engineers, and frequently misses the most common failure mode at customer sites: the app launcher / wsproxy / app-proxy path. We need a single deliverable that a Field-Ops engineer can drop on the install host, run with one command, and hand the resulting report directory to the customer as proof of acceptance. |
| 25 | + |
| 26 | +## Requirements |
| 27 | + |
| 28 | +Functional requirements are labeled FR-A through FR-J. **FR-A, FR-B, and FR-C are MVP.** FR-D through FR-J are Phase 2 and are listed here so that this spec remains the single source of truth. |
| 29 | + |
| 30 | +### FR-A — `@smoke` tag convention (MVP) |
| 31 | + |
| 32 | +Introduce a Playwright tag taxonomy and apply it to a starter set of existing e2e specs. |
| 33 | + |
| 34 | +- [ ] Document the tag taxonomy in `e2e/E2E-TEST-NAMING-GUIDELINES.md`: |
| 35 | + - `@smoke` — base smoke set; single account; must complete in 5–10 minutes |
| 36 | + - `@smoke-admin` — admin-only signals (user mgmt, resource policies, agent list) |
| 37 | + - `@smoke-user` — end-user signals (session lifecycle, vfolder, environment list) |
| 38 | + - NOTE: an earlier `@smoke-any` tag was considered for role-agnostic specs, but |
| 39 | + every existing e2e login helper (`loginAsAdmin` / `loginAsUser`) hard-codes a |
| 40 | + named credential. There is no genuinely role-agnostic smoke spec at the helper |
| 41 | +    level, so MVP pairs the base `@smoke` tag with only the two role-scoped tags (`@smoke-admin`, `@smoke-user`). |
| 42 | +- [ ] Tag at least the following existing specs (tags only; no functional change): |
| 43 | + - Login / logout / token expiry |
| 44 | + - Dashboard render |
| 45 | + - Session lifecycle (create → running → terminate) |
| 46 | + - VFolder create / upload / download / delete |
| 47 | + - Environment & image list |
| 48 | + - (admin) User list, resource policy list, agent list |
| 49 | + |
| 50 | +Acceptance: |
| 51 | + |
| 52 | +- [ ] `pnpm exec playwright test --grep @smoke --list` (run at the repo root, where `playwright.config.ts` lives) lists at minimum the specs above. |
| 53 | +- [ ] No existing e2e CI job changes behaviour (tags are additive metadata). |
| 54 | + |
| 55 | +### FR-B — `backend.ai-webui-smoke-cli` workspace scaffold (MVP) |
| 56 | + |
| 57 | +Create the new workspace package with read-only subcommands first. |
| 58 | + |
| 59 | +- [ ] New package at `packages/backend.ai-webui-smoke-cli/` registered in the pnpm workspace. |
| 60 | +- [ ] `package.json` exposes a `bai-smoke` bin, a `smoke` script that invokes it (used by the `pnpm … run smoke` acceptance commands below), and a `build` script (tsup or equivalent — implementation choice). |
| 61 | +- [ ] `bai-smoke list` prints the available categories and tags resolved from a catalog file. |
| 62 | +- [ ] `bai-smoke version` prints CLI version, bundled webui SHA placeholder, and Playwright version. |
| 63 | +- [ ] CLI argument parsing layer accepts global options: `--endpoint`, `--webserver`, `--email`, `--password`, `--password-stdin`, `--role`, `--include`, `--exclude`, `--pages`, `--workers`, `--timeout`, `--output`, `--headed`, `--insecure-tls`. |
| 64 | +- [ ] All inputs are accepted as CLI flags **or** as `BAI_SMOKE_*` environment variables. The CLI must **not** require editing any `.env` file. |
| 65 | + |
| 66 | +Acceptance: |
| 67 | + |
| 68 | +- [ ] `pnpm --filter backend.ai-webui-smoke-cli run smoke -- list` prints the catalog. |
| 69 | +- [ ] `pnpm --filter backend.ai-webui-smoke-cli run smoke -- version` prints version info. |
| 70 | +- [ ] `pnpm --filter backend.ai-webui-smoke-cli run smoke -- run --help` documents every option above. |
| 71 | + |
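The flag-or-environment rule in FR-B can be sketched as a small resolver. This is only an illustration: the helper names, the `--flag` to `BAI_SMOKE_FLAG` naming scheme, and the "flag wins over environment" precedence are assumptions, since the spec states only that both input channels must be accepted.

```typescript
// Hypothetical helpers, not part of the spec.
type Values = Record<string, string | undefined>;

// "--insecure-tls" -> "BAI_SMOKE_INSECURE_TLS" (naming scheme is an assumption).
function envNameFor(flag: string): string {
  return "BAI_SMOKE_" + flag.replace(/^--/, "").replace(/-/g, "_").toUpperCase();
}

// Assumed precedence: an explicit CLI flag overrides the environment variable.
function resolveOption(flag: string, flags: Values, env: Values): string | undefined {
  return flags[flag] ?? env[envNameFor(flag)];
}

// Example: no --endpoint flag given, so the environment value is used.
const endpoint = resolveOption(
  "--endpoint",
  {},
  { BAI_SMOKE_ENDPOINT: "https://webui.example.internal" },
);
console.log(endpoint);
```

Passing the environment in as a plain object (rather than reading `process.env` inside the helper) keeps the resolver easy to unit-test and keeps `.env` files out of the picture entirely.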
| 72 | +### FR-C — `playwright.smoke.config.ts` + runner (MVP) |
| 73 | + |
| 74 | +Wire the CLI to actually run tagged specs against the supplied endpoint. |
| 75 | + |
| 76 | +- [ ] `playwright.smoke.config.ts` imports the existing repo-root `playwright.config.ts` and overrides: |
| 77 | + - `testDir` to point at the existing `e2e/` directory (the repo-root config already sets `testDir: './e2e'`) |
| 78 | +  - `grep` / `grepInvert` to apply the `--include` / `--exclude` tag filters respectively |
| 79 | + - `reporter` to `[['html', { outputFolder: <output>/index, open: 'never' }], ['list']]` |
| 80 | + - `use.baseURL` from `--endpoint` |
| 81 | + - `use.ignoreHTTPSErrors` from `--insecure-tls` |
| 82 | + - `use.video: 'retain-on-failure'`, `use.trace: 'retain-on-failure'` |
| 83 | +- [ ] `runner.ts` programmatically invokes Playwright with the smoke config, injecting credentials via process env (never written to disk). |
| 84 | +- [ ] Single-account authentication: the runner logs in once with the supplied email/password and reuses storage state across specs. |
| 85 | +- [ ] Specs requiring multiple accounts (cross-user RBAC) are excluded from the `@smoke` selection by tag convention. |
| 86 | + |
| 87 | +Acceptance (verifiable in staging): |
| 88 | + |
| 89 | +- [ ] Running `pnpm --filter backend.ai-webui-smoke-cli run smoke -- run --endpoint <staging> --email <email> --password <pw> --output ./report` against a staging Backend.AI cluster: |
| 90 | + - Logs in successfully with the supplied credentials |
| 91 | + - Loads the dashboard |
| 92 | +  - Completes at least **one session lifecycle** (create batch session → poll until `RUNNING` → terminate cleanly) |
| 93 | + - Writes a Playwright HTML report at `./report/index/index.html` |
| 94 | + - Exits with code `0` on all-pass, non-zero on any failure |
| 95 | +- [ ] No single binary or air-gap bundling is required for MVP — those are FR-G and FR-H. |
| 96 | + |
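One way the `--include` / `--exclude` lists might be turned into the regular expressions that Playwright's `grep` and `grepInvert` config fields accept is sketched below. The helper names and the default of `@smoke` when nothing is included are assumptions. The sketch also guards against a subtle issue: because these filters are regular expressions, a bare `@smoke` pattern would also match `@smoke-admin` and `@smoke-user`.

```typescript
// Hypothetical helper, not part of the spec.
interface TagFilter {
  grep: RegExp;
  grepInvert?: RegExp;
}

function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Match any listed tag as a whole token: the lookahead rejects a following
// word character or hyphen, so "@smoke" does not match inside "@smoke-admin".
function tagPattern(tags: string[]): RegExp {
  const alternatives = tags.map(escapeRegExp).join("|");
  return new RegExp(`(?:${alternatives})(?![\\w-])`);
}

function buildTagFilter(include: string[], exclude: string[]): TagFilter {
  // Assumption: with no --include, fall back to the base smoke set.
  const filter: TagFilter = { grep: tagPattern(include.length > 0 ? include : ["@smoke"]) };
  if (exclude.length > 0) {
    filter.grepInvert = tagPattern(exclude);
  }
  return filter;
}

const filter = buildTagFilter(["@smoke", "@smoke-admin"], ["@smoke-user"]);
console.log(filter.grep.source, filter.grepInvert?.source);
```

The resulting object could be spread into the smoke config's overrides. This assumes tags are visible in the text Playwright matches `grep` against, which is how tag filtering with `--grep` behaves.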
| 97 | +--- |
| 98 | + |
| 99 | +### FR-D — `preflight` / `doctor` subcommand + role auto-detection (Phase 2) |
| 100 | + |
| 101 | +- [ ] `bai-smoke doctor` checks: endpoint reachability, TLS, login succeeds, browser binary present at expected path, free disk space ≥ configured minimum, and warns on webui-version-vs-CLI-version mismatch by reading the webui SHA from `/manifest.json` or `index.html` meta. |
| 102 | +- [ ] `bai-smoke preflight` (or a `--preflight-only` flag on `run`) performs the doctor checks then exits without running specs. |
| 103 | +- [ ] Role auto-detection: after login, the runner inspects the existing login response / `/server/login-check` signal to classify the account as `admin`, `user`, or `monitor`, and includes only tags compatible with the detected role. |
| 104 | +- [ ] `--role auto` (default) uses detection; explicit `--role admin|user|monitor` overrides. |
| 105 | +- [ ] Detection must not introduce new backend APIs. |
| 106 | + |
| 107 | +Acceptance: |
| 108 | + |
| 109 | +- [ ] Running `doctor` against an unreachable endpoint exits non-zero with a precise reason. |
| 110 | +- [ ] Running `doctor` against a wrong-credential endpoint reports "login failed" distinctly from "endpoint unreachable". |
| 111 | +- [ ] With `--role auto`, an admin login runs `@smoke + @smoke-admin`; a user login runs `@smoke + @smoke-user`. (See FR-A note: the originally-proposed `@smoke-any` tag was dropped because no genuinely role-agnostic smoke spec exists at the helper level.) |
| 112 | + |
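The role-to-tag selection described in FR-D can be sketched as two small pure functions. The function names are hypothetical. The `admin` and `user` mappings follow the acceptance criteria above; the `monitor` mapping is an assumption, because the spec does not define a monitor-specific tag.

```typescript
type Role = "admin" | "user" | "monitor";
type RoleOption = Role | "auto";

// `--role auto` defers to detection; any explicit value overrides it.
function effectiveRole(option: RoleOption, detected: Role): Role {
  return option === "auto" ? detected : option;
}

function tagsForRole(role: Role): string[] {
  switch (role) {
    case "admin":
      return ["@smoke", "@smoke-admin"];
    case "user":
      return ["@smoke", "@smoke-user"];
    case "monitor":
      // Assumption: no monitor-specific tag exists, so only the base set runs.
      return ["@smoke"];
  }
}

console.log(tagsForRole(effectiveRole("auto", "admin")));
```

Keeping detection (which needs a live login) separate from the mapping (which is pure) lets the mapping be tested without a cluster.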
| 113 | +### FR-E — Report post-processing (Phase 2) |
| 114 | + |
| 115 | +- [ ] After Playwright finishes, the runner generates the following inside `--output`: |
| 116 | + |
| 117 | + ``` |
| 118 | + smoke-report-<timestamp>/ |
| 119 | + ├── index/ # Playwright html reporter output |
| 120 | + │ └── index.html |
| 121 | + ├── summary.json # machine-readable {total, passed, failed, skipped, durationMs, byCategory: {...}} |
| 122 | + ├── environment.json # {endpoint, role, cliVersion, webuiSha, os, playwrightVersion, chromiumVersion} |
| 123 | + ├── traces/ # failed cases only |
| 124 | + ├── videos/ # failed cases only (passing cases pruned) |
| 125 | + └── logs/ |
| 126 | + ├── console-*.log |
| 127 | + └── network-*.har # failures only |
| 128 | + ``` |
| 129 | +- [ ] Each failed case includes a copy-pasteable diagnostic block: endpoint, role, webui SHA, last 20 console lines, 4xx/5xx network summary. |
| 130 | +- [ ] The runner does **not** implement a custom Playwright reporter; post-processing reads the built-in JSON reporter's output and augments it. |
| 131 | + |
| 132 | +Acceptance: |
| 133 | + |
| 134 | +- [ ] `summary.json` round-trips through `JSON.parse` and contains all fields above. |
| 135 | +- [ ] After a forced failure, `traces/` and `videos/` for that case are non-empty; for passing cases, `videos/` is empty. |
| 136 | + |
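A minimal sketch of how `summary.json` might be aggregated is shown below. The top-level field names come from the directory listing above; the `CaseResult` input shape and the per-category counts inside `byCategory` are assumptions, since the spec leaves `byCategory: {...}` open.

```typescript
type Status = "passed" | "failed" | "skipped";

// Hypothetical per-case record extracted from the JSON reporter output.
interface CaseResult {
  category: string;
  status: Status;
  durationMs: number;
}

interface Counts {
  total: number;
  passed: number;
  failed: number;
  skipped: number;
}

interface Summary extends Counts {
  durationMs: number;
  byCategory: Record<string, Counts>; // assumed shape
}

function emptyCounts(): Counts {
  return { total: 0, passed: 0, failed: 0, skipped: 0 };
}

function summarize(cases: CaseResult[]): Summary {
  const summary: Summary = { ...emptyCounts(), durationMs: 0, byCategory: {} };
  for (const c of cases) {
    const bucket = summary.byCategory[c.category] ?? emptyCounts();
    summary.byCategory[c.category] = bucket;
    summary.total += 1;
    summary[c.status] += 1;
    bucket.total += 1;
    bucket[c.status] += 1;
    summary.durationMs += c.durationMs;
  }
  return summary;
}

const summary = summarize([
  { category: "session", status: "passed", durationMs: 42000 },
  { category: "vfolder", status: "failed", durationMs: 9000 },
]);
console.log(JSON.stringify(summary, null, 2));
```

The same `failed` count can drive the process exit code (`0` only when it is zero), which keeps the report and the exit status from disagreeing.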
| 137 | +### FR-F — Single-account safe utils (Phase 2) |
| 138 | + |
| 139 | +- [ ] Audit `e2e/utils/test-util.ts` for assumptions requiring multiple seeded accounts (admin, user, user2, monitor). |
| 140 | +- [ ] Extract single-account-safe helpers into a separate module that smoke specs use. |
| 141 | +- [ ] No smoke spec may import a helper that requires more than one account. |
| 142 | + |
| 143 | +Acceptance: |
| 144 | + |
| 145 | +- [ ] Smoke specs run green against a fixture environment provisioned with exactly one account. |
| 146 | + |
| 147 | +### FR-G — Chromium bundling + air-gap runtime + `--insecure-tls` (Phase 2) |
| 148 | + |
| 149 | +- [ ] Build artifact ships a `browsers/` directory with the platform-matched Playwright Chromium. |
| 150 | +- [ ] At runtime, the CLI sets `PLAYWRIGHT_BROWSERS_PATH` to that bundled path before launching Playwright. |
| 151 | +- [ ] `--insecure-tls` flag (default `false`) maps to Playwright `ignoreHTTPSErrors: true`. |
| 152 | +- [ ] All external-origin requests (CDNs, Google Fonts) are blocked via `context.route('**/*', …)` allow-list during spec execution. |
| 153 | +- [ ] No `npx playwright install` is invoked at runtime. |
| 154 | + |
| 155 | +Acceptance: |
| 156 | + |
| 157 | +- [ ] On a host with outbound traffic blocked (verified via firewall rules), the CLI completes a smoke run successfully against an in-network endpoint. |
| 158 | + |
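The allow-list decision behind the `context.route('**/*', …)` handler can be isolated into a pure predicate, sketched here. The function name, the example hosts, and the choice to let `data:`, `blob:`, and `about:` URLs through (they never leave the machine) are assumptions rather than anything the spec prescribes.

```typescript
// Schemes that do not generate network traffic; assumed safe to allow.
const LOCAL_SCHEMES = new Set(["data:", "blob:", "about:"]);

// Returns true only for requests whose origin is explicitly allowed.
function isAllowedRequest(requestUrl: string, allowedOrigins: string[]): boolean {
  let parsed: URL;
  try {
    parsed = new URL(requestUrl);
  } catch {
    return false; // unparseable URLs are blocked
  }
  if (LOCAL_SCHEMES.has(parsed.protocol)) {
    return true;
  }
  return allowedOrigins.includes(parsed.origin);
}

const allowed = ["https://webui.example.internal"];
console.log(isAllowedRequest("https://webui.example.internal/server/login", allowed));
console.log(isAllowedRequest("https://fonts.googleapis.com/css2?family=Inter", allowed));
```

Inside the route handler, the predicate would pick between `route.continue()` and `route.abort()` based on `route.request().url()`. Comparing full origins (scheme, host, and port) rather than host names avoids accidentally allowing a different port on the same machine.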
| 159 | +### FR-H — SEA / pkg binary build + internal release workflow (Phase 2) |
| 160 | + |
| 161 | +- [ ] Build scripts produce platform binaries: `bai-smoke-linux-x64`, `bai-smoke-mac-arm64`, `bai-smoke-win-x64.exe`. |
| 162 | +- [ ] Bundled artifact is `.tar.gz` (linux/mac) or `.zip` (win) containing the binary + `browsers/` + `tests/` + `LICENSES.md` + `README.md`. |
| 163 | +- [ ] Release publishes to **internal artifact storage only** (private GitHub Release with restricted access is acceptable). No public release. |
| 164 | +- [ ] Release tag format: `webui-vX.Y.Z+smoke.N`. |
| 165 | + |
| 166 | +Acceptance: |
| 167 | + |
| 168 | +- [ ] An engineer with no Node.js installed can extract the archive on each supported platform and run `./bai-smoke run …` successfully. |
| 169 | + |
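The release tag format `webui-vX.Y.Z+smoke.N` can be validated with a single regular expression. The sketch below assumes that `X`, `Y`, `Z`, and `N` are plain non-negative integers with no pre-release suffix; the type and function names are hypothetical.

```typescript
interface ReleaseTag {
  webuiVersion: string; // "X.Y.Z"
  smokeBuild: number;   // "N"
}

const RELEASE_TAG = /^webui-v(\d+\.\d+\.\d+)\+smoke\.(\d+)$/;

// Returns the parsed parts, or null when the tag does not follow the format.
function parseReleaseTag(tag: string): ReleaseTag | null {
  const match = RELEASE_TAG.exec(tag);
  if (!match) {
    return null;
  }
  return { webuiVersion: match[1], smokeBuild: Number(match[2]) };
}

console.log(parseReleaseTag("webui-v1.2.3+smoke.4"));
```

A check like this could run in the release workflow before any archive is uploaded, so that a mistyped tag fails fast.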
| 170 | +### FR-I — Operator README (Phase 2) |
| 171 | + |
| 172 | +- [ ] One-page README in **English and Korean** covering: extract → chmod → run → read report → known issues. |
| 173 | +- [ ] Troubleshooting section covers: self-signed cert, macOS quarantine workaround, Windows SmartScreen workaround, "login failed" vs "endpoint unreachable" disambiguation, leftover resource cleanup (`bai-smoke cleanup --prefix bai-smoke-`). |
| 174 | + |
| 175 | +Acceptance: |
| 176 | + |
| 177 | +- [ ] README fits on one printed page in either language. |
| 178 | + |
| 179 | +### FR-J — Coverage expansion (Phase 2) |
| 180 | + |
| 181 | +Expand the `@smoke` catalog beyond the FR-A starter set. **App launcher is priority 1.** |
| 182 | + |
| 183 | +- [ ] **App launcher (P1)** — start a session, launch one app (jupyter or terminal), confirm the app page actually opens through the app-proxy. This is the most frequent failure mode at customer sites. |
| 184 | +- [ ] Model serving — basic serving endpoint deploy + invocation. |
| 185 | +- [ ] Basic RBAC — single-account-safe permission checks (no cross-user assertions). |
| 186 | + |
| 187 | +Acceptance: |
| 188 | + |
| 189 | +- [ ] App launcher smoke fails loudly when wsproxy / app-proxy is misconfigured (distinguish "session never reached RUNNING" from "session running, app proxy 502"). |
| 190 | + |
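The distinction required by the FR-J acceptance criterion can be sketched as a small classifier over what the spec observed. The type names, outcome labels, and the treatment of "no proxy response at all" as a proxy failure are assumptions for illustration.

```typescript
type LaunchOutcome = "ok" | "session-never-running" | "app-proxy-error";

// Hypothetical observation gathered by the app-launcher spec.
interface LaunchObservation {
  sessionStatus: string;          // last polled session status
  proxyHttpStatus: number | null; // null when the app page returned no response
}

function classifyLaunch(obs: LaunchObservation): LaunchOutcome {
  // The session check comes first: without RUNNING, the proxy was never exercised.
  if (obs.sessionStatus !== "RUNNING") {
    return "session-never-running";
  }
  const status = obs.proxyHttpStatus;
  if (status === null || status < 200 || status >= 300) {
    return "app-proxy-error";
  }
  return "ok";
}

console.log(classifyLaunch({ sessionStatus: "RUNNING", proxyHttpStatus: 502 }));
```

Ordering the checks this way means a proxy failure is only ever reported for a session that demonstrably reached `RUNNING`, which is the attribution Field-Ops needs.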
| 191 | +## Non-Functional Requirements |
| 192 | + |
| 193 | +All of the following are **mandatory** for the final delivered artifact (Phase 2 completion). MVP relaxes the air-gap and packaging requirements. |
| 194 | + |
| 195 | +- [ ] **Air-gap operation** — at test execution time the CLI must not contact any host other than the supplied webserver / webui endpoint. Verified by running on a host with outbound traffic blocked. |
| 196 | +- [ ] **Single-account assumption** — `@smoke` specs must succeed with exactly one provided account. Cross-user RBAC scenarios are excluded from `@smoke` by convention. |
| 197 | +- [ ] **Role auto-detection** — login response is the source of truth. No new backend API. |
| 198 | +- [ ] **Self-contained report** — the `--output` directory must be portable (zip → email → open). Assets are local; no CDN dependencies in `index.html`. |
| 199 | +- [ ] **Failure-only retention** — traces, videos, HAR captures are retained for failed cases only to keep report size manageable. |
| 200 | +- [ ] **No secrets to disk** — credentials supplied via flag, env, or `--password-stdin` are never written to disk, including the report directory. |
| 201 | + |
| 202 | +## User Stories |
| 203 | + |
| 204 | +- As a Field-Ops engineer, I want to run one command against a freshly installed Backend.AI cluster so that I can confirm the WebUI works without clicking through it by hand. |
| 205 | +- As a Field-Ops engineer working on an air-gapped customer site, I want the CLI to run without internet access so that I do not need to coordinate temporary outbound rules. |
| 206 | +- As a Field-Ops engineer holding only one customer-issued account, I want the CLI to auto-adapt to that account's role so that I do not need to ask the customer for additional admin/user pairs. |
| 207 | +- As a Field-Ops engineer, I want to hand the customer a report directory so that we have a written acceptance artifact for the install. |
| 208 | +- As a Support engineer triaging a failed install, I want each failed case to include a copy-pasteable diagnostic block so that I can open a ticket without re-reproducing locally. |
| 209 | +- As a Backend.AI maintainer, I want every PR that touches the WebUI to run the smoke CLI against the dev cluster so that smoke regressions are caught before customer delivery. |
| 210 | + |
| 211 | +## Acceptance Criteria |
| 212 | + |
| 213 | +### MVP (FR-A + FR-B + FR-C) — verifiable in staging |
| 214 | + |
| 215 | +- [ ] `pnpm --filter backend.ai-webui-smoke-cli run smoke -- run --endpoint <staging-url> --email <email> --password <pw> --output ./report` against staging: |
| 216 | + - Logs in successfully |
| 217 | + - Loads the dashboard |
| 218 | + - Completes one full session lifecycle (create → RUNNING → terminate) |
| 219 | + - Writes a Playwright HTML report at `./report/index/index.html` |
| 220 | + - Exits `0` on all-pass, non-zero on any failure |
| 221 | +- [ ] `bai-smoke list` and `bai-smoke version` work without an endpoint argument. |
| 222 | +- [ ] No CI job in the existing pipeline breaks from the introduction of `@smoke` tags. |
| 223 | + |
| 224 | +### Phase 2 (FR-D … FR-J) |
| 225 | + |
| 226 | +- [ ] A Field-Ops engineer with only the produced archive can complete a smoke run on a fresh, internet-disconnected host on each of: linux-x64, mac-arm64, win-x64. |
| 227 | +- [ ] Report directory `summary.json` and `environment.json` schemas are stable and documented. |
| 228 | +- [ ] App-launcher smoke distinguishes between session-runtime failures and app-proxy failures in the report. |
| 229 | +- [ ] Operator README (KR/EN) is published with the release archive. |
| 230 | + |
| 231 | +## Out of Scope |
| 232 | + |
| 233 | +- Visual regression testing. |
| 234 | +- Backend.AI Manager / Agent / Storage health checks — owned by separate diagnostic tooling. |
| 235 | +- Load and performance testing. |
| 236 | +- SSO / SAML authentication in v1 — deferred. A future `--cookie-file` flag will support pre-issued session cookies; not in this Epic. |
| 237 | +- macOS / Windows code signing — first release ships unsigned with documented quarantine workaround. |
| 238 | +- Public distribution — internal artifact storage only. |
| 239 | +- Custom Playwright reporter — we wrap the bundled HTML reporter and post-process the JSON reporter output instead. |
| 240 | + |
| 241 | +## Risks and Mitigations |
| 242 | + |
| 243 | +| Risk | Mitigation | |
| 244 | +|---|---| |
| 245 | +| SSO / SAML environments block ID+password login | v1 supports ID+password login only; cookie-file injection (`--cookie-file`) is deferred and documented as out of scope |
| 246 | +| Customer plugin/theme variants break selectors | `@smoke` specs must prefer `data-testid` and role/text selectors over CSS / XPath | |
| 247 | +| WebUI version vs CLI version mismatch causes false failures | `bai-smoke doctor` reads the webui SHA from `/manifest.json` (or the `index.html` meta tag) and warns when it falls outside the CLI's compat matrix |
| 248 | +| Unsigned binary triggers macOS Gatekeeper / Windows SmartScreen | Document `xattr -d com.apple.quarantine` and SmartScreen "Run anyway" path in README; pursue signing in a follow-up | |
| 249 | +| Smoke runs create real sessions/vfolders and pollute customer data | All created resources use a `bai-smoke-<timestamp>-` prefix; `bai-smoke cleanup --prefix bai-smoke-` subcommand handles leftover resources after crashes | |
| 250 | +| Self-signed customer certs block TLS | `--insecure-tls` flag maps to `ignoreHTTPSErrors: true`; off by default | |
| 251 | +| App launcher failures are hard to attribute (session vs proxy) | App-launcher smoke (FR-J) emits distinct failure reasons for "session never RUNNING" vs "session running, app-proxy non-2xx" | |
| 252 | +| Customer outbound is blocked, breaking `npx playwright install` | Chromium is bundled in the release archive (FR-G); CLI sets `PLAYWRIGHT_BROWSERS_PATH` at runtime | |
| 253 | + |
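The resource-naming mitigation in the table above can be sketched as follows. The `bai-smoke-<timestamp>-` prefix comes from the spec; the compact UTC timestamp format, the trailing resource-kind suffix, and the helper names are assumptions.

```typescript
const SMOKE_PREFIX = "bai-smoke-";

// Builds a name such as "bai-smoke-<timestamp>-vfolder".
function smokeResourceName(kind: string, now: Date): string {
  // Compact UTC timestamp derived from ISO 8601 (format is an assumption).
  const stamp = now
    .toISOString()
    .replace(/[-:]/g, "")
    .replace(/\.\d{3}Z$/, "Z");
  return `${SMOKE_PREFIX}${stamp}-${kind}`;
}

// Cleanup only ever touches resources carrying the smoke prefix.
function isSmokeResource(name: string, prefix: string = SMOKE_PREFIX): boolean {
  return name.startsWith(prefix);
}

const name = smokeResourceName("vfolder", new Date());
console.log(name, isSmokeResource(name));
```

Because `cleanup --prefix bai-smoke-` matches by prefix alone, every resource a smoke spec creates has to go through a helper like this one; a resource created with an ad-hoc name would survive cleanup.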
| 254 | +## Related Issues |
| 255 | + |
| 256 | +- FR-2871 — Epic: WebUI Smoke CLI — installation verification e2e runner |
| 257 | +- FR-2872 — Spec definition task (this work) |
| 258 | +- `docs/plans/webui-smoke-cli.md` — original approved plan |