Epic: FR-2871 (link) Spec Task: FR-2872 (link) Source plan:
.specs/FR-2871-webui-smoke-cli/webui-smoke-cli.md
A new workspace package, backend.ai-webui-smoke-cli, that lets Field-Ops engineers verify Backend.AI WebUI functionality immediately after an on-prem or air-gapped install. The CLI re-uses the existing e2e/ Playwright assets — selecting only specs tagged @smoke — runs them against a customer-provided endpoint, and emits a self-contained HTML + JSON report directory that can be handed off as the install acceptance artifact.
The CLI takes its endpoint and credentials via command-line arguments / environment variables only (no .env file editing), auto-detects the logged-in account's role, and operates without contacting the public internet at test-execution time.
Field-Ops engineers regularly perform on-prem and air-gapped Backend.AI installs. The current e2e/ Playwright suite is excellent for CI and developer environments, but it assumes a full developer toolchain: pnpm, the monorepo checkout, Relay compilation, hand-edited e2e/envs/.env.playwright, an internet-connected npx playwright install, and multiple cross-user accounts (admin + user + user2 + monitor) provisioned for cross-RBAC scenarios.
Customer sites typically provide:
- One endpoint URL (often with a self-signed certificate)
- One service account (often only admin or only user — not both)
- No outbound internet for the installation host
- No developer tooling, sometimes not even Node.js
The result today is that post-install verification is done by hand-clicking through the UI, which is slow, inconsistent across engineers, and frequently misses the most common failure mode at customer sites: the app launcher / wsproxy / app-proxy path. We need a single deliverable that a Field-Ops engineer can drop on the install host, run with one command, and hand the resulting report directory to the customer as proof of acceptance.
Functional requirements are grouped by FR-A through FR-J. FR-A, FR-B, FR-C are MVP. FR-D through FR-J are Phase 2 and are listed here so this spec remains the single source of truth.
Introduce a Playwright tag taxonomy and apply it to a starter set of existing e2e specs.
- Document the tag taxonomy in
e2e/E2E-TEST-NAMING-GUIDELINES.md:@smoke— base smoke set; single account; must complete in 5–10 minutes@smoke-admin— admin-only signals (user mgmt, resource policies, agent list)@smoke-user— end-user signals (session lifecycle, vfolder, environment list)- NOTE: an earlier
@smoke-anytag was considered for role-agnostic specs, but every existing e2e login helper (loginAsAdmin/loginAsUser) hard-codes a named credential. There is no genuinely role-agnostic smoke spec at the helper level, so MVP uses only the two role-scoped tags.
- Tag at least the following existing specs (no-op functional change):
- Login / logout / token expiry
- Dashboard render
- Session lifecycle (create → running → terminate)
- VFolder create / upload / download / delete
- Environment & image list
- (admin) User list, resource policy list, agent list
Acceptance:
-
pnpm exec playwright test --grep @smoke --list(run at the repo root, whereplaywright.config.tslives) lists at minimum the specs above. - No existing e2e CI job changes behaviour (tags are additive metadata).
Create the new workspace package with read-only subcommands first.
- New package at
packages/backend.ai-webui-smoke-cli/registered in the pnpm workspace. -
package.jsonexposes asmokebin and abuildscript (tsup or equivalent — implementation choice). -
bai-smoke listprints the available categories and tags resolved from a catalog file. -
bai-smoke versionprints CLI version, bundled webui SHA placeholder, and Playwright version. - CLI argument parsing layer accepts global options:
--endpoint,--webserver,--email,--password,--password-stdin,--role,--include,--exclude,--pages,--workers,--timeout,--output,--headed,--insecure-tls. - All inputs are accepted as CLI flags or as
BAI_SMOKE_*environment variables. The CLI must not require editing any.envfile.
Acceptance:
-
pnpm --filter backend.ai-webui-smoke-cli run smoke -- listprints the catalog. -
pnpm --filter backend.ai-webui-smoke-cli run smoke -- versionprints version info. -
pnpm --filter backend.ai-webui-smoke-cli run smoke -- run --helpdocuments every option above.
Wire the CLI to actually run tagged specs against the supplied endpoint.
-
playwright.smoke.config.tsimports the existing repo-rootplaywright.config.tsand overrides:testDirto point at the existinge2e/directory (the repo-root config already setstestDir: './e2e')grepto apply--include/--excludetag filtersreporterto[['html', { outputFolder: <output>/index, open: 'never' }], ['list']]use.baseURLfrom--endpointuse.ignoreHTTPSErrorsfrom--insecure-tlsuse.video: 'retain-on-failure',use.trace: 'retain-on-failure'
-
runner.tsprogrammatically invokes Playwright with the smoke config, injecting credentials via process env (never written to disk). - Single-account authentication: the runner logs in once with the supplied email/password and reuses storage state across specs.
- Specs requiring multiple accounts (cross-user RBAC) are excluded from the
@smokeselection by tag convention.
Acceptance (verifiable in staging):
- Running
pnpm --filter backend.ai-webui-smoke-cli run smoke -- run --endpoint <staging> --email <email> --password <pw> --output ./reportagainst a staging Backend.AI cluster:- Logs in successfully with the supplied credentials
- Loads the dashboard
- Completes at minimum one session lifecycle (create batch session → polled to
RUNNING→ terminate cleanly) - Writes a Playwright HTML report at
./report/index/index.html - Exits with code
0on all-pass, non-zero on any failure
- No single binary or air-gap bundling is required for MVP — those are FR-G and FR-H.
-
bai-smoke doctorchecks: endpoint reachability, TLS, login succeeds, browser binary present at expected path, free disk space ≥ configured minimum, and warns on webui-version-vs-CLI-version mismatch by reading the webui SHA from/manifest.jsonorindex.htmlmeta. -
bai-smoke preflight(or a--preflight-onlyflag onrun) performs the doctor checks then exits without running specs. - Role auto-detection: after login, the runner inspects the existing login response /
/server/login-checksignal to classify the account asadmin,user, ormonitor, and includes only tags compatible with the detected role. -
--role auto(default) uses detection; explicit--role admin|user|monitoroverrides. - Detection must not introduce new backend APIs.
Acceptance:
- Running
doctoragainst an unreachable endpoint exits non-zero with a precise reason. - Running
doctoragainst a wrong-credential endpoint reports "login failed" distinctly from "endpoint unreachable". - With
--role auto, an admin login runs@smoke + @smoke-admin; a user login runs@smoke + @smoke-user. (See FR-A note: the originally-proposed@smoke-anytag was dropped because no genuinely role-agnostic smoke spec exists at the helper level.)
-
After Playwright finishes, the runner generates the following inside
--output:smoke-report-<timestamp>/ ├── index/ # Playwright html reporter output │ └── index.html ├── summary.json # machine-readable {total, passed, failed, skipped, durationMs, byCategory: {...}} ├── environment.json # {endpoint, role, cliVersion, webuiSha, os, playwrightVersion, chromiumVersion} ├── traces/ # failed cases only ├── videos/ # failed cases only (passing cases pruned) └── logs/ ├── console-*.log └── network-*.har # failures only -
Each failed case includes a copy-pasteable diagnostic block: endpoint, role, webui SHA, last 20 console lines, 4xx/5xx network summary.
-
The runner does not write any custom Playwright reporter; post-processing reads the JSON reporter output and augments it.
Acceptance:
-
summary.jsonround-trips throughJSON.parseand contains all fields above. - After a forced failure,
traces/andvideos/for that case are non-empty; for passing cases,videos/is empty.
- Audit
e2e/utils/test-util.tsfor assumptions requiring multiple seeded accounts (admin, user, user2, monitor). - Extract single-account-safe helpers into a separate module that smoke specs use.
- No smoke spec may import a helper that requires more than one account.
Acceptance:
- Smoke specs run green against a fixture environment provisioned with exactly one account.
- Build artifact ships a
browsers/directory with the platform-matched Playwright Chromium. - At runtime, the CLI sets
PLAYWRIGHT_BROWSERS_PATHto that bundled path before launching Playwright. -
--insecure-tlsflag (defaultfalse) maps to PlaywrightignoreHTTPSErrors: true. - All external-origin requests (CDNs, Google Fonts) are blocked via
context.route('**/*', …)allow-list during spec execution. - No
npx playwright installis invoked at runtime.
Acceptance:
- On a host with outbound traffic blocked (verified via firewall rules), the CLI completes a smoke run successfully against an in-network endpoint.
- Build scripts produce platform binaries:
bai-smoke-linux-x64,bai-smoke-mac-arm64,bai-smoke-win-x64.exe. - Bundled artifact is
.tar.gz(linux/mac) or.zip(win) containing the binary +browsers/+tests/+LICENSES.md+README.md. - Release publishes to internal artifact storage only (private GitHub Release with restricted access is acceptable). No public release.
- Release tag format:
webui-vX.Y.Z+smoke.N.
Acceptance:
- An engineer with no Node.js installed can extract the archive on each supported platform and run
./bai-smoke run …successfully.
- One-page README in English and Korean covering: extract → chmod → run → read report → known issues.
- Troubleshooting section covers: self-signed cert, macOS quarantine workaround, Windows SmartScreen workaround, "login failed" vs "endpoint unreachable" disambiguation, leftover resource cleanup (
bai-smoke cleanup --prefix bai-smoke-).
Acceptance:
- README fits on one printed page in either language.
Expand the @smoke catalog beyond the FR-A starter set. App launcher is priority 1.
- App launcher (P1) — start a session, launch one app (jupyter or terminal), confirm the app page actually opens through the app-proxy. This is the most frequent failure mode at customer sites.
- Model serving — basic serving endpoint deploy + invocation.
- Basic RBAC — single-account-safe permission checks (no cross-user assertions).
Acceptance:
- App launcher smoke fails loudly when wsproxy / app-proxy is misconfigured (distinguish "session never reached RUNNING" from "session running, app proxy 502").
All of the following are mandatory for the final delivered artifact (Phase 2 completion). MVP relaxes the air-gap and packaging requirements.
- Air-gap operation — at test execution time the CLI must not contact any host other than the supplied webserver / webui endpoint. Verified by running on a host with outbound traffic blocked.
- Single-account assumption —
@smokespecs must succeed with exactly one provided account. Cross-user RBAC scenarios are excluded from@smokeby convention. - Role auto-detection — login response is the source of truth. No new backend API.
- Self-contained report — the
--outputdirectory must be portable (zip → email → open). Assets are local; no CDN dependencies inindex.html. - Failure-only retention — traces, videos, HAR captures are retained for failed cases only to keep report size manageable.
- No secrets to disk — credentials supplied via flag, env, or
--password-stdinnever end up in the report directory or on disk.
- As a Field-Ops engineer, I want to run one command against a freshly installed Backend.AI cluster so that I can confirm the WebUI works without clicking through it by hand.
- As a Field-Ops engineer working on an air-gapped customer site, I want the CLI to run without internet access so that I do not need to coordinate temporary outbound rules.
- As a Field-Ops engineer holding only one customer-issued account, I want the CLI to auto-adapt to that account's role so that I do not need to ask the customer for additional admin/user pairs.
- As a Field-Ops engineer, I want to hand the customer a report directory so that we have a written acceptance artifact for the install.
- As a Support engineer triaging a failed install, I want each failed case to include a copy-pasteable diagnostic block so that I can open a ticket without re-reproducing locally.
- As a Backend.AI maintainer, I want every PR that touches the WebUI to run the smoke CLI against the dev cluster so that smoke regressions are caught before customer delivery.
-
pnpm --filter backend.ai-webui-smoke-cli run smoke -- run --endpoint <staging-url> --email <email> --password <pw> --output ./reportagainst staging:- Logs in successfully
- Loads the dashboard
- Completes one full session lifecycle (create → RUNNING → terminate)
- Writes a Playwright HTML report at
./report/index/index.html - Exits
0on all-pass, non-zero on any failure
-
bai-smoke listandbai-smoke versionwork without an endpoint argument. - No CI job in the existing pipeline breaks from the introduction of
@smoketags.
- A Field-Ops engineer with only the produced archive can complete a smoke run on a fresh, internet-disconnected host on each of: linux-x64, mac-arm64, win-x64.
- Report directory
summary.jsonandenvironment.jsonschemas are stable and documented. - App-launcher smoke distinguishes between session-runtime failures and app-proxy failures in the report.
- Operator README (KR/EN) is published with the release archive.
- Visual regression testing.
- Backend.AI Manager / Agent / Storage health checks — owned by separate diagnostic tooling.
- Load and performance testing.
- SSO / SAML authentication in v1 — deferred. A future
--cookie-fileflag will support pre-issued session cookies; not in this Epic. - macOS / Windows code signing — first release ships unsigned with documented quarantine workaround.
- Public distribution — internal artifact storage only.
- Custom Playwright reporter — we wrap the bundled HTML reporter and post-process the JSON reporter output instead.
| Risk | Mitigation |
|---|---|
| SSO / SAML environments block ID+password login | v1 supports basic auth only; cookie-file injection deferred and documented |
| Customer plugin/theme variants break selectors | @smoke specs must prefer data-testid and role/text selectors over CSS / XPath |
| WebUI version vs CLI version mismatch causes false failures | bai-smoke doctor reads webui SHA from /manifest.json and warns against a compat matrix |
| Unsigned binary triggers macOS Gatekeeper / Windows SmartScreen | Document xattr -d com.apple.quarantine and SmartScreen "Run anyway" path in README; pursue signing in a follow-up |
| Smoke runs create real sessions/vfolders and pollute customer data | All created resources use a bai-smoke-<timestamp>- prefix; bai-smoke cleanup --prefix bai-smoke- subcommand handles leftover resources after crashes |
| Self-signed customer certs block TLS | --insecure-tls flag maps to ignoreHTTPSErrors: true; off by default |
| App launcher failures are hard to attribute (session vs proxy) | App-launcher smoke (FR-J) emits distinct failure reasons for "session never RUNNING" vs "session running, app-proxy non-2xx" |
Customer outbound is blocked, breaking npx playwright install |
Chromium is bundled in the release archive (FR-G); CLI sets PLAYWRIGHT_BROWSERS_PATH at runtime |
- FR-2871 — Epic: WebUI Smoke CLI — installation verification e2e runner
- FR-2872 — Spec definition task (this work)
docs/plans/webui-smoke-cli.md— original approved plan