
End-to-End Smoke Tests

Real-browser Playwright tests that drive the frontend against a locally running backend. Intended to catch cross-layer regressions (SSE timing, React lifecycle, bus round-trips) that unit and component tests can't.

Quick start

Prereqs: the dev stack is up — frontend + backend containers running against a local KB, reachable by IP. Full rebuild/start flow in docs/containers.md.

container ls | grep -E 'semiont-(frontend|backend)'    # grab both IPs

container run --rm \
  -v "$(git rev-parse --show-toplevel):/workspace" \
  -w /workspace/tests/e2e \
  -e E2E_EMAIL=admin@example.com \
  -e E2E_PASSWORD=password \
  -e E2E_FRONTEND_URL=http://<frontend-ip>:3000 \
  -e E2E_BACKEND_URL=http://<backend-ip>:4000 \
  -e CI=1 \
  mcr.microsoft.com/playwright:v1.61.0-noble \
  npx playwright test

If every test fails in the signIn fixture with "Request failed due to a network error", the Playwright container can't reach the host-published backend — see Container networking.
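A missing variable or a leftover `<frontend-ip>` placeholder tends to surface late, as the same opaque network error. The sketch below validates the four environment variables from the command above before any browser starts. `resolveE2EConfig` and `E2EConfig` are hypothetical names for illustration, not part of the harness, and treating all four variables as required is an assumption.

```typescript
// Hypothetical preflight helper; the real fixtures may read these differently.
interface E2EConfig {
  email: string;
  password: string;
  frontendUrl: string;
  backendUrl: string;
}

const REQUIRED_VARS = [
  "E2E_EMAIL",
  "E2E_PASSWORD",
  "E2E_FRONTEND_URL",
  "E2E_BACKEND_URL",
] as const;

const URL_VARS = ["E2E_FRONTEND_URL", "E2E_BACKEND_URL"] as const;

function resolveE2EConfig(env: Record<string, string | undefined>): E2EConfig {
  // Report every missing variable at once rather than one per run.
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing e2e environment variables: ${missing.join(", ")}`);
  }
  // Catch copy-pasted placeholders such as http://<frontend-ip>:3000.
  for (const name of URL_VARS) {
    const value = env[name] as string;
    if (value.includes("<") || value.includes(">")) {
      throw new Error(`${name} still contains a placeholder: ${value}`);
    }
  }
  return {
    email: env.E2E_EMAIL as string,
    password: env.E2E_PASSWORD as string,
    frontendUrl: env.E2E_FRONTEND_URL as string,
    backendUrl: env.E2E_BACKEND_URL as string,
  };
}
```

In a Node process this would be called as `resolveE2EConfig(process.env)`, for example from a global setup step.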

Container networking: reaching the host

The suite runs in a Playwright container, but the frontend and backend are published on the host. A containerized browser can't use localhost — inside the container that resolves to the container itself, not the host. And pinning a container's bridge IP is fragile: container IPs change on every restart.

The robust target is the host bridge gateway, 192.168.64.1: it's reachable from inside containers, routes to the host's published ports (:3000→frontend, :4000→backend), and its address is stable across restarts.
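As a minimal sketch of that rule, the helper below rewrites a loopback URL to the gateway address and leaves any other host untouched. `toContainerReachable` is a hypothetical name, and the default gateway value is simply the address quoted above; a different container network would need a different value.

```typescript
// Gateway address taken from the paragraph above; adjust for other networks.
const HOST_GATEWAY = "192.168.64.1";

// Hostnames that resolve to the container itself when used from inside it.
const LOOPBACK_HOSTS = ["localhost", "127.0.0.1"];

function toContainerReachable(rawUrl: string, gateway: string = HOST_GATEWAY): string {
  const url = new URL(rawUrl);
  if (LOOPBACK_HOSTS.indexOf(url.hostname) !== -1) {
    // Only the host changes; scheme, port, path, and query are preserved.
    url.hostname = gateway;
  }
  return url.toString();
}
```

For example, `toContainerReachable("http://localhost:4000/api")` yields `http://192.168.64.1:4000/api`, which is the form expected in `E2E_BACKEND_URL`.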

No CORS origin to configure. The backend serves open CORS (Access-Control-Allow-Origin: *, bearer-only — no credentials), so the browser signs in from any origin. This removed an earlier corsOrigin-baked-into-the-image workaround; if you're following older notes that tell you to set services.backend.corsOrigin and rebuild, that config field no longer exists.
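If sign-in fails and CORS is suspected, the response headers can be checked against the policy described above. The sketch below is an illustration only: `isOpenBearerOnlyCors` is a hypothetical helper, and it assumes the headers have already been collected into a plain object (for instance from a manual request against the backend).

```typescript
// Returns true when the headers describe open, credential-free CORS:
// Access-Control-Allow-Origin is "*" and credentials are not allowed.
function isOpenBearerOnlyCors(headers: Record<string, string | undefined>): boolean {
  // Header names are case-insensitive, so normalise before lookup.
  const normalized: Record<string, string> = {};
  for (const key of Object.keys(headers)) {
    const value = headers[key];
    if (value !== undefined) {
      normalized[key.toLowerCase()] = value.trim();
    }
  }
  const allowOrigin = normalized["access-control-allow-origin"];
  const allowCredentials = normalized["access-control-allow-credentials"];
  const credentialsAllowed =
    allowCredentials !== undefined && allowCredentials.toLowerCase() === "true";
  return allowOrigin === "*" && !credentialsAllowed;
}
```

A wildcard origin combined with `Access-Control-Allow-Credentials: true` is rejected by browsers for credentialed requests, which is why the check treats that combination as not matching the bearer-only policy.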

Run the suite against the gateway for both URLs, with the frontend published on host port 3000 (-p 3000:3000; the backend already publishes 4000). No IP-grabbing needed — the gateway doesn't change between runs:

container run --rm \
  -v "$(git rev-parse --show-toplevel):/workspace" \
  -w /workspace/tests/e2e \
  -e E2E_EMAIL=admin@example.com \
  -e E2E_PASSWORD=password \
  -e E2E_FRONTEND_URL=http://192.168.64.1:3000 \
  -e E2E_BACKEND_URL=http://192.168.64.1:4000 \
  -e CI=1 \
  mcr.microsoft.com/playwright:v1.61.0-noble \
  npx playwright test

Docs

Running tests — invocation, single spec, headed, --repeat-each, host vs. container.
Containers and rebuild flow — Apple container CLI, Verdaccio, rebuilding backend/frontend after code changes, IP refresh, Playwright image tag.
Writing tests — spec template, fixture ordering, protocol-level assertions, seed assumptions, selector conventions.
Debugging failures — traces, report UI, JSONL extraction, diagnostic specs, backend-log tailing, and the rule of thumb: instrument, don't speculate.
Bus logging — the __SEMIONT_BUS_LOG__ wire logger, the bus capture fixture, assertion helpers.
Jaeger evidence — the jaeger fixture that pulls matching distributed traces on test teardown and attaches them to the Playwright report.
Page errors — the pageErrors fixture that surfaces browser-side errors (uncaught exceptions, unhandled rejections, console.error calls), which are invisible to bus/jaeger captures. Soft by default; set PAGE_ERRORS_FAIL=1 to make them fail the test once the suite runs clean.
Live monitoring — sibling workflow for bug-hunting on the running stack (no Playwright). Streaming per-container error tails + on-demand snapshot of the last N seconds across logs and Jaeger spans. How "live monitoring caught X" turns into "e2e spec Y".
Known gotchas — sharp edges that took real debugging the first time: crypto.randomUUID, form-field ordering, stale tabs, fixture ordering, etc.
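The soft-versus-hard behaviour of the page-errors entry can be pictured with the sketch below. It is not the fixture's actual code: `CapturedPageError` and `evaluatePageErrors` are hypothetical, and the only detail taken from the text is that `PAGE_ERRORS_FAIL=1` switches captured errors from reported to failing.

```typescript
// Hypothetical shape for an error captured from the browser page.
type PageErrorKind = "exception" | "unhandled-rejection" | "console.error";

interface CapturedPageError {
  kind: PageErrorKind;
  message: string;
}

interface PageErrorVerdict {
  shouldFail: boolean; // true only in hard mode with at least one error
  report: string; // one line per captured error, for attaching to a report
}

function evaluatePageErrors(
  errors: CapturedPageError[],
  env: Record<string, string | undefined>,
): PageErrorVerdict {
  const hardMode = env.PAGE_ERRORS_FAIL === "1";
  const report = errors.map((e) => `[${e.kind}] ${e.message}`).join("\n");
  return { shouldFail: hardMode && errors.length > 0, report };
}
```

In soft mode the report would still be attached to the test output, so the errors stay visible without turning the run red.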

Current tests

Each targets a path that has broken before. A regression in any of them fails the corresponding test.

01-sign-in.spec.ts — sign-in succeeds, lands on the knowledge section.
02-open-resource.spec.ts — open a resource from Discover, content loads.
03-navigate-resources.spec.ts — click between two open-resource sidebar tabs, content actually updates.
04-manual-highlight.spec.ts — select text with motivation=highlight, confirm the highlight is persisted and survives reload.
05-manual-reference.spec.ts — select text with motivation=linking and an entity-type chip, confirm the reference is persisted and survives reload.
06-assisted-reference.spec.ts — click the assist widget's "Annotate" button with entity types selected, confirm the assist dispatch crosses the wire.
07-sign-out-sign-in.spec.ts — sign out, sign back in, confirm the session state rebuilds and bus round-trips still work on the fresh client.
08-hover-beckon.spec.ts — hover over an annotation, confirm the BeckonStateUnit focus/sparkle signal flows. Auto-skips if the fixture resource has no annotations (the template KB starts empty; tests 04 and 05 create annotations when they run).
99-diagnose-entity-types.spec.ts — instance-tracking diagnostic for the entity-types flow (ActorStateUnit / BrowseNamespace construction counts + cache delivery). Not a regression guard — a running dashboard for the singleton-ness invariants the SSE reconnect logic depends on.
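The numeric prefixes matter because test 08 relies on annotations that tests 04 and 05 create. Playwright's documentation describes test files running in alphabetical order when parallel execution is disabled, so with a single worker the prefixes fix the order. The sketch below only illustrates that ordering; `executionOrder` and `runsBefore` are hypothetical helpers, not part of the suite.

```typescript
// Zero-padded numeric prefixes make plain string sorting match numeric order.
function executionOrder(specFiles: string[]): string[] {
  return specFiles.slice().sort();
}

// True when `earlier` appears before `later` in the sorted order.
function runsBefore(earlier: string, later: string, specFiles: string[]): boolean {
  const order = executionOrder(specFiles);
  const a = order.indexOf(earlier);
  const b = order.indexOf(later);
  return a !== -1 && b !== -1 && a < b;
}

const specs = [
  "08-hover-beckon.spec.ts",
  "04-manual-highlight.spec.ts",
  "99-diagnose-entity-types.spec.ts",
  "05-manual-reference.spec.ts",
];

// Both annotation-creating specs precede the hover spec.
const annotationsExistFor08 =
  runsBefore("04-manual-highlight.spec.ts", "08-hover-beckon.spec.ts", specs) &&
  runsBefore("05-manual-reference.spec.ts", "08-hover-beckon.spec.ts", specs);
```

Running 08 on its own against an empty KB skips this ordering, which is the case its auto-skip covers.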

Non-goals (for now)

Not wired into CI. Run locally against a manually-brought-up stack.
Not seeding fixtures. Assumes the target KB has ≥2 resources and ≥1 entity type — true of the default template KB.
Not testing real OAuth. Credentials sign-in only.
Not parallel. Single worker until fixtures are per-test-isolated.
Not cross-browser. Chromium only.
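For orientation, the single-worker and Chromium-only constraints above would look roughly like this in a Playwright config. This is a sketch under assumptions, not the repository's actual playwright.config.ts: the `testDir` value and the `trace` setting in particular are guesses.

```typescript
import { defineConfig, devices } from "@playwright/test";

export default defineConfig({
  // Assumption: spec files live in a specs/ directory next to the config.
  testDir: "./specs",

  // "Not parallel": one worker, files run one after another.
  fullyParallel: false,
  workers: 1,

  use: {
    // The frontend URL comes from the environment, as in the Quick start.
    baseURL: process.env.E2E_FRONTEND_URL,
    // Assumption: keep traces only for failing tests.
    trace: "retain-on-failure",
  },

  // "Not cross-browser": a single Chromium project.
  projects: [
    {
      name: "chromium",
      use: { ...devices["Desktop Chrome"] },
    },
  ],
});
```

Lifting either non-goal would mostly mean raising `workers` once fixtures are isolated per test, or adding further entries to `projects`.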

Running against a freshly-built stack

The e2e harness assumes containers are already up. To bring up a stack that exactly matches the current branch's source:

# 1. Build all @semiont/* packages, publish to local Verdaccio,
#    build the semiont-frontend image.
./scripts/ci/local-build.sh

# 2. From the KB project (typically ../semiont-template-kb), bring up
#    backend / worker / smelter against the local Verdaccio. The
#    --config anthropic flag avoids host-Ollama networking issues
#    (see "Gotchas" below).
cd ../semiont-template-kb
ANTHROPIC_API_KEY="$(op read op://OSS/Anthropic/credential)" \
  NPM_REGISTRY=http://192.168.64.1:4873 \
  .semiont/scripts/start.sh --observe --no-cache --config anthropic \
  --email admin@example.com --password password

# 3. Run the frontend container (separate — start.sh manages backend
#    services only).
container run -d --name semiont-frontend-e2e -p 3000:3000 semiont-frontend

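# 4. Grab IPs and run the e2e suite (see Quick start above). With the
#    frontend published on host port 3000, the gateway URLs from
#    "Container networking" also work and skip the IP lookup.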
container ls | grep -E 'semiont-(frontend-e2e|backend)'

Use --observe on start.sh to pull in a Jaeger sidecar and wire OTEL_EXPORTER_OTLP_ENDPOINT for backend / worker / smelter — useful for inspecting cross-service traces while debugging an e2e failure. Jaeger UI lands on http://localhost:16686.
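Traces can also be pulled without the UI. The sketch below builds a query URL for the `/api/traces` HTTP endpoint that the Jaeger UI itself calls; it is not a formally versioned public API, so treat the parameters as an assumption to verify against the running Jaeger version. `jaegerTraceQueryUrl` is a hypothetical helper, and the service name in the usage note is a placeholder, since the names the backend / worker / smelter register under are not stated here.

```typescript
// Builds a Jaeger trace query URL for one service over a recent time window.
// Jaeger expects start/end as microseconds since the Unix epoch.
function jaegerTraceQueryUrl(
  service: string,
  lookbackSeconds: number,
  nowMs: number,
  baseUrl: string = "http://localhost:16686",
): string {
  const endMicros = nowMs * 1000;
  const startMicros = endMicros - lookbackSeconds * 1000000;
  const url = new URL("/api/traces", baseUrl);
  url.searchParams.set("service", service);
  url.searchParams.set("start", String(startMicros));
  url.searchParams.set("end", String(endMicros));
  url.searchParams.set("limit", "20");
  return url.toString();
}
```

A call such as `jaegerTraceQueryUrl("semiont-backend", 60, Date.now())` produces a URL covering the last minute, which can be fetched right after a failing test to see what the server side recorded.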

Gotchas

Apple Container --rm is unreliable. Stopped semiont-* containers often linger and conflict on next start with Error: container with id semiont-foo already exists. Wipe with container stop $name && container rm $name before retrying.
Host Ollama needs OLLAMA_HOST=0.0.0.0. Otherwise the backend container can't reach it. Either configure Ollama Desktop with launchctl setenv OLLAMA_HOST 0.0.0.0 (and quit/relaunch), or use start.sh --config anthropic to skip Ollama entirely.
Code changes require a backend image rebuild. start.sh --no-cache forces npm install @semiont/backend@latest to re-resolve deps from Verdaccio. Without it, the backend container keeps running the previously built image while the frontend reflects today's code.
SPA tracing is not currently wired. Backend / worker / smelter produce traces; the frontend SPA does not. End-to-end traces therefore start at bus.dispatch:* (server-side EMIT receive) rather than the SPA's bus.emit:*. To enable SPA tracing in a future iteration, you'd need VITE_OTEL_OTLP_ENDPOINT threaded through local-build.sh into the vite build container, plus COLLECTOR_OTLP_HTTP_CORS_ALLOWED_ORIGINS=* on the Jaeger sidecar.
