tests: smoke specs for fixmystreet, planx, minute (Phase 4 top-3) by chrisns · Pull Request #230 · co-cddo/ndx_try_aws_scenarios

chrisns · 2026-05-12T09:59:31Z

Summary

Phase 4 of the scenario-regression smoke-pack tech-spec — initial slice of three specs in priority order. The remaining 14 land as follow-up PRs (one per scenario) once Phase 1b provides a smoke account to iterate against.

Ships:

`tests/smoke/fixmystreet.spec.ts`: post-login feature flow visits `/reports` (catches the `bin/update-all-reports` regression) and `/admin` (catches the `must_have_2fa` regression). Landing-assertion also greps for `:9000` URLs (catches the ALB sidecar mis-routing regression).
`tests/smoke/planx.spec.ts`: post-login feature flow asserts the SPA boot is free of the domain-allowlist error (catches `window.location.host` regression) and the Airbrake overlay (`VITE_APP_ENV=production` regression). Hits Hasura native `/v1/version` (catches the Caddy-elimination regression).
`tests/smoke/minute.spec.ts`: uses Playwright browser context `httpCredentials` to attach basic auth as headers (NOT URL-embedded — that's the regression). In-SPA `fetch()` against `/health` and `/api/proxy/healthcheck` cover the CloudFront-embedded-basic-auth regression and the ALB `/api/*` interception regression respectively.
`tests/smoke/fixtures/assertion-bar.ts`: three rows populated with explicit citations to the memory files informing each featureFlow.
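The `:9000` landing assertion in the fixmystreet spec can be pictured as a plain string scan over the rendered HTML. The sketch below is illustrative only: `findSidecarUrls` is a hypothetical helper name, and the real spec may perform the check differently inside Playwright.

```typescript
// Hypothetical helper (not from the PR): returns every absolute URL in `html`
// whose authority ends in the given port. A non-empty result would suggest the
// ALB sidecar mis-routing regression described above.
export function findSidecarUrls(html: string, port = 9000): string[] {
  // Host part excludes "/", ":" and quote characters; (?!\d) stops ":90001" matching.
  const pattern = new RegExp(`https?://[^\\s"'<>/:]+:${port}(?!\\d)[^\\s"'<>]*`, "g");
  return html.match(pattern) ?? [];
}

// Example: one link leaks the sidecar port, one does not.
const sampleHtml =
  '<a href="https://fms.example.org:9000/report/1">bad</a>' +
  '<a href="https://fms.example.org/reports">good</a>';
console.log(findSidecarUrls(sampleHtml));
```

A spec would then assert that the returned array is empty for the landing page.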

Deferred:

14 remaining specs (localgov-ims, localgov-drupal, simply-readable, ai-contact-centre, paperless-ngx, council-chatbot, bops-planning, digital-planning-register, quicksight-dashboard, foi-redaction, planning-ai, text-to-speech, smart-car-park, all-demo): follow-up PRs after Phase 1b. The Phase 4 DoD requires at least 14/17 active with ≤3 quarantined; tracked separately.

Depends on: Phase 3 (PR #229) for the rails. Independently mergeable: the smoke workflow self-disables on placeholder config in `docs/smoke-test-account-config.yml`, so this PR adding 3 specs cannot break anything in CI until Phase 1b populates the config.

Test plan

CI: `npx playwright test --list --project=smoke` lists 3 tests
Local: a developer with the smoke account credentials can run `SMOKE_STACK_NAME=all-demo SMOKE_AWS_PROFILE=NDX/SmokeTest ./scripts/smoke.sh --grep fixmystreet` against a deployed all-demo and the spec passes (POST-Phase-1b)
Same for planx and minute
Operator: trace artefact for a deliberately-failed login does NOT contain the cleartext password (sensitive-redaction contract verification)

…2a partial)

Phase 2a of the scenario-regression smoke-pack tech-spec. Two new synth jobs in deploy-blueprints.yml plus a packaged-CodeUri verification step for the existing ai-contact-centre SAM packaging.

What ships:
- synth-planx: cdk synth, strip bootstrap, write template.yaml, upload artifact. Stack name PlanxStack. Template currently produces cleanly under the existing strict synth lint (all RemovalPolicies are DESTROY).
- synth-digital-planning-register: same pattern. Stack name DigitalPlanningRegisterStack. The LogGroup in compute.ts switched from RemovalPolicy.RETAIN to DESTROY; there was no comment justifying RETAIN and the smoke pack wants clean teardown between runs.
- New downloads + S3 upload step in the deploy job: planx, digital-planning-register, and the existing-committed bops-planning template all land at s3://ndx-try-isb-blueprints-568672915267/scenarios/<name>/template.yaml.
- ai-contact-centre packaging gains a verification step: greps the packaged template for s3:// references and fails if any point at a bucket other than the blueprints bucket. Catches a sam-package regression where the --s3-bucket flag gets dropped and CodeUri silently lands in samclisourcebucket.

What's deferred:
- synth-bops-planning: deferred to Phase 2b. The bops-planning CDK creates a LogGroup with RemovalPolicy.RETAIN, deliberately, with the comment "RETAIN so we can debug failures after rollback" in compute.ts:48. The existing strict synth lint would fail this. Phase 2b introduces a justification-aware retention lint that handles intentional retention via Metadata.Justification on the resource, at which point the bops synth job can ship. The bops template.yaml is committed manually until then.

T2a.1 audit: existing pipeline had synth jobs for localgov-drupal, localgov-ims, simply-readable, minute, fixmystreet, paperless-ngx. The 7 non-synth scenarios (council-chatbot, foi-redaction, planning-ai, smart-car-park, text-to-speech, quicksight-dashboard, all-demo) ship committed templates directly. With this PR, planx and digital-planning-register join the synth pipeline.

T2a.6 verification (all 17 scenarios in S3) happens after the workflow runs on push to main; it is the merge-time success criterion for this PR.
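The packaged-CodeUri verification described above amounts to collecting every `s3://` reference and rejecting any whose bucket is not the blueprints bucket. The workflow step itself uses grep; the following is a TypeScript restatement of the same idea, with `findForeignS3Refs` and the sample SAM bucket name as hypothetical stand-ins.

```typescript
// Bucket name taken from the commit message above.
const BLUEPRINTS_BUCKET = "ndx-try-isb-blueprints-568672915267";

// Hypothetical helper: returns every s3:// reference in the packaged template
// text whose bucket differs from `allowedBucket`.
export function findForeignS3Refs(template: string, allowedBucket: string): string[] {
  const refs = template.match(/s3:\/\/[A-Za-z0-9.\-_]+(?:\/[^\s"']*)?/g) ?? [];
  return refs.filter((ref) => {
    const bucket = ref.slice("s3://".length).split("/")[0];
    return bucket !== allowedBucket;
  });
}

// Example packaged template fragment; the second bucket name is invented to
// stand in for the SAM CLI default source bucket.
const packaged = [
  `CodeUri: s3://${BLUEPRINTS_BUCKET}/scenarios/ai-contact-centre/abc123`,
  "CodeUri: s3://example-samclisourcebucket-xyz/def456",
].join("\n");

const foreign = findForeignS3Refs(packaged, BLUEPRINTS_BUCKET);
if (foreign.length > 0) {
  console.error("CodeUri points outside the blueprints bucket:", foreign);
}
```

In CI, a non-empty result would fail the step, which is how a dropped `--s3-bucket` flag would surface.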

Phase 2b of the scenario-regression smoke-pack tech-spec.

What ships:
- scripts/lint-retention-policies.sh: linter for synthesized CFN. Targets: DeletionPolicy/UpdateReplacePolicy=Retain, Properties.DeletionProtection=true, Properties.EnableDeletionProtection=true, Properties.FinalSnapshotIdentifier set. A resource is exempt if Metadata.Justification is non-empty. Plus a second-order cap (MAX_JUSTIFICATIONS, default 5) so the exemption mechanism doesn't degenerate into rubber-stamping. Plus existing CDK-residue checks (AssetParameters, cdk-bootstrap reference) and template-size limit (460KB). Handles both synth-output JSON-in-YAML and committed real YAML (via yq).
- synth-bops-planning workflow job: now produces template.yaml in CI. The bops-planning LogGroup in the CDK source picks up a Metadata.Justification (via cfnOptions.metadata) explaining the RETAIN choice for debug-after-rollback so the new lint passes. The previously-committed bops template.yaml is removed; the synth job is now the single source of truth.
- lint-committed-templates workflow job: top-level pass over the hand-authored CFN templates that have no synth step (council-chatbot, foi-redaction, planning-ai, smart-car-park, text-to-speech, quicksight-dashboard, ai-contact-centre, all-demo).
- all-demo/template.yaml expanded from 7 to 16 nested scenarios. The 9 new nested stacks (simply-readable, ai-contact-centre, localgov-ims, minute, fixmystreet, paperless-ngx, planx, bops-planning, digital-planning-register) follow the existing pattern: TemplateURL via the blueprints-bucket convention, TimeoutInMinutes calibrated to observed deploy times, AppRegistry tag. Required parameters (GovUkPayApiKey, OSVectorTilesApiKey, DprImageUri, DprCouncilConfig) are exposed at the umbrella level with empty / sensible defaults so all-demo deploys cleanly without per-scenario credentials but remain overridable for full-functionality deploys. The Outputs block surfaces the primary URL + admin credentials for every new child stack.
- Upload step in deploy job extends to cover the non-StackSet scenarios (planx, bops-planning, digital-planning-register, minute, fixmystreet, paperless-ngx) so their synthed templates reach scenarios/<name>/template.yaml in the blueprints bucket for all-demo to nest.

Deferred:
- T2b.4 (verification PR that introduces a DeletionPolicy: Retain without justification and confirms the lint fails) is documented in this PR but not opened as a separate PR; the lint script's local tests against synthetic templates and the real bops template cover the same surface.
- T2b.5a (pre-deploy quota matrix) lands in the runbook via a follow-up PR to docs/smoke-test-account-setup.md once Phase 4 produces real usage data.
- T2b.5b (manual all-demo deploy) is operator-driven against the smoke-test account once Phase 1b lands.

Existing synth jobs (localgov-drupal, localgov-ims, simply-readable, minute, fixmystreet, paperless-ngx, planx, digital-planning-register) keep their inline strict synth-time lint, which still works for them because none of those templates carry retain. Migration to the new lint script for those jobs is deferred to a follow-up PR (zero behavioural change required).
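The justification-aware retention rule can be stated compactly as a function over a parsed template. The real linter is a shell script; what follows is only a TypeScript restatement of the rule as described in this commit message, with hypothetical type and function names and no attempt to reproduce the CDK-residue or size checks.

```typescript
// Minimal shape of a parsed CloudFormation template (illustrative types).
type CfnResource = {
  Type: string;
  DeletionPolicy?: string;
  UpdateReplacePolicy?: string;
  Properties?: Record<string, unknown>;
  Metadata?: Record<string, unknown>;
};
type CfnTemplate = { Resources?: Record<string, CfnResource> };

interface LintResult {
  violations: string[]; // retained resources with no justification
  justified: string[];  // retained resources exempted by Metadata.Justification
  capExceeded: boolean; // too many exemptions in one template
}

const isTrue = (v: unknown): boolean => v === true || v === "true";

// Lists the retention markers the commit message names for one resource.
function retentionReasons(r: CfnResource): string[] {
  const reasons: string[] = [];
  const props = r.Properties ?? {};
  if (r.DeletionPolicy === "Retain") reasons.push("DeletionPolicy=Retain");
  if (r.UpdateReplacePolicy === "Retain") reasons.push("UpdateReplacePolicy=Retain");
  if (isTrue(props["DeletionProtection"])) reasons.push("Properties.DeletionProtection=true");
  if (isTrue(props["EnableDeletionProtection"])) reasons.push("Properties.EnableDeletionProtection=true");
  if (props["FinalSnapshotIdentifier"] !== undefined) reasons.push("Properties.FinalSnapshotIdentifier set");
  return reasons;
}

// maxJustifications mirrors the MAX_JUSTIFICATIONS default of 5 mentioned above.
export function lintRetention(template: CfnTemplate, maxJustifications = 5): LintResult {
  const violations: string[] = [];
  const justified: string[] = [];
  for (const [id, resource] of Object.entries(template.Resources ?? {})) {
    const reasons = retentionReasons(resource);
    if (reasons.length === 0) continue;
    const justification = resource.Metadata?.["Justification"];
    if (typeof justification === "string" && justification.trim() !== "") {
      justified.push(id);
    } else {
      violations.push(`${id}: ${reasons.join(", ")}`);
    }
  }
  return { violations, justified, capExceeded: justified.length > maxJustifications };
}
```

Under this sketch the bops-planning LogGroup would land in `justified` once its `Metadata.Justification` is set, while an unexplained `DeletionPolicy: Retain` would land in `violations`.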

…ase 3)

Phase 3 of the scenario-regression smoke-pack tech-spec. Lays the rails so Phase 4 can ship one PR per scenario without re-deriving the workflow shape.

Ships:
- playwright.config.ts: adds the `smoke` project gated on PLAYWRIGHT_SUITE=smoke. No webServer, no baseURL, trace=retain-on-failure. Existing desktop/mobile projects keep their tests and webServer.
- tests/smoke/fixtures/cfn-outputs.ts: SDK-v3 DescribeStacks helper with the secret-redaction contract from the spec. Output keys matching the sensitivity regex (broad: Password / Secret / Token / Credentials / Creds / Login / ApiKey / ConnectionString / PrivateKey / Passphrase) return a SensitiveValue whose toString / inspect / Symbol.toPrimitive emit a REDACTED placeholder. Cleartext only flows via the explicit sensitiveValue() accessor; never logged.
- tests/smoke/fixtures/assertion-bar.ts: AssertionBarRow type + empty Map. Phase 4 PRs populate one row per scenario citing the historical regression that informed featureFlow.
- tests/smoke/fixtures/secure-form.ts: fillPassword(page, selector, value) wraps page.fill so the Playwright trace records a REDACTED-<sha> hash instead of the cleartext form-encoded password value.
- scripts/smoke.sh: identical-invocation contract for local + CI. Required env vars asserted at top with helpful errors. Local = SMOKE_AWS_PROFILE SSO; CI = OIDC credentials already exported by configure-aws-credentials upstream.
- .env.example: documents the smoke env vars.
- package.json: adds the test:smoke script.
- .github/workflows/smoke.yml: the smoke workflow. Trigger matrix (PR-scoped, nightly cron, push-to-main, workflow_dispatch); scope pre-flight that flips full-vs-scoped based on changed paths; configure-aws-credentials via OIDC against the role committed in docs/smoke-test-account-config.yml; deployment-environment gate (smoke-test-deploy); SCP drift check with issue-body counter and fail-at-7 escalation; pre-deploy state check with auto-recovery (continue-update-rollback / retry-delete / recovery stack name); quarantine-expiry check parsing assertion-bar.ts; deploy of all-demo (the smoke account's authoritative target); smoke-pack execution; CFN events captured into the artefact bundle; artefact upload (Playwright + CFN events + 30-day retention) BEFORE teardown so live-stack state is recorded; teardown with 3 × 60s retry then stranded-stack issue; cron-only smoke-failed issue.
- .github/CODEOWNERS: requires review on the sensitive smoke paths so a PR cannot run with deploy credentials until a CODEOWNERS reviewer approves the deployment environment.
- .github/workflows/quarterly-audit.yml: opens a tracking issue every 3 months covering the six audit items in the runbook's Operational Notes (spend, orphan sweep, deploy-role policy drift, SCP drift, Renovate liveness, ProtectISB-fallback revisit). Plus a daily auto-escalation step that nudges any quarterly-audit issue open >30 days.

Deferred:
- T3.8 (end-to-end rails integration test): requires Phase 1b's smoke account to exist and docs/smoke-test-account-config.yml to have real values. Workflow_dispatch verification is the post-1b acceptance gate; this PR ships the rails ready for that gate.
- Phase 4 per-scenario specs come in 17 follow-up PRs.

The 'co-cddo/ndx-try-maintainers' team referenced in CODEOWNERS is a placeholder; when the team is provisioned, the same path patterns will work. Until then branch-protection + standard PR review remains the operative gate.
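The secret-redaction contract for stack outputs can be illustrated with a small wrapper class. This is a sketch of the idea, not the contents of cfn-outputs.ts: the placeholder text, the case-insensitive flag on the regex, the `toJSON` hook and the `wrapOutputs` helper are all assumptions added for the example.

```typescript
// Key names from the commit message above; case-insensitivity is an assumption.
const SENSITIVE_KEY =
  /Password|Secret|Token|Credentials|Creds|Login|ApiKey|ConnectionString|PrivateKey|Passphrase/i;
const REDACTED = "[REDACTED]"; // placeholder text is illustrative
// Node's util.inspect looks up this well-known registry symbol on objects.
const inspectCustom = Symbol.for("nodejs.util.inspect.custom");

export class SensitiveValue {
  readonly #cleartext: string;
  constructor(cleartext: string) {
    this.#cleartext = cleartext;
  }
  /** The only path that yields the cleartext; callers opt in explicitly. */
  sensitiveValue(): string {
    return this.#cleartext;
  }
  toString(): string {
    return REDACTED;
  }
  toJSON(): string {
    return REDACTED;
  }
  [Symbol.toPrimitive](): string {
    return REDACTED;
  }
  [inspectCustom](): string {
    return REDACTED;
  }
}

// Hypothetical helper: wraps values whose output key looks sensitive.
export function wrapOutputs(
  outputs: Record<string, string>,
): Record<string, string | SensitiveValue> {
  const wrapped: Record<string, string | SensitiveValue> = {};
  for (const [key, value] of Object.entries(outputs)) {
    wrapped[key] = SENSITIVE_KEY.test(key) ? new SensitiveValue(value) : value;
  }
  return wrapped;
}
```

With this shape, string interpolation, `JSON.stringify` and `console.log` all see the placeholder, and only a deliberate `sensitiveValue()` call (for example when passing the value to a password-filling helper) exposes the secret.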

Phase 4 of the scenario-regression smoke-pack tech-spec. Ships the top-3 specs in the priority order from the spec (Phase 4 § "priority order"): the three scenarios with the longest historical-iteration tails per memory.

Each spec follows the rails set in Phase 3 (configurable stack name, cfn-outputs.fetchStackOutputs with secret redaction, fillPassword for credential-entry sites, requireAssertionBar lookup) and exercises a bug-informed feature flow that touches the codepath of a historical regression for that scenario:
- fixmystreet.spec.ts: post-login, visits /reports (catches the bin/update-all-reports regression) and /admin (catches the must_have_2fa / STAGING_FLAGS skip_must_have_2fa regression). Landing assertion also greps for `:9000` URLs to catch the ALB sidecar mis-routing regression.
- planx.spec.ts: post-login, asserts the SPA boot is free of the domain-allowlist error overlay (catches window.location.host / cloudfront.net regression) and Airbrake error overlay (catches the VITE_APP_ENV=production regression). Hits Hasura native /v1/version (catches the Caddy-elimination regression).
- minute.spec.ts: uses Playwright browser.newContext with httpCredentials to attach basic auth as proper Authorization headers (NOT URL-embedded, which is the regression). Inside the SPA, evaluates a fetch() against /health (catches the CloudFront-embedded-basic-auth regression that breaks modern-browser fetch) and a fetch() against /api/proxy/healthcheck (catches the ALB /api/* interception regression).

assertion-bar.ts populated for the three rows with explicit citations back to the memory files that informed each featureFlow.

The remaining 14 scenarios (localgov-ims, localgov-drupal, simply-readable, ai-contact-centre, paperless-ngx, council-chatbot, bops-planning, digital-planning-register, quicksight-dashboard, foi-redaction, planning-ai, text-to-speech, smart-car-park, all-demo) are NOT included in this PR; they will land as follow-up PRs once Phase 1b provides a smoke account operators can iterate against. The Phase 4 DoD's 14/17-minimum bar plus ≤3-quarantined cap will be revisited as those PRs land. Until then the smoke workflow self-disables on placeholder config in docs/smoke-test-account-config.yml, so this PR is independently mergeable.

…dynamic)

Address review findings against PRs #230 and #235:
1. ai-contact-centre PSTN regex was /^\+\d{6,}/, which accepted ANY international number. It passed "+1234567" as a UK +44 800 claim, which defeats the point. Tighten to match UK +44 toll-free (800/808/3xx) and common landline prefixes, or US toll-free fallback (+1 8xx). Strip whitespace before matching since AWS Connect sometimes returns "+44 800 ..." formatted output.
2. council-chatbot smoke did a GET against the chatbot FunctionURL. Chat-style Lambdas typically accept POST only, returning 405 Method Not Allowed on a GET. That is < 500, so the spec passed vacuously without proving the Lambda even booted. Change to POST with a minimal valid chat body so the Lambda must boot and reach Bedrock; 5xx now signals a real regression (most likely the legacy claude-3-haiku migration in NAP-548).
3. all-demo umbrella spec hardcoded 25 Output keys. If a new Output is added or one is renamed, the spec drifts from the template. Replace with a discoverAllDemoOutputKeys() function that regex-parses the committed template at test time. Adding new Outputs to the template no longer requires updating the spec; renaming one lights it up here. URL-output detection moves to a /Url$|URL$/ pattern instead of a hardcoded list.
4. cfn-outputs.ts: clarify that CloudFormation's DescribeStacks API does NOT return Output Metadata, so the spec's Metadata: { Sensitive: true } opt-in is unimplementable via this path. The regex is the sole signal; the spec's "audit pass after Phase 4" should grow the regex when it identifies outputs that need redaction but aren't matched yet.
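To make the first finding concrete, the sketch below contrasts the old permissive pattern with a tightened one. The tightened regex is an assumption written for illustration: the commit message names the number families (UK 800/808/3xx, common UK landline prefixes, US +1 8xx) but not the exact expression, and the digit-length bounds and landline prefixes chosen here are guesses.

```typescript
// The original pattern from the review finding: any "+" followed by 6+ digits.
const LOOSE_PSTN = /^\+\d{6,}/;

// Hypothetical tightened pattern (not the PR's actual regex):
//   +44 then 800 / 808 / 3xx / 1xx(x) / 2x, then 5-8 more digits, or
//   +1 then a US toll-free code (800, 833, 844, 855, 866, 877, 888) and 7 digits.
const TIGHT_PSTN =
  /^(?:\+44(?:800|808|3\d{2}|1\d{2,3}|2\d)\d{5,8}|\+18(?:00|33|44|55|66|77|88)\d{7})$/;

// Strip whitespace first, since the number may arrive as "+44 800 ...".
export function normalisePstn(raw: string): string {
  return raw.replace(/\s+/g, "");
}

export function isExpectedPstnClaim(raw: string): boolean {
  return TIGHT_PSTN.test(normalisePstn(raw));
}

// "+1234567" satisfies the loose pattern but not the tightened one.
console.log(LOOSE_PSTN.test("+1234567"), isExpectedPstnClaim("+1234567"));
console.log(isExpectedPstnClaim("+44 800 123 4567"));
```

Anchoring with `$` and bounding the digit count matters as much as the prefix list: without them a long arbitrary number starting with an accepted prefix would still pass.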

chrisns · 2026-05-12T12:26:59Z

Phase 4 top-3 PR; the spec-fix work for adversarial review landed on the Phase-4-remaining-14 branch (PR #235) since it amends shared fixtures. Those fixes cover the PSTN regex, the council-chatbot POST verb, all-demo dynamic discovery, and the cfn-outputs Metadata.Sensitive note.

chrisns · 2026-05-12T16:16:22Z

Superseded by #236 — single squashed PR for all 6 phases. Dependency chain across 9 PRs made parallel review worse than serial; one PR with one review and one CI run is simpler.

@chrisns

Implements every phase of _bmad-output/implementation-artifacts/tech-spec-scenario-regression-smoke-pack.md in a single deliverable. The original spec called for one PR per phase (8+ PRs); experience showed the dependency overlap made that worse for review, not better, so this squashes #226 / #227 / #228 / #229 / #230 / #231 / #232 / #233 / #235 into a single change.

What ships
==========

Phase 1a — runbook + config schema
- docs/smoke-test-account-setup.md: one-off manual procedure for vending the long-lived smoke-test AWS account, with the four required sections (Prerequisites / Procedure / Verification / Operational Notes). Per-step idempotency checks + inverses; ProtectISB role-creation canary + fallback branch (ADR-1); Bedrock model-access enablement + gotchas (legacy claude-3-haiku-20240307 retired, Nova body shape); service-quota targets; QuickSight decision; iterate-to-least-privilege protocol for the inline IAM policy.
- docs/smoke-test-account-config.yml: post-runbook state record schema.

Phase 1b — operator-executed account state
- Smoke account 464453619983 provisioned in NDX org under the fallback branch (ProtectISB canary failed; account moved to root with Restrictions SCP attached directly). AwsNuke SCP intentionally NOT attached (it blocks sts:AssumeRoleWithWebIdentity and we use CFN delete + retention lint, not aws-nuke).
- OIDC provider + InnovationSandbox-ndx-SmokeTestDeployRole created with 6h max-session-duration. Trust policy uses sub-pattern lock (`repo:co-cddo/ndx_try_aws_scenarios:*`) + aud condition; the repository_owner claim condition is omitted because it reproducibly breaks the assume even though the OIDC token contains the claim (verified via JWT decode in an investigation workflow that has since been deleted; see runbook Step 10).
- expected_scps reflects live state: Restrictions + FullAWSAccess.

Phase 2a — synth pipelines for missing scenarios
- New synth jobs in .github/workflows/deploy-blueprints.yml for planx and digital-planning-register (CDK -> template.yaml -> S3 via the existing isb-hub upload chain).
- bops-planning synth job lands in Phase 2b after the retention lint is justification-aware.
- ai-contact-centre: new "verify packaged CodeUri targets blueprints bucket" step catches a sam-package regression where --s3-bucket would silently land in the SAM default bucket.

Phase 2b — all-demo expansion + retention lint
- cloudformation/scenarios/all-demo/template.yaml expanded from 7 to 16 nested scenarios (Minute, FixMyStreet, AI Contact Centre, LocalGov IMS, Paperless-ngx, PlanX, Bops Planning, Simply Readable, Digital Planning Register). Umbrella parameters for credentials (GovUkPayApiKey, OSVectorTilesApiKey, DprImageUri, DprCouncilConfig) with overridable empty / sensible defaults; per-scenario URL + admin-credential Outputs surfaced.
- scripts/lint-retention-policies.sh: forbids DeletionPolicy=Retain / UpdateReplacePolicy=Retain / Properties.DeletionProtection=true / Properties.EnableDeletionProtection=true / Properties.FinalSnapshotIdentifier unless the resource carries a non-empty Metadata.Justification. Per-template cap (default 3) + global cap (default 10) so any one scenario can't pencil-whip retentions repo-wide.
- lint-committed-templates job in deploy-blueprints.yml runs the lint over hand-authored CFN templates.
- bops-planning's LogGroup keeps RemovalPolicy.RETAIN (deliberate debug-after-rollback) with a Metadata.Justification attached via cfnOptions; bops synth job re-enabled.

Phase 3 — smoke rails
- playwright.config.ts: new 'smoke' project gated on PLAYWRIGHT_SUITE=smoke.
- tests/smoke/fixtures/cfn-outputs.ts: SDK-v3 DescribeStacks helper. Sensitive output values flow only via explicit sensitiveValue() accessor; toString / inspect / Symbol.toPrimitive emit REDACTED placeholder. Documents the CloudFormation-API limitation that Output Metadata.Sensitive opt-in isn't readable (regex is the sole signal).
- tests/smoke/fixtures/assertion-bar.ts: 17 AssertionBarRow entries populated.
- tests/smoke/fixtures/secure-form.ts: fillPassword wrapper redacts form-encoded passwords from Playwright trace.
- scripts/smoke.sh + .env.example: local + CI identical invocation.
- .github/workflows/smoke.yml: trigger matrix (PR-scoped / nightly cron / push-to-main / workflow_dispatch); scope decides full vs scoped from changed paths; global serial concurrency (no cancel-in-progress — cancelled runs leave orphan AWS state); configure-aws-credentials with role-duration-seconds=21600 (6h) to match the role's max-session-duration; pre-deploy state check with auto-recovery for stranded stacks; SCP drift check (excluding FullAWSAccess, fail-soft for first 7 detections); quarantine-expiry check; CFN events captured BEFORE teardown; teardown with 3x60s retry, gated on aws-creds outcome so we don't burn 3min retrying without credentials.
- .github/workflows/quarterly-audit.yml: 3-monthly tracking issue (spend, orphan sweep, deploy-role policy drift, SCP drift, Renovate liveness, ProtectISB-fallback revisit).
- .github/CODEOWNERS: smoke-pack sensitive paths require @chrisns review (until a maintainers team is provisioned).

Phase 4 — 17 per-scenario smoke specs
- One spec per scenario covering the auth-mode pattern (admin-login / public / sso-skip / umbrella). Bug-informed feature flows cite the historical regression that informed each test:
  - fixmystreet: /reports requires bin/update-all-reports; /admin must reach the dashboard without 2FA redirect
  - planx: SPA boots free of domain-allowlist / Airbrake errors; Hasura native /v1/version responds (Caddy elimination)
  - minute: magic-link sets cookie; same-origin fetch() works post-auth; /api/proxy/healthcheck reaches the backend (catches the basic-auth-breaks-fetch() regression and the ALB /api/* interception regression)
  - localgov-ims: Windows IIS multi-site routing; AdminPassword must not be the literal {{resolve:...}} token (catches the Lambda-custom-resource regression)
  - localgov-drupal: ndx_aws_ai module boots without Bedrock AccessDeniedException
  - simply-readable: SPA loads, credentials non-empty + non-token; reload produces no 5xx responses (catches BlueprintsBucketName mis-wire)
  - ai-contact-centre: PSTN claim matches UK toll-free / landline OR US toll-free (catches international fallback regression)
  - paperless-ngx: /documents view + /api/documents/ respond (S3 Files mount integrity)
  - bops-planning: post-login URL is NOT on the Applicants port (catches the routing.rb single-tenant override regression)
  - digital-planning-register: register loads with planning markers
  - public-Lambda scenarios (foi-redaction, planning-ai, smart-car-park, text-to-speech, council-chatbot): FunctionURL not-5xx + not-403 (catches the InvokeFunctionUrl + InvokeFunction dual-permission regression); council-chatbot uses POST not GET so the test isn't vacuous against a POST-only Lambda
  - quicksight-dashboard: landing + outputs only (sso-skip per auth-mode categorisation)
  - all-demo: discovers Output keys dynamically by parsing the committed template at test time; asserts every Output present, non-empty, and not the {{resolve:...}} literal; URL outputs match https?://

Phase 5 — pin every floating image tag
- 10 own-GHCR images (fixmystreet, localgov_drupal, minute_*, planx-*, dpr) pinned to sha-<7chars>@sha256:<digest>.
- 2 upstream images (docker.io/apache/tika 3.3.0.0-full, ghcr.io/paperless-ngx/paperless-ngx 2.9) pinned to <tag>@sha256:<digest>.
- Removed legacy cloudformation/scenarios/minute/template.json (stale ECR references; nothing in the repo referenced it).

Phase 6 — Renovate adoption (replaces Dependabot)
- renovate.json: 6 group rules per the spec's pinning-strategy table; customManagers regex matching the new pin shape; osvVulnerabilityAlerts + security-priority group; pinDigests scoped to official actions/* + aws-actions/* only so the first run doesn't firehose; per-PR limits capped at 6.
- .github/workflows/renovate.yml: twice-daily + workflow_dispatch. Action pinned by digest to v46.1.14.
- .github/dependabot.yml deleted.

Operator follow-ups (not in this PR)
====================================
- NAP-548: migrate scenarios off legacy claude-3-haiku-20240307
- NAP-549: revisit ProtectISB fallback by 2026-11-12
- NAP-550: service-quota Console requests
- NAP-551: QuickSight subscription decision
- NAP-552: mint RENOVATE_TOKEN repo secret
- NAP-554: close in-flight Dependabot PRs
- NAP-555: T2b.5b + T3.8 end-to-end verifications

Closes: #226, #227, #228, #229, #230, #231, #232, #233, #235.
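The all-demo checks listed under Phase 4 (dynamic Output-key discovery, then presence, non-empty, non-`{{resolve:...}}` and URL-shape assertions) can be sketched as two pure functions. Both function names and the line-based parsing strategy are assumptions; the spec's own `discoverAllDemoOutputKeys()` is only described as regex-parsing the committed template, and this sketch assumes two-space YAML indentation with `Outputs:` at column 0.

```typescript
// Hypothetical parser: collects the keys directly under a top-level "Outputs:".
export function discoverOutputKeys(templateYaml: string): string[] {
  const lines = templateYaml.split(/\r?\n/);
  const start = lines.findIndex((line) => /^Outputs:\s*$/.test(line));
  if (start === -1) return [];
  const keys: string[] = [];
  for (const line of lines.slice(start + 1)) {
    if (/^\s*#/.test(line)) continue; // skip comment lines
    if (/^\S/.test(line)) break;      // next top-level section ends the block
    const match = /^ {2}([A-Za-z0-9]+):\s*$/.exec(line);
    if (match?.[1]) keys.push(match[1]);
  }
  return keys;
}

const URL_KEY = /Url$|URL$/; // pattern quoted in the review-findings commit

// Hypothetical validator: returns one message per failed expectation.
export function validateOutputs(
  expectedKeys: string[],
  outputs: Record<string, string | undefined>,
): string[] {
  const problems: string[] = [];
  for (const key of expectedKeys) {
    const value = outputs[key];
    if (value === undefined || value.trim() === "") {
      problems.push(`${key}: missing or empty`);
    } else if (value.includes("{{resolve:")) {
      problems.push(`${key}: unresolved dynamic reference`);
    } else if (URL_KEY.test(key) && !/^https?:\/\//.test(value)) {
      problems.push(`${key}: not an http(s) URL`);
    }
  }
  return problems;
}
```

Because the expected keys come from the template itself, adding an Output needs no spec change, while a renamed Output shows up as "missing or empty" against the deployed stack.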

@chrisns

Implements every phase of _bmad-output/implementation-artifacts/tech-spec-scenario-regression-smoke-pack.md in a single deliverable. The original spec called for one PR per phase (8+ PRs); experience showed the dependency overlap made that worse for review, not better, so this squashes #226 / #227 / #228 / #229 / #230 / #231 / #232 / #233 / #235 into a single change. What ships ========== Phase 1a — runbook + config schema - docs/smoke-test-account-setup.md: one-off manual procedure for vending the long-lived smoke-test AWS account, with the four required sections (Prerequisites / Procedure / Verification / Operational Notes). Per-step idempotency checks + inverses; ProtectISB role-creation canary + fallback branch (ADR-1); Bedrock model-access enablement + gotchas (legacy claude-3-haiku-20240307 retired, Nova body shape); service-quota targets; QuickSight decision; iterate-to-least-privilege protocol for the inline IAM policy. - docs/smoke-test-account-config.yml: post-runbook state record schema. Phase 1b — operator-executed account state - Smoke account 464453619983 provisioned in NDX org under the fallback branch (ProtectISB canary failed; account moved to root with Restrictions SCP attached directly). AwsNuke SCP intentionally NOT attached (it blocks sts:AssumeRoleWithWebIdentity and we use CFN delete + retention lint, not aws-nuke). - OIDC provider + InnovationSandbox-ndx-SmokeTestDeployRole created with 6h max-session-duration. Trust policy uses sub-pattern lock (`repo:co-cddo/ndx_try_aws_scenarios:*`) + aud condition; the repository_owner claim condition is omitted because it reproducibly breaks the assume even though the OIDC token contains the claim (verified via JWT decode in an investigation workflow that has since been deleted; see runbook Step 10). - expected_scps reflects live state: Restrictions + FullAWSAccess. 
Phase 2a — synth pipelines for missing scenarios - New synth jobs in .github/workflows/deploy-blueprints.yml for planx and digital-planning-register (CDK -> template.yaml -> S3 via the existing isb-hub upload chain). - bops-planning synth job lands in Phase 2b after the retention lint is justification-aware. - ai-contact-centre: new "verify packaged CodeUri targets blueprints bucket" step catches a sam-package regression where --s3-bucket would silently land in the SAM default bucket. Phase 2b — all-demo expansion + retention lint - cloudformation/scenarios/all-demo/template.yaml expanded from 7 to 16 nested scenarios (Minute, FixMyStreet, AI Contact Centre, LocalGov IMS, Paperless-ngx, PlanX, Bops Planning, Simply Readable, Digital Planning Register). Umbrella parameters for credentials (GovUkPayApiKey, OSVectorTilesApiKey, DprImageUri, DprCouncilConfig) with overridable empty / sensible defaults; per-scenario URL + admin-credential Outputs surfaced. - scripts/lint-retention-policies.sh: forbids DeletionPolicy=Retain / UpdateReplacePolicy=Retain / Properties.DeletionProtection=true / Properties.EnableDeletionProtection=true / Properties.FinalSnapshotIdentifier unless the resource carries a non-empty Metadata.Justification. Per-template cap (default 3) + global cap (default 10) so any one scenario can't pencil-whip retentions repo-wide. - lint-committed-templates job in deploy-blueprints.yml runs the lint over hand-authored CFN templates. - bops-planning's LogGroup keeps RemovalPolicy.RETAIN (deliberate debug-after-rollback) with a Metadata.Justification attached via cfnOptions; bops synth job re-enabled. Phase 3 — smoke rails - playwright.config.ts: new 'smoke' project gated on PLAYWRIGHT_SUITE=smoke. - tests/smoke/fixtures/cfn-outputs.ts: SDK-v3 DescribeStacks helper. Sensitive output values flow only via explicit sensitiveValue() accessor; toString / inspect / Symbol.toPrimitive emit REDACTED placeholder. 
Documents the CloudFormation-API limitation that Output Metadata.Sensitive opt-in isn't readable (regex is the sole signal). - tests/smoke/fixtures/assertion-bar.ts: 17 AssertionBarRow entries populated. - tests/smoke/fixtures/secure-form.ts: fillPassword wrapper redacts form-encoded passwords from Playwright trace. - scripts/smoke.sh + .env.example: local + CI identical invocation. - .github/workflows/smoke.yml: trigger matrix (PR-scoped / nightly cron / push-to-main / workflow_dispatch); scope decides full vs scoped from changed paths; global serial concurrency (no cancel-in-progress — cancelled runs leave orphan AWS state); configure-aws-credentials with role-duration-seconds=21600 (6h) to match the role's max-session-duration; pre-deploy state check with auto-recovery for stranded stacks; SCP drift check (excluding FullAWSAccess, fail-soft for first 7 detections); quarantine-expiry check; CFN events captured BEFORE teardown; teardown with 3x60s retry, gated on aws-creds outcome so we don't burn 3min retrying without credentials. - .github/workflows/quarterly-audit.yml: 3-monthly tracking issue (spend, orphan sweep, deploy-role policy drift, SCP drift, Renovate liveness, ProtectISB-fallback revisit). - .github/CODEOWNERS: smoke-pack sensitive paths require @chrisns review (until a maintainers team is provisioned). Phase 4 — 17 per-scenario smoke specs - One spec per scenario covering the auth-mode pattern (admin-login / public / sso-skip / umbrella). 
@chrisns

Implements every phase of _bmad-output/implementation-artifacts/tech-spec-scenario-regression-smoke-pack.md in a single deliverable. The original spec called for one PR per phase (8+ PRs); experience showed the dependency overlap made that worse for review, not better, so this squashes #226 / #227 / #228 / #229 / #230 / #231 / #232 / #233 / #235 into a single change.

What ships
==========

Phase 1a: runbook + config schema
- docs/smoke-test-account-setup.md: one-off manual procedure for vending the long-lived smoke-test AWS account, with the four required sections (Prerequisites / Procedure / Verification / Operational Notes). Per-step idempotency checks + inverses; ProtectISB role-creation canary + fallback branch (ADR-1); Bedrock model-access enablement + gotchas (legacy claude-3-haiku-20240307 retired, Nova body shape); service-quota targets; QuickSight decision; iterate-to-least-privilege protocol for the inline IAM policy.
- docs/smoke-test-account-config.yml: post-runbook state record schema.

Phase 1b: operator-executed account state
- Smoke account 464453619983 provisioned in NDX org under the fallback branch (ProtectISB canary failed; account moved to root with Restrictions SCP attached directly). AwsNuke SCP intentionally NOT attached (it blocks sts:AssumeRoleWithWebIdentity and we use CFN delete + retention lint, not aws-nuke).
- OIDC provider + InnovationSandbox-ndx-SmokeTestDeployRole created with 6h max-session-duration. Trust policy uses sub-pattern lock (`repo:co-cddo/ndx_try_aws_scenarios:*`) + aud condition; the repository_owner claim condition is omitted because it reproducibly breaks the assume even though the OIDC token contains the claim (verified via JWT decode in an investigation workflow that has since been deleted; see runbook Step 10).
- expected_scps reflects live state: Restrictions + FullAWSAccess.

Phase 2a: synth pipelines for missing scenarios
- New synth jobs in .github/workflows/deploy-blueprints.yml for planx and digital-planning-register (CDK -> template.yaml -> S3 via the existing isb-hub upload chain).
- bops-planning synth job lands in Phase 2b after the retention lint is justification-aware.
- ai-contact-centre: new "verify packaged CodeUri targets blueprints bucket" step catches a sam-package regression where --s3-bucket would silently land in the SAM default bucket.

Phase 2b: all-demo expansion + retention lint
- cloudformation/scenarios/all-demo/template.yaml expanded from 7 to 16 nested scenarios (Minute, FixMyStreet, AI Contact Centre, LocalGov IMS, Paperless-ngx, PlanX, Bops Planning, Simply Readable, Digital Planning Register). Umbrella parameters for credentials (GovUkPayApiKey, OSVectorTilesApiKey, DprImageUri, DprCouncilConfig) with overridable empty / sensible defaults; per-scenario URL + admin-credential Outputs surfaced.
- scripts/lint-retention-policies.sh: forbids DeletionPolicy=Retain / UpdateReplacePolicy=Retain / Properties.DeletionProtection=true / Properties.EnableDeletionProtection=true / Properties.FinalSnapshotIdentifier unless the resource carries a non-empty Metadata.Justification. Per-template cap (default 3) + global cap (default 10) so any one scenario can't pencil-whip retentions repo-wide.
- lint-committed-templates job in deploy-blueprints.yml runs the lint over hand-authored CFN templates.
- bops-planning's LogGroup keeps RemovalPolicy.RETAIN (deliberate debug-after-rollback) with a Metadata.Justification attached via cfnOptions; bops synth job re-enabled.

Phase 3: smoke rails
- playwright.config.ts: new 'smoke' project gated on PLAYWRIGHT_SUITE=smoke.
- tests/smoke/fixtures/cfn-outputs.ts: SDK-v3 DescribeStacks helper. Sensitive output values flow only via explicit sensitiveValue() accessor; toString / inspect / Symbol.toPrimitive emit REDACTED placeholder. Documents the CloudFormation-API limitation that Output Metadata.Sensitive opt-in isn't readable (regex is the sole signal).
- tests/smoke/fixtures/assertion-bar.ts: 17 AssertionBarRow entries populated.
- tests/smoke/fixtures/secure-form.ts: fillPassword wrapper redacts form-encoded passwords from Playwright trace.
- scripts/smoke.sh + .env.example: local + CI identical invocation.
- .github/workflows/smoke.yml: trigger matrix (PR-scoped / nightly cron / push-to-main / workflow_dispatch); scope decides full vs scoped from changed paths; global serial concurrency (no cancel-in-progress, because cancelled runs leave orphan AWS state); configure-aws-credentials with role-duration-seconds=21600 (6h) to match the role's max-session-duration; pre-deploy state check with auto-recovery for stranded stacks; SCP drift check (excluding FullAWSAccess, fail-soft for first 7 detections); quarantine-expiry check; CFN events captured BEFORE teardown; teardown with 3x60s retry, gated on aws-creds outcome so we don't burn 3min retrying without credentials.
- .github/workflows/quarterly-audit.yml: 3-monthly tracking issue (spend, orphan sweep, deploy-role policy drift, SCP drift, Renovate liveness, ProtectISB-fallback revisit).
- .github/CODEOWNERS: smoke-pack sensitive paths require @chrisns review (until a maintainers team is provisioned).

Phase 4: 17 per-scenario smoke specs
- One spec per scenario covering the auth-mode pattern (admin-login / public / sso-skip / umbrella). Bug-informed feature flows cite the historical regression that informed each test:
  - fixmystreet: /reports requires bin/update-all-reports; /admin must reach the dashboard without 2FA redirect
  - planx: SPA boots free of domain-allowlist / Airbrake errors; Hasura native /v1/version responds (Caddy elimination)
  - minute: magic-link sets cookie; same-origin fetch() works post-auth; /api/proxy/healthcheck reaches the backend (catches the basic-auth-breaks-fetch() regression and the ALB /api/* interception regression)
  - localgov-ims: Windows IIS multi-site routing; AdminPassword must not be the literal {{resolve:...}} token (catches the Lambda-custom-resource regression)
  - localgov-drupal: ndx_aws_ai module boots without Bedrock AccessDeniedException
  - simply-readable: SPA loads, credentials non-empty + non-token; reload produces no 5xx responses (catches BlueprintsBucketName mis-wire)
  - ai-contact-centre: PSTN claim matches UK toll-free / landline OR US toll-free (catches international fallback regression)
  - paperless-ngx: /documents view + /api/documents/ respond (S3 Files mount integrity)
  - bops-planning: post-login URL is NOT on the Applicants port (catches the routing.rb single-tenant override regression)
  - digital-planning-register: register loads with planning markers
  - public-Lambda scenarios (foi-redaction, planning-ai, smart-car-park, text-to-speech, council-chatbot): FunctionURL not-5xx + not-403 (catches the InvokeFunctionUrl + InvokeFunction dual-permission regression); council-chatbot uses POST not GET so the test isn't vacuous against a POST-only Lambda
  - quicksight-dashboard: landing + outputs only (sso-skip per auth-mode categorisation)
  - all-demo: discovers Output keys dynamically by parsing the committed template at test time; asserts every Output present, non-empty, and not the {{resolve:...}} literal; URL outputs match https?://

Phase 5: pin every floating image tag
- 10 own-GHCR images (fixmystreet, localgov_drupal, minute_*, planx-*, dpr) pinned to sha-<7chars>@sha256:<digest>.
- 2 upstream images (docker.io/apache/tika 3.3.0.0-full, ghcr.io/paperless-ngx/paperless-ngx 2.9) pinned to <tag>@sha256:<digest>.
- Removed legacy cloudformation/scenarios/minute/template.json (stale ECR references; nothing in the repo referenced it).

Phase 6: Renovate adoption (replaces Dependabot)
- renovate.json: 6 group rules per the spec's pinning-strategy table; customManagers regex matching the new pin shape; osvVulnerabilityAlerts + security-priority group; pinDigests scoped to official actions/* + aws-actions/* only so the first run doesn't firehose; per-PR limits capped at 6.
- .github/workflows/renovate.yml: twice-daily + workflow_dispatch. Action pinned by digest to v46.1.14.
- .github/dependabot.yml deleted.

Operator follow-ups (not in this PR)
====================================

- NAP-548: migrate scenarios off legacy claude-3-haiku-20240307
- NAP-549: revisit ProtectISB fallback by 2026-11-12
- NAP-550: service-quota Console requests
- NAP-551: QuickSight subscription decision
- NAP-552: mint RENOVATE_TOKEN repo secret
- NAP-554: close in-flight Dependabot PRs
- NAP-555: T2b.5b + T3.8 end-to-end verifications

Closes: #226, #227, #228, #229, #230, #231, #232, #233, #235.
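
The redaction behaviour described for tests/smoke/fixtures/cfn-outputs.ts in Phase 3 could look roughly like the sketch below. This is an illustration under assumptions, not the code shipped in the PR: `wrapSensitive`, `SensitiveOutput`, and the exact `[REDACTED]` string are hypothetical names, and the `toJSON` hook is an extra that the commit message does not mention.

```typescript
// Hypothetical placeholder text; the real fixture's placeholder may differ.
const REDACTED = "[REDACTED]";

// Node's util.inspect looks up this well-known registered symbol on objects.
const INSPECT_CUSTOM = Symbol.for("nodejs.util.inspect.custom");

interface SensitiveOutput {
  // The only path that returns the real value.
  sensitiveValue(): string;
  toString(): string;
}

function wrapSensitive(raw: string): SensitiveOutput {
  // `raw` lives only in the closure, so it is not an enumerable property
  // that a logger or trace serializer could walk into.
  const wrapped = {
    sensitiveValue: () => raw,
    toString: () => REDACTED,
    // Assumption beyond the source: also guard JSON.stringify.
    toJSON: () => REDACTED,
    // Covers template literals, String(x) and string concatenation.
    [Symbol.toPrimitive]: () => REDACTED,
  };
  // Covers console.log / util.inspect in Node.
  Object.defineProperty(wrapped, INSPECT_CUSTOM, {
    value: () => REDACTED,
    enumerable: false,
  });
  return wrapped;
}
```

With a wrapper of this shape, an accidental `console.log(output)` or `` `${output}` `` yields the placeholder, while a spec that genuinely needs the secret has to call `sensitiveValue()` explicitly, which makes every read of the real value easy to find in review.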

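
The all-demo checks listed under Phase 4 (every Output present, non-empty, not the `{{resolve:...}}` literal, URL outputs matching `https?://`) can be sketched as a pure function. This is a minimal sketch under assumptions: `checkOutputs` is a hypothetical name, and treating any key ending in "Url" (case-insensitive) as a URL output is a guess, since the commit message does not say how URL outputs are identified.

```typescript
// An unresolved CloudFormation dynamic reference left verbatim in a value.
const UNRESOLVED_TOKEN = /\{\{resolve:[^}]*\}\}/;

// Assumption: keys ending in "Url"/"URL" are the URL outputs.
const isUrlKey = (key: string): boolean => /url$/i.test(key);

function checkOutputs(
  expectedKeys: string[],
  actual: Record<string, string | undefined>,
): string[] {
  const problems: string[] = [];
  for (const key of expectedKeys) {
    const value = actual[key];
    if (value === undefined) {
      problems.push(`${key}: missing`);
      continue;
    }
    if (value.trim() === "") {
      problems.push(`${key}: empty`);
      continue;
    }
    if (UNRESOLVED_TOKEN.test(value)) {
      problems.push(`${key}: unresolved {{resolve:...}} token`);
    }
    if (isUrlKey(key) && !/^https?:\/\//.test(value)) {
      problems.push(`${key}: not an http(s) URL`);
    }
  }
  return problems;
}
```

In a spec, `expectedKeys` would come from parsing the committed template and `actual` from the deployed stack's outputs; the test then asserts that the returned list is empty, so a failure message names every offending key at once.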
chrisns added 4 commits May 12, 2026 10:34

chrisns had a problem deploying to smoke-test-deploy May 12, 2026 09:59 — with GitHub Actions Failure

chrisns mentioned this pull request May 12, 2026

tests: smoke specs for remaining 14 scenarios (Phase 4 complete) #235

Closed

3 tasks

chrisns mentioned this pull request May 12, 2026

scenario-regression smoke pack (Phases 1-6 squashed) #236

Closed

5 tasks

chrisns closed this May 12, 2026

chrisns wants to merge 4 commits into main from feat/smoke-specs-top3
