feat(demo): public demo mode for demo.opensop.ai #29
Open
Chosen9115 wants to merge 10 commits into
…nfig

Adds the foundation for a public OpenSOP demo at demo.opensop.ai:

- `Opensop::DemoMode` module: env-driven `enabled?` flag, `api_token` accessor, and reset-schedule constants used as the single source of truth for the demo wiring.
- Four new sample SOPs under `processes/examples/`: expense-approval, support-ticket-triage, release-deploy, agent-pr-review. Each exercises a different mix of step types (form / automated / judgment / approval / webhook / notification) and is honest in its header comment about which steps are stubbed in v0.1.
- Three accompanying step scripts (categorize-ticket, stamp-release, post-pr-comment) follow the existing stdin/stdout JSON convention.
- `fly.demo.toml` is a separate Fly config for the `opensop-demo` app. It keeps the existing `fly.toml` untouched and makes `flyctl deploy --config fly.demo.toml` the only way to ship the demo.
- `docs/demo-deploy.md` is the runbook for provisioning the demo (Fly app + Postgres, secrets, cert, DNS handoff to the user).
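For readers who want a picture of the env-driven flag described above, here is a minimal sketch in plain Ruby. It is not the PR's actual `Opensop::DemoMode`: the module name `DemoModeSketch`, the set of truthy strings, the `RESET_HOUR_UTC` constant name, and the fallback token (borrowed from the curl example in the test plan) are all assumptions made for illustration.

```ruby
# Hypothetical sketch of an env-driven demo flag; not the PR's actual module.
module DemoModeSketch
  # Assumption: which strings count as "on". The PR only shows DEMO_MODE=true.
  TRUTHY_VALUES = %w[1 true yes on].freeze

  # Assumption: fallback token, taken from the curl example in the test plan.
  FALLBACK_TOKEN = "demo-public-token-resets-daily"

  # Hypothetical constant name; the PR states a daily reset at 3:00 UTC.
  RESET_HOUR_UTC = 3

  module_function

  # `env` is injectable (ENV or a plain Hash) so the logic can be tested
  # without mutating the real process environment.
  def enabled?(env = ENV)
    TRUTHY_VALUES.include?(env.fetch("DEMO_MODE", "").strip.downcase)
  end

  # Prefers OPENSOP_API_TOKEN so one secret could serve auth and display.
  def api_token(env = ENV)
    token = env.fetch("OPENSOP_API_TOKEN", "").strip
    token.empty? ? FALLBACK_TOKEN : token
  end
end
```

Passing the environment in as an argument keeps the flag a pure function of its input, which is one way to make "no-op outside DEMO_MODE" easy to assert in specs.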
…et job

Builds the runtime layer of the demo on top of the Wave 1 scaffold:

- `Demo::SeedLoader` (idempotent) loads every YAML under processes/examples/ via `Opensop::Registry.load_all`. Validates that `OPENSOP_API_TOKEN` matches the homepage-displayed `Opensop::DemoMode.api_token` and warns on mismatch. There is no `ApiToken` model; auth is env-based.
- `Demo::ResetJob` runs daily at 3:00 UTC via Solid Queue's recurring config. Truncates instance + step + event + callback tables in dependency order inside a transaction, then re-runs `Demo::SeedLoader`.
- `DemoReadOnly` controller concern blocks process-definition mutations (`Sop::ProcessesController#register`) when DEMO_MODE is on. Instance lifecycle (start/submit/cancel) stays interactive because visitors need it.
- rack-attack rules: 60 req/min/IP on `/sop/`, 10 starts/hr/IP on `/sop/*/start`, 120 req/min/IP on UI, plus a 5-min IP ban on > 1000 req in 5 min. All throttles no-op outside DEMO_MODE.
- `Ui::DocsController` serves project docs (architecture, deploy, process-authoring, etc.) at `/docs(/*path)` rendered via Commonmarker with strict path-traversal protection.
- `Ui::Demo::HomeController` is the front-door homepage for demo.opensop.ai: hero, copy-able API token, sample-process grid, CTAs to /docs and /api-docs. Routed conditionally so non-demo deploys still land on the existing dashboard.
- `_demo_banner` partial: a sticky amber banner rendered in both the application and docs layouts when DEMO_MODE is on.
- `Opensop::DemoMode.api_token` now prefers `OPENSOP_API_TOKEN` so a single secret powers both auth and homepage display.

27/27 new specs pass.
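The path-traversal protection mentioned for the docs route can be illustrated with a small sketch. This is not the code in `Ui::DocsController`; the helper name `resolve_doc_path`, the `SAFE_SLUG` pattern, and the `.md` suffix rule are assumptions. The idea shown is an allowlist on the requested slug plus a containment check after normalising the path.

```ruby
require "pathname"

# Assumption: slugs are limited to letters, digits, "_", "-" and "/".
# Dots are excluded, so ".." segments are rejected by the pattern itself.
SAFE_SLUG = %r{\A[a-z0-9][a-z0-9_\-/]*\z}i

# Hypothetical helper: maps a requested slug to a markdown file under
# docs_root, or returns nil when the request looks unsafe.
def resolve_doc_path(docs_root, requested)
  root = Pathname.new(docs_root).expand_path
  slug = requested.to_s
  return nil unless slug.match?(SAFE_SLUG)

  candidate = root.join("#{slug}.md").cleanpath
  # Defence in depth: the normalised path must still live under the root.
  return nil unless candidate.to_s.start_with?("#{root}/")

  candidate
end
```

A controller using a helper like this would typically answer 404 when it returns nil, and would still need to check that the file exists before rendering it.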
Adds 23 specs covering the seam between the demo runtime and the process engine:

- Service spec for `Demo::SeedLoader` (no-op outside DEMO_MODE, idempotent re-load, mismatch warning when OPENSOP_API_TOKEN is unset)
- Job spec for `Demo::ResetJob` (no-op outside DEMO_MODE; clears all instances and reseeds processes when on)
- Request spec for rack-attack throttling (11th `/sop/*/start` returns 429 with `rate_limit` body in DEMO_MODE; off-mode passes through). Swaps to a fresh `ActiveSupport::Cache::MemoryStore` per example because the test env's null cache store would otherwise suppress throttle counters.
- View spec for the demo banner partial (renders only in DEMO_MODE, contains the localized message and GitHub link)
- System smoke spec walking the homepage and one process card

Brings the demo-feature spec total to 50 examples, all passing.
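To see why a null cache store would hide the throttle in tests, a simplified model of a fixed-window counter may help. This is not rack-attack's implementation; the classes `NullStore`, `MemoryStore`, and `StartThrottle` are hypothetical stand-ins, and only the limit of 10 starts per hour per IP comes from the PR.

```ruby
# Simplified model of a fixed-window throttle; not rack-attack's code.
class NullStore
  # Mimics a cache that never persists anything.
  def increment(_key)
    nil
  end
end

class MemoryStore
  def initialize
    @counts = Hash.new(0)
  end

  # Returns the updated count for the key.
  def increment(key)
    @counts[key] += 1
  end
end

class StartThrottle
  LIMIT = 10 # starts per hour per IP, as stated in the PR

  def initialize(store)
    @store = store
  end

  # Returns the HTTP status the request would receive.
  def call(ip, now: Time.now)
    window = now.to_i / 3600 # one bucket per hour
    # A store that forgets everything makes every request look like the first.
    count = @store.increment("starts:#{ip}:#{window}") || 1
    count > LIMIT ? 429 : 200
  end
end

t = Time.at(0)
with_memory = StartThrottle.new(MemoryStore.new)
with_null   = StartThrottle.new(NullStore.new)

memory_statuses = Array.new(11) { with_memory.call("203.0.113.7", now: t) }
null_statuses   = Array.new(11) { with_null.call("203.0.113.7", now: t) }

puts memory_statuses.last # the 11th request is rejected
puts null_statuses.last   # the 11th request still passes
```

With the in-memory store the eleventh call in the same hour is rejected, while the null store lets it through, which matches the reason given for swapping the store per example.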
The admin UI is gated by HTTP-basic auth in production via OPENSOP_UI_USER / OPENSOP_UI_PASSWORD, enforced both at boot (config/initializers/admin_ui_auth.rb) and on every request (Ui::ApplicationController#authenticate_admin_ui!).

For the public demo at demo.opensop.ai, the UI is intentionally public: visitors browse processes and drive instances without credentials. Mutation surfaces visitors can reach are already locked down by the DemoReadOnly concern (definition mutations → 403) and rack-attack throttles (per-IP limits + abuse ban). Adding a credential gate would force visitors to find published creds in a separate doc, which is worse than just trusting DemoReadOnly + rack-attack.

Both gates now early-return when Opensop::DemoMode.enabled? is true. The error message in the boot guard now mentions DEMO_MODE as a valid deployment path so the next operator hits a helpful nudge.

Caught when the first deploy of opensop-demo failed its db:prepare release command.
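The early-return described here can be sketched as a single decision function. This is an illustration, not the contents of `config/initializers/admin_ui_auth.rb`: the function name `admin_ui_auth_mode`, the returned symbols, the `RAILS_ENV` check, and the error wording are assumptions; only the environment variable names come from the commit message.

```ruby
# Env var names are from the PR; everything else here is assumed.
REQUIRED_UI_CREDENTIALS = %w[OPENSOP_UI_USER OPENSOP_UI_PASSWORD].freeze

# Hypothetical boot-time guard. Returns how the admin UI would be protected,
# or raises when a production deploy has neither credentials nor demo mode.
def admin_ui_auth_mode(env)
  # Demo deploys skip the credential gate entirely.
  return :public_demo if env["DEMO_MODE"] == "true"
  # Assumption: the gate is only enforced in production.
  return :unenforced unless env["RAILS_ENV"] == "production"

  missing = REQUIRED_UI_CREDENTIALS.select { |key| env[key].to_s.empty? }
  unless missing.empty?
    raise ArgumentError,
          "Admin UI needs #{missing.join(' and ')} in production, " \
          "or set DEMO_MODE=true for a public demo deploy."
  end

  :basic_auth
end
```

Checking the demo flag first is what lets a release command such as `db:prepare` boot without UI credentials on the demo app, while a regular production deploy still fails fast.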
Lets us curl the deployed demo directly via the Fly hostname for post-deploy verification, independent of the demo.opensop.ai DNS + cert path.
The sidebar's Library section had two problems, and this commit also makes one demo-specific change:

1. The template rendered every item as a `<div>` (never a link), even though the data layer exposed a `disabled: false` flag. So Templates, Webhooks, and API & SDK looked like links but did nothing on click. The Library template now mirrors the Workspace pattern (link_to for enabled items, div for disabled) so the disabled flag finally has teeth.
2. Templates and Webhooks render in the demo but only show empty/passive content (the daily reset clears any callback receipts; templates is a read-only list of the same processes shown in the main grid). They're greyed with the existing "Coming soon" tooltip when DEMO_MODE is on.
3. The third Library slot now shows "Docs" → /docs (the project docs rendered from docs/*.md by Ui::DocsController) when DEMO_MODE is on, instead of "API & SDK" → /api-docs. Demo visitors get a broader first surface (architecture, process-authoring, deploy guides); the API reference is still one click away from the homepage CTA.

Production behavior is unchanged for non-demo deploys: all three Library items remain `disabled: false` and now correctly render as clickable links to their respective controllers.
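The link-versus-div rule described above can be shown without Rails helpers. The sketch below is a plain-Ruby approximation, not the project's view code: `SidebarItem`, `library_items`, `render_item`, the CSS class, and the `/templates` and `/webhooks` paths are hypothetical; `/docs` and `/api-docs` are the routes named in the commit message.

```ruby
require "erb"

SidebarItem = Struct.new(:label, :path, :disabled, keyword_init: true)

# Builds the three Library entries. In demo mode the first two are greyed
# out and the third slot points at the project docs.
def library_items(demo_mode)
  third =
    if demo_mode
      SidebarItem.new(label: "Docs", path: "/docs", disabled: false)
    else
      SidebarItem.new(label: "API & SDK", path: "/api-docs", disabled: false)
    end

  [
    # Paths for Templates and Webhooks are placeholders, not from the PR.
    SidebarItem.new(label: "Templates", path: "/templates", disabled: demo_mode),
    SidebarItem.new(label: "Webhooks", path: "/webhooks", disabled: demo_mode),
    third
  ]
end

# Enabled items become links; disabled ones stay inert with a tooltip.
def render_item(item)
  label = ERB::Util.html_escape(item.label)
  if item.disabled
    %(<div class="disabled" title="Coming soon">#{label}</div>)
  else
    %(<a href="#{ERB::Util.html_escape(item.path)}">#{label}</a>)
  end
end
```

Keeping the decision in the data (`disabled`) and the markup choice in one render function is what gives the flag an observable effect.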
Demo went down today after a cascade: Postgres flaked → Solid Queue (in-Puma) lost its DB connection → Solid Queue's graceful-shutdown handler fired → Puma exited → port 3001 went dark. Fly never auto-restarted because the combination of `min_machines_running = 0` and `auto_stop_machines` doesn't trigger restart-on-unhealthy on idle machines.

Two changes:

- min_machines_running: 0 → 1, so Fly's standard restart-on-unhealthy applies and a flaky-Postgres burst can't park the app indefinitely.
- grace_period: 10s → 30s. Cold boot of Rails 8 + Thruster + Solid Queue + bootsnap warm-up routinely exceeded 10s, which produced brief 502 bursts as the Fly proxy started routing before Puma was ready.

Postgres health remains the underlying weakness (the demo's pg machine was in role:error / 3-of-3 critical when investigated). That needs a separate follow-up, likely either a memory bump on the pg VM or a switch to managed Postgres; this commit only keeps the demo alive across the next Postgres flake.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…w2Ban

Root cause of 2026-05-08 demo outage: every request was triggering 2-3 writes to Solid Cache (Postgres-backed) because rack-attack's `cache.store` was `Rails.cache`, AND the `Allow2Ban.filter(...)` block returned `true` unconditionally, counting every request toward the 1000/5min ban threshold whether the request was legitimate or not. That sustained PG-write pressure on a 256MB Postgres VM (free RAM was ~7MB at idle) drove the DB into `role: error`.

Investigation memo at ~/Documents/coba-twin/postgres-flake-investigation-2026-05-09.md.

Changes:

- Switch rack-attack store to ActiveSupport::Cache::MemoryStore. Counters no longer touch Postgres at all. They reset on Puma restart, which is fine for demo abuse prevention (per-minute throttles still hold).
- Remove the Allow2Ban block. The per-minute throttles (60 req/min /sop/ + 120 req/min UI = max ~900 req/5min/IP under perfect pacing) already cap traffic below the 1000-request threshold the ban was guarding. A comment in the file documents how to re-add a properly-scoped ban if needed.

Pairs with a separate ops change: opensop-demo-db memory bumped from 256MB to 1024MB (`flyctl machine update --memory 1024`). The two fixes together address the cause (write storm) and the resilience floor (VM had no headroom for autovacuum or repmgr recovery).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
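The arithmetic behind removing the ban can be checked in a few lines. The numbers are the limits quoted in the commit message; the constant names are illustrative only.

```ruby
# Limits as stated in the PR; constant names are illustrative.
SOP_LIMIT_PER_MIN = 60    # requests per minute per IP on /sop/
UI_LIMIT_PER_MIN  = 120   # requests per minute per IP on the UI
BAN_THRESHOLD     = 1000  # requests per window that would have triggered the ban
BAN_WINDOW_MIN    = 5     # ban window length in minutes

# Best case for an abusive client: both throttles saturated every minute.
MAX_PER_WINDOW = (SOP_LIMIT_PER_MIN + UI_LIMIT_PER_MIN) * BAN_WINDOW_MIN
BAN_REACHABLE  = MAX_PER_WINDOW > BAN_THRESHOLD

puts "max requests allowed per IP in #{BAN_WINDOW_MIN} min: #{MAX_PER_WINDOW}"
puts "ban threshold reachable: #{BAN_REACHABLE}"
```

Since (60 + 120) × 5 = 900 does not exceed 1000, a client that respects neither limit is throttled before the ban could ever fire, so the ban added write load without adding protection.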
Summary
Builds a public OpenSOP demo at demo.opensop.ai, gated by `DEMO_MODE=true` with zero behavioral change when the flag is unset.

- `processes/examples/`: existing `customer-onboarding`, `lead-qualification` plus four new ones: `expense-approval`, `support-ticket-triage`, `release-deploy`, `agent-pr-review`. Each header comment is honest about which step types are stubbed in v0.1.
- `Opensop::DemoMode` is the single source of truth: `enabled?`, `api_token` (prefers `OPENSOP_API_TOKEN` so one secret powers both `Sop::ApplicationController` auth and the homepage display), reset-schedule constants.
- `Demo::SeedLoader` + `Demo::ResetJob` keep demo state idempotent and reset daily at 3:00 UTC via Solid Queue's recurring config.
- `DemoReadOnly` controller concern blocks `Sop::ProcessesController#register` in demo mode; instance lifecycle (start/submit/cancel) stays fully interactive.
- rack-attack rules: 60 req/min/IP on `/sop/`, 10 starts/hr/IP, 120 req/min/IP on UI, 5-min ban on >1000 req/5min. All no-ops outside DEMO_MODE.
- `Ui::DocsController` at `/docs(/*path)` serves project markdown via Commonmarker with strict path-traversal guards. Distinct from the existing `/api-docs` (API reference).
- `Ui::Demo::HomeController` is the front-door homepage: hero, copy-able token, sample-process grid, CTAs to `/docs` and `/api-docs`. Routed conditionally so non-demo deploys still land on the existing dashboard.
- `_demo_banner` partial: sticky amber banner rendered in both the application and docs layouts when `DEMO_MODE` is on. Heroicon + i18n key paths per project conventions.
- `fly.demo.toml` is a separate Fly config (app `opensop-demo`, region `ord`, 256MB shared-cpu-1x, Solid Queue in Puma) with the existing `fly.toml` untouched.
- `docs/demo-deploy.md` runbook walks the operator through Fly app + Postgres provisioning, secrets, cert, and DNS handoff.
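The read-only guard listed in the summary can be pictured with a plain-Ruby stand-in for a controller concern. This is a sketch under stated assumptions, not the PR's `DemoReadOnly`: the module and class names, the `dispatch` method, and the response bodies are hypothetical. The 403 status for blocked definition mutations and the `register` action name come from the commit messages.

```ruby
# Hypothetical stand-in for a controller concern; runs without Rails.
module DemoReadOnlySketch
  # Only process-definition mutations are blocked; lifecycle actions are not.
  BLOCKED_ACTIONS = %i[register].freeze

  # Returns [status, body]. Yields to the real action only when allowed.
  def guard_demo_mutation(action, demo_mode:)
    if demo_mode && BLOCKED_ACTIONS.include?(action)
      return [403, { error: "read_only_demo" }]
    end

    yield
  end
end

class FakeProcessesController
  include DemoReadOnlySketch

  def initialize(demo_mode:)
    @demo_mode = demo_mode
  end

  # Simulates routing an action through the guard.
  def dispatch(action)
    guard_demo_mutation(action, demo_mode: @demo_mode) do
      [200, { ok: action }]
    end
  end
end
```

In a real Rails controller the same check would more likely live in a `before_action` that renders the 403 and halts the chain; the block form here just keeps the example self-contained.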
Multi-agent execution (per the request)
Wave 1: 3 parallel Sonnet agents (sample SOPs, Fly config, DemoMode helper).
Wave 2: 4 parallel agents (Sonnet × 3 for SeedLoader/ResetJob, DocsController + layout, DemoReadOnly guard; Haiku × 1 for rack-attack initializer) then a final Sonnet agent for banner + homepage.
Wave 3: rails-tests specialist in green phase added 23 specs.
Audit: Opus.
Test plan
- `flyctl apps create opensop-demo` then follow `docs/demo-deploy.md`
- `https://demo.opensop.ai/up` (200) and `/` (homepage)
- `curl -H "X-SOP-Token: demo-public-token-resets-daily" https://demo.opensop.ai/sop/`
- `/docs/architecture` and `/docs/process-authoring`

DNS handoff to repo owner

- After Fly provisioning, add a `CNAME demo → opensop-demo.fly.dev` to opensop.ai DNS.
- Run `flyctl certs create demo.opensop.ai --app opensop-demo` and wait ~5 min for cert validation.

Things I deferred

- `@tailwindcss/typography` plugin: this app uses Tailwind v4 CSS-only and the plugin needs npm tooling that isn't wired here. Used arbitrary-selector classes (`[&>h1]:text-3xl`, etc.) on the docs `<article>` instead. Looks fine; can be upgraded later.
- `demo_session_id`.

🤖 Generated with Claude Code