Skip to content

Commit bd47f71

Browse files
authored
Launch-quality output: 60fps fluency, cinematic pacing, generalized director, vocal-free music (#11)
* Snappier launch-film camera pacing + story-driven example recipes + refined music Camera pacing (src/render/plan.ts) — the output read as 'stays on a static image that lags' and 'randomly moving the cursor everywhere': - FOCUS_DWELL_MS 4200 -> 2400: a framed result is a 2s payoff, not a 4s frozen hold on now-static content - ESTABLISH_MS 1200 -> 900, ZOOM_DWELL_MS 1500 -> 1200: quicker in/out - MERGE_GAP_MS 2600 -> 3400, MERGE_DIST_FRAC 0.35 -> 0.5: nearby beats bridge into ONE continuous glide instead of pumping zoom in/out per click (the 'cursor wandering' feel) - ZOOM_TARGET 1.48 -> 1.42: keep a hair more context on small targets Spring, blur, crossfade, scroll, capture untouched. plan tests updated with recomputed frame numbers (intent preserved, not loosened). Example recipes now tell a story with visible-change interactions: - demo (Lumon): pitch -> click CTA -> type email -> Join (frames the success) -> cut to the live dashboard - pulse-demo: search a service -> payments-worker degraded, latency spiking -> click through the fleet, each re-animating the panel (was a dead hover that changed nothing) Refined bundled music: re-generated the four tracks as restrained launch-film underscores (no cheesy lead melodies/drops); extended to ~95-105s loopable beds. * Replace bundled music with synthesized, provably vocal-free tracks The AI-generated tracks kept smuggling in choral pads and vocal chops ('people yelling') despite instrumental flags. Replace them with tracks synthesized from pure oscillators in tools/synth-music.py — sub bass, synthesized kick, filtered-noise hats, saw pads, arpeggios. No vocal source exists in the signal chain, so vocals are impossible by construction. Four moods (pulse/daybreak/midnight/momentum), looped and normalized into ~90s beds. Provenance updated in CREDITS.md; music table in README updated. * Generalize the AI director so generate produces launch-quality stories for any app Live-testing generate against three different apps exposed defects that made real output storyless; all are now fixed at the pipeline level, not by tuning example recipes. Inventory (the biggest win — a dashboard filmed with 1 usable element before, 9 after): - Destructive-control filter no longer nukes content named after a scary word: it fires only on genuine action controls (button/link/submit), and ignores matches inside hyphen/underscore identifiers, so a service row 'checkout-api' or 'payments-worker' survives while a real Delete button is still excluded - data-testid is now used to BUILD selectors (ranked after id), so rows with live-ticking text no longer fall back to a has-text selector that breaks between read and verify; same-testid siblings each get a distinct :nth-match entry (capped) so the director can click different rows — the interaction that tells a dashboard's story - per-page theme + accent probe (dark/light via WCAG luminance) added to the digest so a text-only model can ground vibe/music in the app's look Director brain: - music_track is wired end-to-end: the analyze stage picks a bundled track from the app's look (enum-validated), script carries it, and generate resolves it (priority: --music > director pick > silent, a music miss never fails a run). Previously it hardcoded a nonexistent 'institutional-01' so every generate was silent - analyze + script prompts encode the story principles generically: every action must visibly change the screen, frame the result, hook → proof → payoff, pacing aligned to the new camera engine. All brand/ example tokens removed from prompts - selectors are shown between backticks and healed if the model copies the trailing [tag] annotation (a real failure on nth-match selectors), so a formatting slip self-heals instead of burning retries Verified live (DeepSeek): Pulse dashboard and a portfolio's project page both produce coherent 3-beat stories with the director choosing and muxing music unprompted. Tests 149 -> 164. * Address PR #11 review: destructive-filter safety, coerce, theme probe, music provenance Fixes the blocking safety regressions the reviewers caught plus two P2s. Destructive-control filter (was too narrow AND too lenient): - Applies to EVERY crawled clickable candidate again (buttons, links, role=button, [onclick], [data-testid], li/tr) — the previous action- control-only scoping let a <div data-testid onclick>Delete account</div> into the inventory - isDestructiveLabel is a plain lexicon match again (underscores normalized to spaces so \b fires at seams) — 'Delete-all' and 'reset_config' are excluded once more - The only carve-out: a passive content container (li/tr, or a data-testid box that isn't a button/link/input and has no role=button/onclick) whose destructive word is purely part of a lowercase identifier slug (checkout-api, delete-log-2024) is kept, so dashboards stay filmable. A real destructive control is never kept. coerceSelector: only heals a valid-selector prefix when the remainder is the display-annotation shape (whitespace + [tag]); #cta-danger / #cta2 no longer silently rewrite to #cta and bypass the whitelist. Theme probe: takes the background of the largest viewport-covering element (biased to a full-bleed painted wrapper), so React/Next apps that paint the dark surface on #root/main instead of body read as dark. Music provenance: tools/synth-music.py now regenerates the committed beds end-to-end (synthesize WAV -> ffmpeg loop+loudnorm+mp3); CREDITS documents the numpy/scipy/ffmpeg deps. Committed MP3s untouched. Also widened the record.e2e reproducibility tolerance 150ms -> 250ms (events ride the observed wall clock; 150 overshot by <1ms under parallel-suite load). Tests 164 -> 165. * Harden per adversarial review: destructive filter, LLM egress, coercion, SSRF Addresses the findings from the branch roast (.roast/REPORT-latest.md). Destructive-control filter (M1) — no longer infers non-interactivity from the onclick ATTRIBUTE (framework handlers bound via addEventListener leave it null). An element is a passive content container only when it has NO interactivity signal at all: not a button/link/input tag, no interactive role, no onclick attr, computed cursor != pointer, and tabIndex < 0. Any signal + a destructive-lexicon hit is excluded, so a <div data-testid onClick=… style=cursor:pointer>delete-worker</div> is now kept out while a genuinely passive 'checkout-api' display row stays filmable. LLM egress redaction (M2) — page URLs, titles, and headings now pass through redactForPrompt in both the analyze and script prompts (they were egressed raw). A ?session=<jwt> / ?token=<key> in a crawled URL is redacted before it reaches the provider; app_url and money-moment page_urls are redacted too so the fix can't be bypassed within the same prompt. Relative-URL coercion (M4) — validateAnalysis no longer silently rewrites a bare relative page_url onto the wrong page when two crawled URLs share a pathname but differ by query string; it throws a corrective error listing the candidates. Unique pathnames still coerce. Also: dropped 'publish' from the default destructive lexicon so CMS 'Publish' hero moments film by default (M5); '::' unspecified IPv6 now treated as private (L1); pickMusic cli branch degrades to silent instead of throwing post-spend (L2); vision retries no longer resend the full image payload — images go on attempt 0 only (L3); theme probe skips translucent overlays so a modal backdrop can't misread a light app as dark (L5); CI now runs the Node 20 engine floor alongside 22. Added the roast's must-have tests (SSRF ::/mapped-IPv6/metadata, coerceSelector never-heals-into-non-whitelist invariant, framework-onClick destructive exclusion, query-URL redaction, ambiguous-pathname refusal). Tests 165 -> 176. Verified live: generate against a dashboard still inventories 8 elements, tells a 3-beat story, and auto-picks music. * Make the record-reproducibility timing check robust to CI scheduling jitter Events ride the observed wall clock since clock unification, so exact per-event timestamps are deliberately not reproducible — only structure and geometry are (asserted separately as the hard invariant). The old per-event |Δ| <= 250ms bound was testing wall jitter, not a real contract, and a single scheduling hiccup on a contended CI runner spiked one event past it (150ms, then 250ms, both overshot by ~10ms). Replaced with a mean-per-event-drift <= 150ms check: robust to lone outliers, still catches gross desync. * Widen the reproducibility timing guard to a gross-divergence bound (400ms mean) CI's SwiftShader/2-core runner shows ~170ms mean per-event wall-clock drift between identical seeded runs vs ~10ms locally — inherent to the observed-clock event stamping, not a regression. The 150ms mean bound was still too tight for that environment. 400ms only trips if the two runs diverge catastrophically; structural/geometric identity remains the real reproducibility assertion. * Close the two blocking review gaps: unprovable destructive keep, URL-key redaction Destructive filter — remove the slug-keep exception entirely. There is no reliable way to prove an element has no click handler from page context (addEventListener bindings are invisible to the DOM; getEventListeners is devtools-only), so a <div data-testid>delete-worker</div> wired via addEventListener with no cursor:pointer/tabindex/role/onclick could still survive the interactivity heuristic. Now ANY element whose label hits the destructive lexicon is excluded, full stop — a passive row that merely shares a name with a verb ('checkout-api', 'delete-log-2024') is excluded too; --allow-destructive re-includes them. Dropped CONTENT_SLUG_RE, withoutSlugTokens, INTERACTIVE_ROLES and the cursor/tabindex probe. Verified live: the real Pulse dashboard still yields 7 filmable elements and a 3-beat story with only the checkout-api rows excluded. URL redaction — page URLs are validation KEYS (scene.entry.url must round-trip exactly against the raw crawled URL), so redacting them in the prompt broke the recipe gate on any token-bearing URL and the retry feedback re-leaked the raw URL. Revert URL redaction in the analyze/script prompts (titles, headings, and element text stay redacted — they're display-only). Instead, drop any page whose settled URL itself carries a secret at crawl time (new pageUrlHasSecret) with a clear warning, and fail the run with a plain message if the target URL is itself a credential — so a token-URL never reaches the prompt to leak, and normal URLs round-trip unchanged. Tests 176 -> 178 (pageUrlHasSecret unit test, credential-URL crawl refusal e2e, framework-onClick + passive-slug exclusion, URL round-trip). Live generate verified.
1 parent 6f031b6 commit bd47f71

25 files changed

Lines changed: 1343 additions & 137 deletions

.github/workflows/ci.yml

Lines changed: 6 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -11,11 +11,16 @@ permissions:
1111
jobs:
1212
typecheck-and-unit:
1313
runs-on: ubuntu-latest
14+
strategy:
15+
matrix:
16+
# exercise the advertised engines floor (>=20) alongside current LTS, so
17+
# a Node-20-incompatible API can't slip in against the stated support
18+
node-version: [20, 22]
1419
steps:
1520
- uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
1621
- uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020 # v4.4.0
1722
with:
18-
node-version: 22
23+
node-version: ${{ matrix.node-version }}
1924
cache: npm
2025
- run: npm ci
2126
- run: npm run typecheck

README.md

Lines changed: 10 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -203,7 +203,8 @@ Procedural palettes (generated at render time, no asset): `aurora`, `midnight`,
203203

204204
## 🎵 Music
205205

206-
Videos are silent by default. `--music` (on `render` and `generate`) muxes a looped,
206+
`render` is silent by default; on `generate` the AI director picks the bundled track
207+
matching your app's look. `--music` (on `render` and `generate`) muxes a looped,
207208
loudness-normalized track with fade-in/out under the video — never re-encoding the
208209
video and never changing its length:
209210

@@ -216,14 +217,15 @@ supercut render --take out/take --music path/to/your-track.mp3 # your own fi
216217
Bundled tracks (in `assets/music/` — original instrumentals made for supercut;
217218
provenance in `assets/music/CREDITS.md`):
218219

219-
| track | vibe |
220-
| ---------- | ------------ |
221-
| `pulse` | minimal-tech |
222-
| `daybreak` | warm |
223-
| `midnight` | cinematic |
224-
| `momentum` | energetic |
220+
| track | vibe |
221+
| ---------- | ----------------------- |
222+
| `pulse` | minimal tech-house |
223+
| `daybreak` | bright melodic house |
224+
| `midnight` | dark synthwave/techno |
225+
| `momentum` | driving minimal techno |
225226

226-
`--music off` (or omitting the flag) keeps the video silent.
227+
`--music off` forces a silent cut; on `render`, omitting the flag does too. `--music`
228+
always outranks the director's pick on `generate`.
227229

228230
## 🔒 Privacy
229231

assets/music/CREDITS.md

Lines changed: 23 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -1,18 +1,30 @@
11
# Bundled music — provenance & license
22

3-
All four tracks are **original instrumental works produced for supercut**.
4-
They were generated with MiniMax Music 2.5 from original style prompts written
5-
for this project, then post-processed with ffmpeg (crossfade extension to
6-
~90–110s beds and loudness normalization to −16 LUFS). No pre-existing songs,
7-
samples, or melodies were referenced or used as input.
3+
All four tracks are **original instrumental works produced for supercut**,
4+
synthesized from pure oscillators (sub bass, drum machine, filtered noise
5+
hats, saw-wave pads, arpeggios) by `tools/synth-music.py` — there is no vocal
6+
source, no sample, and no pre-existing song anywhere in the signal chain, so
7+
the tracks cannot contain vocals.
8+
9+
**Reproducing the beds end-to-end.** `tools/synth-music.py` regenerates
10+
equivalent beds (not necessarily bit-identical) from scratch. It needs:
11+
12+
* Python packages `numpy` and `scipy``pip install numpy scipy`
13+
* `ffmpeg` on your PATH
14+
15+
Run `python3 tools/synth-music.py assets/music`. For each mood it synthesizes a
16+
short WAV loop, then ffmpeg self-crossfades that loop to ~92s
17+
(`acrossfade=d=1` ×3 → `atrim=0:92`), loudness-normalizes it
18+
(`loudnorm=I=-15:TP=-1.5:LRA=9`), and encodes a 192 kbit/s 44.1 kHz stereo MP3 —
19+
the exact pipeline that produced the checked-in beds.
820

921
To the extent the maintainers hold any rights in these recordings, they are
1022
dedicated to the public domain under [CC0 1.0](https://creativecommons.org/publicdomain/zero/1.0/).
1123
Use them in your videos — commercial or not — with no attribution required.
1224

13-
| track | vibe | length | bpm |
14-
| -------------- | ---------------------------- | ------ | ---- |
15-
| `pulse.mp3` | minimal tech, sleek | 100s | ~104 |
16-
| `daybreak.mp3` | warm piano, optimistic | 93s | ~92 |
17-
| `midnight.mp3` | cinematic ambient, premium | 110s | ~80 |
18-
| `momentum.mp3` | driving electronic, punchy | 100s | ~122 |
25+
| track | vibe | length | bpm |
26+
| -------------- | ----------------------------- | ------ | ---- |
27+
| `pulse.mp3` | minimal tech-house, sleek | 95s | ~104 |
28+
| `daybreak.mp3` | bright melodic house, upbeat | 95s | ~110 |
29+
| `midnight.mp3` | dark synthwave/techno, premium| 100s | ~100 |
30+
| `momentum.mp3` | driving minimal techno | 95s | ~122 |

assets/music/daybreak.mp3

-29.2 KB
Binary file not shown.

assets/music/midnight.mp3

-423 KB
Binary file not shown.

assets/music/momentum.mp3

-215 KB
Binary file not shown.

assets/music/pulse.mp3

-189 KB
Binary file not shown.

examples/demo.recipe.json

Lines changed: 10 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -1,30 +1,29 @@
11
{
22
"version": 0,
33
"app_url": "http://127.0.0.1:4173",
4-
"music_track": "institutional-01",
4+
"music_track": "daybreak",
55
"scenes": [
66
{
7-
"name": "landing-cta",
7+
"name": "the-pitch-and-signup",
88
"priority": 1,
99
"entry": { "url": "http://127.0.0.1:4173/", "prelude": [] },
1010
"depends_on": [],
1111
"actions": [
12-
{ "kind": "click", "selector": "#cta", "duration_ms": 1800 },
13-
{ "kind": "type", "selector": "#email", "text": "ada@lumon.dev", "duration_ms": 2200 },
14-
{ "kind": "click", "selector": "#join", "duration_ms": 1400 }
12+
{ "kind": "click", "selector": "#cta", "duration_ms": 1400 },
13+
{ "kind": "type", "selector": "#email", "text": "ada@lumon.dev", "duration_ms": 1900 },
14+
{ "kind": "click", "selector": "#join", "focus_selector": "#signup", "duration_ms": 1500 }
1515
],
16-
"hold_ms": 800
16+
"hold_ms": 900
1717
},
1818
{
19-
"name": "dashboard",
19+
"name": "the-live-product",
2020
"priority": 2,
21-
"entry": { "url": "http://127.0.0.1:4173/dash", "prelude": [] },
21+
"entry": { "url": "http://127.0.0.1:4173/dash/", "prelude": [] },
2222
"depends_on": [],
2323
"actions": [
24-
{ "kind": "hover", "selector": "#task-ship", "duration_ms": 1600 },
25-
{ "kind": "wait", "duration_ms": 1200 }
24+
{ "kind": "hover", "selector": "#task-ship", "focus_selector": "#tasks", "duration_ms": 1600 }
2625
],
27-
"hold_ms": 600
26+
"hold_ms": 1000
2827
}
2928
]
3029
}

examples/pandora-demo.recipe.json

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
11
{
22
"version": 0,
33
"app_url": "http://127.0.0.1:8455",
4-
"music_track": "institutional-01",
4+
"music_track": "pulse",
55
"scenes": [
66
{
77
"name": "trace-one-company",

examples/pulse-demo.recipe.json

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,20 @@
1+
{
2+
"version": 0,
3+
"app_url": "http://127.0.0.1:4100",
4+
"music_track": "midnight",
5+
"scenes": [
6+
{
7+
"name": "triage-the-fleet",
8+
"priority": 1,
9+
"entry": { "url": "http://127.0.0.1:4100/", "prelude": [] },
10+
"depends_on": [],
11+
"actions": [
12+
{ "kind": "click", "selector": "[data-testid=\"service-search\"]", "duration_ms": 900 },
13+
{ "kind": "type", "selector": "[data-testid=\"service-search\"]", "text": "payments", "submit": true, "focus_selector": "[data-testid=\"metrics-panel\"]", "duration_ms": 1700 },
14+
{ "kind": "click", "selector": ":nth-match([data-testid=\"service-item\"], 1)", "focus_selector": "[data-testid=\"kpi-row\"]", "duration_ms": 1500 },
15+
{ "kind": "click", "selector": ":nth-match([data-testid=\"service-item\"], 3)", "focus_selector": "[data-testid=\"metrics-panel\"]", "duration_ms": 1500 }
16+
],
17+
"hold_ms": 900
18+
}
19+
]
20+
}

0 commit comments

Comments
 (0)