mjmeli/vacation-video-generator

Vacation Recap Video Generator

A native Windows desktop app that turns a folder of vacation photos and videos into a finished MP4 recap — without opening a video editor.

The edit recipe is deliberately simple: photos and videos in chronological order, 1-second cross-dissolves, an opening title card, optional per-section title cards, optional captions, and background music ducked under video audio with a side-chain compressor. That workflow is fully scriptable; this app replaces the manual editor with curation, trim, and one-click render.

Features

  • Native desktop UI — PyWebView window with real OS file dialogs; media stays on disk (no uploads, no Docker).
  • Photo curation — import files or folders, Lightroom star ratings, minimum-rating filter, per-photo include/caption.
  • Video trimming — in/out points with keyboard shortcuts (I/O, J/L, frame step) and HTML5 preview via byte-range streaming.
  • Section markers — inject title cards at any point on the chronological timeline (e.g. "Saguaro National Park / Stop 1").
  • Music — multi-track playlist, drag-to-reorder, waveform trimmer, silence auto-detect, cross-fade between songs, length advisor.
  • Order — chronological timeline view of all included photos and videos; edit or bulk-nudge capture timestamps to fix ordering before render.
  • Settings — 4K/1080p, photo duration, cross-fade, optional Ken Burns, duck level, music cross-fade, codec (H.264/H.265), encoder selection (auto/NVENC/AMF/QSV/software), quality tier.
  • Output — H.264 or H.265/HEVC MP4 with all effects baked in.
  • Project files — MyTrip.recap.json autosaves selections, trims, and settings; recent projects on the start screen.

How it works

  1. Create or open a .recap.json project.
  2. Add photos, videos, and music from wherever they live on disk (C:\Pictures\..., D:\Camera\..., etc.).
  3. Curate: ratings, trims, section markers, captions, title text.
  4. Generate — pick an output path; the engine builds a timeline, renders overlays with Pillow, and encodes the final MP4 with FFmpeg.

Estimated recap length updates live: title duration + photos × photo duration + trimmed video lengths − crossfade overlap. Music loops the last track if short; fades out if long.
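The length formula above can be sketched in a few lines of Python. This is an illustrative sketch, not the app's code: the function name and parameters are hypothetical, and it assumes that every adjacent pair of clips (title card included) overlaps by exactly one crossfade and that section cards are ignored.

```python
def estimate_recap_seconds(
    title_s: float,
    photo_count: int,
    photo_s: float,
    video_lengths_s: list[float],
    crossfade_s: float,
) -> float:
    """Title + photos + trimmed videos, minus one crossfade per adjacent pair."""
    # Assumption: the title card counts as one clip on the timeline.
    clip_count = 1 + photo_count + len(video_lengths_s)
    total = title_s + photo_count * photo_s + sum(video_lengths_s)
    # N clips joined end to end share N - 1 crossfades.
    overlap = crossfade_s * max(clip_count - 1, 0)
    return total - overlap


# 5 s title, ten 4 s photos, two trimmed videos, 1 s cross-dissolves
print(estimate_recap_seconds(5.0, 10, 4.0, [12.5, 30.0], 1.0))  # 75.5
```

Thirteen clips share twelve crossfades here, so 87.5 s of raw material becomes a 75.5 s recap.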


Requirements

End users

Windows 10/11. Download VacationRecap.exe from GitHub Releases (when published). No Python, FFmpeg, or Node install needed.

Developers

Two tools to install globally — that's it:

Tool  Notes
uv    Python version manager + package manager. Installs Python 3.13 automatically from .python-version.
fnm   Node version manager. Installs Node.js automatically from .node-version.

Python 3.13 and Node.js 22 LTS are not manual installs — uv and fnm handle them when you run .\bootstrap.ps1.

FFmpeg and ExifTool require no system install either. imageio-ffmpeg ships an FFmpeg binary inside the Python venv. ffprobe and ExifTool are downloaded into the gitignored vendor\ directory by bootstrap.ps1.


Quick start (Windows)

This is a native Windows app — develop and run it directly on Windows (no Docker). The toolchain is two single-binary tools that install in one line each and keep their state in their own per-user dirs:

# 1. Install uv (Python toolchain) and fnm (Node version manager) — one time, global
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
winget install Schniz.fnm
# (restart the terminal so both land on PATH)

# 2. One-time project setup: Python + Node deps + vendor binaries
.\bootstrap.ps1

# 3. Day-to-day dev loop (backend + Vite + native window, all hot-reloading)
.\dev.ps1

bootstrap.ps1 runs uv sync, installs Node per .node-version, and downloads the vendored ffprobe.exe + ExifTool into vendor\. dev.ps1 then starts the FastAPI backend, the Vite dev server, and the native PyWebView window (pointed at Vite via RECAP_DEV_URL, so OS file dialogs and Svelte HMR both work). Closing the window stops everything.

The sections below cover the same steps in more detail, plus testing and the release build.


First-time setup

1. Install uv and fnm (one time, global)

# uv — Python toolchain (installs Python 3.13 + deps, no system Python needed)
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
# fnm — Node version manager
winget install Schniz.fnm

Restart your terminal, then verify: uv --version and fnm --version. Both are single self-contained binaries; uninstalling later is just removing the binary + its per-user cache dir.

2. Clone and bootstrap

git clone https://github.com/YOUR_USERNAME/vacation-video-generator.git
cd vacation-video-generator

.\bootstrap.ps1

bootstrap.ps1 does everything else:

  • uv sync --extra dev — reads .python-version (3.13) + pyproject.toml, creates .venv/ with all runtime + dev deps. No pip, no manual venv activation (uv run handles it).
  • fnm use --install-if-missing (reads .node-version) + npm install in frontend/.
  • Downloads the vendored ffprobe.exe and ExifTool into vendor/ (gitignored). These run both in dev (via FFPROBE_PATH/EXIFTOOL_PATH, set by dev.ps1) and in the bundled .exe. FFmpeg itself ships inside imageio-ffmpeg — nothing to install.

To re-fetch the vendor binaries on their own: uv run python build/vendor_ffprobe.py and uv run python build/vendor_exiftool.py.


Development

The dev loop: .\dev.ps1

.\dev.ps1

This is the everyday workflow. It launches three processes and wires them together:

  1. Backend — watchfiles runs python -m uvicorn backend.app:create_app --factory --host 127.0.0.1 --port 8000, restarting it on Python edits. (dev.ps1 uses the watchfiles CLI rather than uvicorn --reload: uvicorn's own Windows reloader sends a console Ctrl+C that the OS broadcasts to every process sharing the terminal, which would also tear down the native window.)
  2. Vite — the dev server on http://localhost:5173 with HMR, proxying /api → 127.0.0.1:8000.
  3. Native window (python -m desktop.main) — the PyWebView shell, pointed at the Vite server via RECAP_DEV_URL.

Both sides hot-reload: Svelte via Vite HMR, Python via watchfiles. Native file dialogs work because they run in the window's process through the PyWebView JS bridge (window.pywebview.api, wired up in desktop/main.py) — independent of which process serves the page. Closing the window tears down the backend and Vite.

Running pieces individually

dev.ps1 is just orchestration (it sets FFPROBE_PATH, EXIFTOOL_PATH, and RECAP_DEV_URL for you). To run by hand:

# Backend, fixed port for Vite's proxy (one terminal)
$env:FFPROBE_PATH  = "$PWD\vendor\ffprobe\ffprobe.exe"
$env:EXIFTOOL_PATH = "$PWD\vendor\exiftool\exiftool.exe"
uv run uvicorn backend.app:create_app --factory --port 8000 --reload

# Vite (another terminal)
cd frontend; npm run dev

# The native window, pointed at the Vite server (a third terminal)
$env:RECAP_DEV_URL = "http://localhost:5173"; uv run python -m desktop.main

Window URL: with RECAP_DEV_URL set, the shell skips its in-process backend and loads that URL. Without it (the production/exe path), the shell starts FastAPI on a random free port and serves the built SPA from frontend/dist/ — run npm run build first in that case.
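The URL decision described above could look roughly like the sketch below. Both function names are hypothetical and this is not the code in desktop/main.py. It only illustrates the two branches and the common trick of binding to port 0 so the OS picks a free port.

```python
import socket


def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]


def resolve_window_url(env: dict[str, str]) -> tuple[str, bool]:
    """Return (url, start_backend) for the native window.

    With RECAP_DEV_URL set, load that URL and skip the in-process backend.
    Otherwise pick a free local port for the backend to serve on.
    """
    dev_url = env.get("RECAP_DEV_URL")
    if dev_url:
        return dev_url, False
    return f"http://127.0.0.1:{find_free_port()}", True


print(resolve_window_url({"RECAP_DEV_URL": "http://localhost:5173"}))
```

One caveat of this pattern: the port is released before the server rebinds it, so another process could grab it in between. Passing the already-bound socket to the server avoids that race.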

Browser-direct (http://localhost:5173 in a normal browser) is fine for pure layout/CSS work, but file dialogs and other native features only exist inside the PyWebView window (the window.pywebview.api bridge is absent in a plain browser, so pickers resolve to empty).

Build the frontend (required when running PyWebView without RECAP_DEV_URL)

cd frontend
npm run build

Output goes to frontend/dist/. The PyWebView window (and PyInstaller bundle) serve from this directory.


Testing

uv run pytest

Runs all tests under tests/. The suite covers the timeline model, xfade offset math, Ken Burns keyframe generation, EXIF metadata scanning, playback-speed handling, and the two-stage render pipeline / encoder registry.

uv run pytest -v               # verbose
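For context on the playback-speed tests, here is a minimal sketch of what an atempo decomposition can look like. It is an assumption about the approach, not the engine's code, and the function name is hypothetical. FFmpeg's atempo filter blends samples only for factors up to 2.0, so larger speed changes are commonly split into a chain of factors that each stay within 0.5 to 2.0.

```python
def atempo_chain(speed: float) -> str:
    """Split a playback speed into chained atempo factors within [0.5, 2.0]."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    factors: list[float] = []
    remaining = speed
    while remaining > 2.0:  # peel off 2x steps for fast playback
        factors.append(2.0)
        remaining /= 2.0
    while remaining < 0.5:  # peel off 0.5x steps for slow motion
        factors.append(0.5)
        remaining /= 0.5
    factors.append(remaining)
    return ",".join(f"atempo={factor:g}" for factor in factors)


print(atempo_chain(3.0))   # atempo=2,atempo=1.5
print(atempo_chain(0.25))  # atempo=0.5,atempo=0.5
```

The product of the factors always equals the requested speed, which is the property a unit test would typically assert.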

End-to-end smoke test

The smoke test (tests/test_e2e_smoke.py) builds a real project from committed assets in tests/test-media/, runs the full FFmpeg render pipeline, and checks that test_render.mp4 is created and valid. Use -s so progress lines from the render print to the terminal.

.\smoke.ps1
# equivalent to: uv run pytest tests/test_e2e_smoke.py -v -s

Output is written to tests/test-output/ (test_render.mp4; gitignored). The test uses 1080p settings for a faster encode; allow a few minutes on first run.

Iterating on the assembly step without re-rendering

The render has two stages: stage 1 encodes each segment independently in parallel (the slow part, ~1–2 h for a full project); stage 2 assembles the intermediates with xfade, audio mixing, and final encode (fast, seconds to minutes).

When debugging stage 2 — tweaking audio levels, encoder settings, transitions, etc. — set VRG_REUSE_INTERMEDIATES=1 before launching the app. Stage-1 intermediates are then saved to a stable directory keyed to the output path, and any already-rendered segment is skipped on the next run. The directory is printed to the log at startup.

$env:VRG_REUSE_INTERMEDIATES = "1"
.\dev.ps1   # or however you launch the app

The first run with the flag is still a full render (it builds and keeps the intermediates). Every run after that skips stage 1 and jumps straight to assembly.

Cache invalidation: the key is the output path only — it does not detect changes to resolution, fps, crossfade, trims, captions, or source media. If you change any render setting or clip, delete the printed vrg_inter_reuse_* directory (in %TEMP%) to force a full rebuild, or unset the variable.
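To see why a settings change does not invalidate the cache, consider this sketch of a directory name derived from the output path alone. The hashing scheme and the function name are assumptions for illustration. Only the vrg_inter_reuse_ prefix and the %TEMP% location come from the description above.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def reuse_dir_for(output_path: str) -> Path:
    """Hypothetical: map an output path to a stable intermediates directory."""
    # Only the normalized output path feeds the key. Resolution, fps,
    # trims, and captions are not part of it.
    normalized = os.path.normcase(os.path.abspath(output_path))
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:12]
    return Path(tempfile.gettempdir()) / f"vrg_inter_reuse_{digest}"


print(reuse_dir_for("C:/Videos/trip.mp4"))
```

Two renders to the same output path land in the same directory no matter what else changed, which is why stale segments have to be deleted by hand.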

To turn reuse off again, unset the variable: Remove-Item Env:\VRG_REUSE_INTERMEDIATES. This does not delete intermediates already on disk.

Linting

uv run ruff check .
uv run ruff format --check .

Building the release executable

# 1. Build the Svelte SPA
cd frontend
npm run build
cd ..

# 2. Confirm vendor binaries exist (bootstrap.ps1 fetches these; or run
#    `uv run python build/vendor_ffprobe.py` and `build/vendor_exiftool.py`)

# 3. Run PyInstaller
uv run pyinstaller build/recap.spec

Output: dist/VacationRecap.exe (~180 MB one-file bundle containing Python 3.13, FFmpeg, ExifTool, WebView2 shims, and the built SPA).

build.ps1 wraps these three steps (vendor check → SPA build → PyInstaller) into one command, and .github/workflows/build-exe.yml runs the same build in CI on a tag push — the canonical release path.

First launch unpacks to a temp directory — a 1–2 s startup delay is normal.


Architecture

┌─────────────────────────────────────────────────────────────┐
│  desktop/main.py  — PyWebView window, OS file dialogs,      │
│                     File menu (New / Open / Recent)          │
└──────────────────────────┬──────────────────────────────────┘
                           │  http://127.0.0.1:<random port>
┌──────────────────────────▼──────────────────────────────────┐
│  backend/app.py  — FastAPI: REST + SSE + /media byte-range  │
│                    + static SPA serving                      │
├─────────────────────────────────────────────────────────────┤
│  engine.py        Timeline → FFmpeg filter graph → MP4      │
│  overlays.py      Pillow: title card, section cards,        │
│                           caption pill PNGs (hash-cached)   │
│  metadata.py      EXIF/XMP via ExifTool, ffprobe,           │
│                   thumbnail generation, hash cache          │
│  silence.py       ffmpeg silencedetect → music trim hints   │
│  models.py        Pydantic data model (Project, items, …)   │
└─────────────────────────────────────────────────────────────┘
         ↑ served as static files from frontend/dist/
┌─────────────────────────────────────────────────────────────┐
│  Svelte 4 SPA (frontend/src/)                               │
│    sections/  Start  Title  Photos  Videos  Order  Music    │
│               Sections  Settings  Generate                  │
│    components/ VideoTrim  AudioTrim                         │
│                PhotoLightbox  VideoLightbox                 │
│    api.ts      typed REST client                            │
│    store.ts    writable stores + autosave                   │
└─────────────────────────────────────────────────────────────┘
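
The /media byte-range serving shown in the diagram depends on interpreting the HTTP Range header that the HTML5 video element sends while seeking. The sketch below is a generic single-range parser, not the code in backend/app.py, and the function name is hypothetical.

```python
def parse_byte_range(header: str, file_size: int) -> tuple[int, int]:
    """Parse a single 'bytes=start-end' Range header into inclusive offsets."""
    unit, _, spec = header.partition("=")
    spec = spec.strip()
    if unit.strip() != "bytes" or "," in spec or "-" not in spec:
        raise ValueError("only a single bytes range is supported")
    first, _, last = spec.partition("-")
    if first == "":
        # Suffix form: 'bytes=-N' means the last N bytes of the file.
        length = int(last)
        if length <= 0:
            raise ValueError("suffix length must be positive")
        start, end = max(file_size - length, 0), file_size - 1
    else:
        start = int(first)
        # An open end ('bytes=N-') or an oversized end clamps to the last byte.
        end = min(int(last), file_size - 1) if last else file_size - 1
    if start > end or start >= file_size:
        raise ValueError("unsatisfiable range")
    return start, end


print(parse_byte_range("bytes=500-", 1000))  # (500, 999)
```

A server would answer such a request with status 206 and a Content-Range header of the form bytes start-end/size.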

Engine pipeline

  1. Scan inputs — EXIF/XMP for photos (ExifTool), ffprobe for video/audio; results cached in memory and persisted in project JSON.
  2. Build timeline — selected, trimmed clips sorted chronologically; section markers interleaved at their declared positions.
  3. Render overlays — title cards, section cards (bracket frame + "Stop N" + hero blur background), caption pills — all PNGs via Pillow, content-hash cached.
  4. FFmpeg filter graph (two-stage, parallel) — stage 1 renders each segment independently in parallel (blur-pad, Ken Burns, caption overlay); stage 2 assembles intermediates with xfade chain, video-clip audio, sidechaincompress music ducking, and fade in/out. Encoder is pluggable: NVENC → AMF → QSV → software (auto-detected; quality tier and codec — H.264 or H.265/HEVC — selectable in Settings).
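
Step 2 of the pipeline above can be sketched as a single sort. The Entry type and the idea that a section marker carries a timestamp are assumptions made for this illustration. The real data model lives in models.py and may place markers differently.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Entry:
    when: datetime
    kind: str  # "section", "photo", or "video"
    label: str


def build_timeline(clips: list[Entry], markers: list[Entry]) -> list[Entry]:
    """Sort clips chronologically and interleave section markers."""
    # False sorts before True, so a marker precedes clips sharing its timestamp.
    return sorted([*markers, *clips], key=lambda e: (e.when, e.kind != "section"))


def at(hour: int) -> datetime:
    return datetime(2024, 3, 1, hour)


clips = [Entry(at(14), "video", "hike.mp4"), Entry(at(9), "photo", "sunrise.jpg")]
markers = [Entry(at(9), "section", "Stop 1"), Entry(at(13), "section", "Stop 2")]
for entry in build_timeline(clips, markers):
    print(entry.kind, entry.label)
```

Because the sort key is only the capture time plus a marker flag, editing or nudging a timestamp is enough to move a clip on the timeline.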

Reference output spec: 3840×2160, 30 fps (ntsc=FALSE), 1 s cross-dissolves, fade to black at end.

Key FFmpeg techniques

Effect                        Filter
Aspect-preserving blur pad    split→scale-fill+boxblur bg / scale-fit fg → overlay
Ken Burns                     zoompan=z='1+t*s':x='...':y='...':d=<frames>:fps=<fps>
Cross-dissolve chain          [v0][v1]xfade=fade:duration=1:offset=T (cumulative offsets)
Music ducking                 [music][va]sidechaincompress=threshold=…:ratio=4:attack=200:release=1000
Loudness normalisation        loudnorm=I=-16:TP=-1.5:LRA=11 per clip + amix normalize=0 + alimiter
Silence detect                ffmpeg -af silencedetect=noise=-50dB:d=0.3 -f null -
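
The "cumulative offsets" in the cross-dissolve row follow from the fact that each xfade shortens the running output by one fade duration. The sketch below derives them. The function name and the [vN]/[xN] pad labels are hypothetical, and this is not the engine's filter-graph builder.

```python
def xfade_chain(durations: list[float], fade: float = 1.0) -> str:
    """Build an xfade filter chain for clips labelled [v0], [v1], ..."""
    if len(durations) < 2:
        raise ValueError("need at least two clips to cross-dissolve")
    if any(d <= fade for d in durations):
        raise ValueError("every clip must be longer than the fade")
    parts: list[str] = []
    prev = "[v0]"
    elapsed = 0.0
    for k in range(1, len(durations)):
        elapsed += durations[k - 1]
        # After k - 1 earlier fades the running output is (k - 1) * fade
        # shorter, and this fade starts one fade before that output ends.
        offset = elapsed - k * fade
        out = f"[x{k}]"
        parts.append(
            f"{prev}[v{k}]xfade=transition=fade:duration={fade:g}:offset={offset:g}{out}"
        )
        prev = out
    return ";".join(parts)


print(xfade_chain([4.0, 4.0, 10.0]))
# [v0][v1]xfade=transition=fade:duration=1:offset=3[x1];[x1][v2]xfade=transition=fade:duration=1:offset=6[x2]
```

For three clips of 4 s, 4 s, and 10 s the fades start at 3 s and 6 s, and the final output runs 16 s instead of 18 s.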

VideoTrim keyboard shortcuts

Key                  Action
Space                Play / pause
I                    Set in-point to playhead
O                    Set out-point to playhead
← / →                Step ±1 frame
Shift+← / Shift+→    Step ±1 second
J / L                Jump −10 s / +10 s

Project layout

desktop/
  main.py              PyWebView shell, JS API bridge
backend/
  app.py               FastAPI app (REST + SSE + static SPA)
  engine.py            Timeline → two-stage parallel FFmpeg render
  encoders.py          Pluggable encoder registry (NVENC/AMF/QSV/software, H.264+H.265)
  overlays.py          Pillow PNG renderers
  metadata.py          EXIF / XMP / ffprobe / thumbnail cache
  silence.py           ffmpeg silencedetect wrapper
  winjob.py            Kill-on-close job: ties ffmpeg children to backend lifetime (Windows)
  models.py            Pydantic project data model
frontend/
  src/
    App.svelte          Top-level layout + nav
    api.ts              Typed REST client
    store.ts            Svelte stores + autosave
    sections/           Start  Title  Photos  Videos  Order
                        Music  Sections  Settings  Generate
    components/
      VideoTrim.svelte    Canvas timeline + HTML5 video + keyboard
      AudioTrim.svelte    wavesurfer.js waveform + drag handles
      PhotoLightbox.svelte  Full-screen photo preview
      VideoLightbox.svelte  Full-screen video preview
  dist/                 Built SPA (gitignored; generated by npm run build)
build/
  recap.spec            PyInstaller spec
tests/
  test-media/           Committed assets for the e2e smoke test
  test-output/          Smoke-test render output (gitignored)
  test_models.py        Pydantic model round-trips
  test_metadata.py      EXIF/XMP scanning and ExifTool integration
  test_engine_speed.py  Playback-speed / atempo decomposition
  test_render_pipeline.py  Two-stage pipeline and encoder registry
vendor/
  exiftool/             ExifTool Windows exe (gitignored, fetched by bootstrap.ps1)
  ffprobe/              ffprobe.exe (gitignored, fetched by bootstrap.ps1)
pyproject.toml          Project metadata + uv/pip dependencies
uv.lock                 Locked dependency versions
.python-version         3.13 (read by uv)

Verification checklist

When validating a build end-to-end:

  1. Smoke test — 8 photos (mixed ratings), 4 videos, 2 music tracks. Rating ≥4 filter, video trim via keyboard, music silence auto-detect, title card, Generate → MP4 with cross-dissolves and ducked music, fade out.
  2. Ken Burns — render at strength 0, 0.3, and 1.0; confirm static / subtle / aggressive.
  3. Sections & captions — two markers, custom hero, subtitle override + ↺ restore, caption on a photo and on a video clip (First 3 s mode). Confirm section cards appear at right positions in MP4.
  4. Persistence — close and reopen .recap.json; all selections, trim points, and settings restored.
  5. Distribution — copy VacationRecap.exe to a fresh Windows machine with no dev tools; double-click; confirm end-to-end.

Removing the dev environment

Everything installed by bootstrap.ps1 is isolated and fully removable.

Project-local files (safe to delete at any time)

These are all gitignored. Deleting them has no effect on git history and bootstrap.ps1 will recreate them on next run.

# Python venv (recreated by: uv sync --extra dev)
Remove-Item -Recurse -Force .venv

# Vendored binaries (recreated by: .\bootstrap.ps1)
Remove-Item -Recurse -Force vendor

# Node modules (recreated by: npm --prefix frontend install)
Remove-Item -Recurse -Force frontend\node_modules

# Built SPA (recreated by: npm --prefix frontend run build)
Remove-Item -Recurse -Force frontend\dist

# Smoke-test render output
Remove-Item -Recurse -Force tests\test-output

App data (thumbnail cache and recent-projects list)

The app writes its cache and recent-projects list here — separate from the dev tools and not touched by uninstalling them:

%APPDATA%\VacationRecap\

Delete this folder to remove all cached thumbnails and the recent-projects list. Your .recap.json project files are wherever you saved them and are unaffected.


Out of scope (for now)

  • Beat-synced cuts to music (possible later via librosa)
  • Vision-model "best moment" selection inside clips
  • macOS / Linux distribution (code is largely portable; only Windows .exe is planned)
