Automated workflow for analyzing ripped DVD/Blu-ray content, looking up its metadata, and renaming the files.
# Automated workflow (recommended)
.\Start-MediaProcessing.ps1 -SourcePath "C:\DVDFab\DVDFab13\Output\Video"
The orchestration script runs all stages automatically with interactive approval.
Stage 1: Analyze → Stage 2: Lookup → Stage 3: Review → Stage 4: Rename
(Metadata-First) (Conditional APIs) (User Confirm) (TV/Movie)
Execution Options:
- Start-MediaProcessing.ps1 (recommended): Automated orchestration through all stages
- Manual stages (Scripts/1-*.ps1, Scripts/2-*.ps1, etc.): Fine-grained control for troubleshooting
The workflow uses a metadata-first approach to minimize external API calls and improve performance:
Process:
- Scan embedded metadata using ffprobe (FFmpeg tool)
  - Extracts TITLE tag from MKV/MP4 files
  - DVDFab and similar rippers often embed proper episode info during ripping
  - Example embedded TITLE: "Babylon 5 - s02e12 - Acts of Sacrifice"
- Parse episode information from TITLE tags
  - Matches patterns: "Show - S##E## - Title", "Show s##e## Title", "Show - ##x## - Title"
  - Extracts: Series name, season, episode number, episode title
  - Validates format using regex patterns
- Decision logic:
  - All files complete: Create mapping directly from embedded metadata, skip Claude AI entirely
  - Mixed metadata: Use hybrid approach (see below)
  - No usable metadata: Fall back to full Claude AI analysis
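The parsing and decision steps above can be sketched roughly as follows. This is a minimal illustration, not the code in `1-Analyze-RippedMedia.ps1`: the function names `ConvertFrom-EpisodeTitle` and `Get-MetadataStrategy`, the exact regular expressions, and the `'claude'` label for the fallback case are all assumptions made for the example.

```powershell
# Hypothetical helper: parse one embedded TITLE tag into its parts.
function ConvertFrom-EpisodeTitle {
    param([string]$Title)

    # Assumed patterns, modeled on the three formats listed above.
    $patterns = @(
        '^(?<series>.+?)\s+-\s+s(?<season>\d{1,2})e(?<episode>\d{1,3})\s+-\s+(?<title>.+)$',
        '^(?<series>.+?)\s+s(?<season>\d{1,2})e(?<episode>\d{1,3})\s+(?<title>.+)$',
        '^(?<series>.+?)\s+-\s+(?<season>\d{1,2})x(?<episode>\d{1,3})\s+-\s+(?<title>.+)$'
    )

    foreach ($pattern in $patterns) {
        # -match is case-insensitive by default, so S02E12 and s02e12 both match.
        if ($Title -match $pattern) {
            return [pscustomobject]@{
                Series  = $Matches['series'].Trim()
                Season  = [int]$Matches['season']
                Episode = [int]$Matches['episode']
                Title   = $Matches['title'].Trim()
            }
        }
    }
    return $null   # Malformed or missing TITLE tag
}

# Hypothetical helper: choose a strategy for one directory of TITLE tags.
function Get-MetadataStrategy {
    param([string[]]$Titles)

    $parsed = @($Titles | ForEach-Object { ConvertFrom-EpisodeTitle $_ } | Where-Object { $_ })

    if ($Titles.Count -gt 0 -and $parsed.Count -eq $Titles.Count) { return 'embedded' }
    if ($parsed.Count -gt 0) { return 'hybrid' }
    return 'claude'   # Assumed label for the full Claude AI fallback
}
```

For example, `ConvertFrom-EpisodeTitle 'Babylon 5 - s02e12 - Acts of Sacrifice'` yields series `Babylon 5`, season 2, episode 12, while a directory mixing that title with an unparseable one would be classified as `hybrid`.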
When some files have embedded metadata and others don't:
- Extract metadata from files with valid TITLE tags
- Build context: series name, season, episode list from complete files
- Send only incomplete files to Claude with context (reduces prompt size by 50-90%)
- Merge results: embedded metadata + Claude inference
- Mark directories as metadata_source: "hybrid"
Example (Babylon 5 corpus):
- 42 files: Complete embedded metadata → processed directly
- 2 files: Malformed TITLE tags → sent to Claude with context from the 42 complete files
- Result: 57% smaller Claude prompt, 95% fewer API calls
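One way to picture the hybrid split is the sketch below. It assumes each file has already been reduced to a record with a `Name` and a `Parsed` property (`$null` when the TITLE tag was unusable); that record shape, the function name `Split-HybridWorkload`, and the `'claude'` label are illustrative assumptions rather than the workflow's actual data model.

```powershell
# Hypothetical helper: separate complete files from those that need Claude.
function Split-HybridWorkload {
    param([object[]]$Files)

    $complete   = @($Files | Where-Object { $null -ne $_.Parsed })
    $incomplete = @($Files | Where-Object { $null -eq $_.Parsed })

    # Decide how the directory should be labeled.
    $source = if ($incomplete.Count -eq 0) { 'embedded' }
              elseif ($complete.Count -eq 0) { 'claude' }
              else { 'hybrid' }

    [pscustomobject]@{
        # Context sent along with the incomplete files: what is already known.
        Context = [pscustomobject]@{
            Series        = @($complete | ForEach-Object { $_.Parsed.Series } | Sort-Object -Unique)
            Seasons       = @($complete | ForEach-Object { $_.Parsed.Season } | Sort-Object -Unique)
            KnownEpisodes = @($complete | ForEach-Object {
                'S{0:D2}E{1:D2}' -f $_.Parsed.Season, $_.Parsed.Episode
            })
        }
        SendToClaude   = @($incomplete | ForEach-Object { $_.Name })
        MetadataSource = $source
    }
}
```

In the Babylon 5 example, `SendToClaude` would hold the 2 files with malformed tags, and `Context` would summarize the 42 complete ones.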
- Skip API calls for files with complete embedded metadata
- Only query TVMaze/TMDb for files marked as [Needs Lookup]
- Typical savings: 80-100% reduction in API calls for DVDFab rips
For typical DVDFab rips with embedded metadata:
- Claude API calls: 0-1 (vs. always 1 previously)
- Claude prompt size: Reduced by 50-90% when using hybrid mode
- TVMaze/TMDb calls: 0 (vs. 1 per episode previously)
- Stage 1 execution: ~2-5 seconds (vs. 10-30 seconds)
- Overall workflow: ~15-30 seconds (vs. 1-2 minutes)
ffprobe must be in PATH for embedded metadata extraction:
- Install FFmpeg: https://ffmpeg.org/download.html
- Windows: Use winget, chocolatey, or manual download
  - winget install Gyan.FFmpeg (recommended)
- Verify installation: ffprobe -version
- If not available: Workflow automatically falls back to Claude AI analysis
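As a rough sketch of how the availability check and TITLE extraction might look, the helper below returns `$null` whenever ffprobe is missing or the file has no title, which is the signal to fall back. The function name `Get-EmbeddedTitle` is hypothetical, and reading the tag at container level under the key `title` is an assumption about how the rips are tagged.

```powershell
# Hypothetical helper: return the container-level title tag, or $null if unavailable.
function Get-EmbeddedTitle {
    param([Parameter(Mandatory)][string]$Path)

    # Without ffprobe in PATH there is nothing to extract; the caller falls back.
    if (-not (Get-Command ffprobe -ErrorAction SilentlyContinue)) { return $null }

    # Ask ffprobe for only the value of the format-level "title" tag.
    $ffArgs = @(
        '-v', 'error',
        '-show_entries', 'format_tags=title',
        '-of', 'default=noprint_wrappers=1:nokey=1',
        '-i', $Path
    )
    $output = & ffprobe @ffArgs 2>$null

    if ($LASTEXITCODE -ne 0) { return $null }        # unreadable or missing file
    $title = $output | Select-Object -First 1
    if ([string]::IsNullOrWhiteSpace($title)) { return $null }
    return $title.Trim()
}
```

A caller could treat a `$null` result from every file in a directory as the "no usable metadata" case.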
Troubleshooting:
| Issue | Solution |
|---|---|
| ffprobe not found in PATH | Add FFmpeg bin directory to system PATH, restart terminal |
| No embedded metadata found | DVDFab didn't embed TITLE tags - workflow uses Claude AI fallback |
| Malformed TITLE tags | Hybrid mode sends these to Claude with context from complete files |
MediaProcessing/
├── Start-MediaProcessing.ps1 # Main orchestration script (runs all stages)
├── Test-FullWorkflow.ps1 # Testing script (dry-run mode)
├── README.md # This file
│
├── Config/ # Configuration files
│ ├── claude-pricing.json # Claude API pricing configuration
│ └── config.xml # TMDb API key (gitignored, created by Setup-ApiKey.ps1)
│
├── Modules/ # PowerShell modules
│ ├── MediaProcessing-Common.psm1 # Shared functions (API calls, file ops, etc.)
│ └── MediaProcessing-Logging.psm1 # Logging infrastructure
│
├── Scripts/ # Workflow stage scripts
│ ├── 1-Analyze-RippedMedia.ps1 # Stage 1: AI-powered content analysis
│ ├── 2-Lookup-Metadata.ps1 # Stage 2: Online metadata lookup
│ ├── 3-Review-Metadata.ps1 # Stage 3: User review interface
│ ├── 4-Rename-TVSeries.ps1 # Stage 4a: TV series renaming
│ └── 4-Rename-Movies.ps1 # Stage 4b: Movie renaming
│
├── Tools/ # Setup & maintenance utilities
│ ├── Setup-ApiKey.ps1 # Configure TMDb API key
│ ├── Setup-Pricing.ps1 # Update Claude pricing configuration
│ └── Cleanup-OldMappings.ps1 # Remove old mapping files
│
├── Docs/ # Supporting documentation
│ ├── FUTURE-ENHANCEMENTS.md # Enhancement tracking
│ └── SESSION-SUMMARY-*.md # Session notes
│
├── Prompts/ # Claude Code prompt templates
│ └── analyze-prompt.txt # Stage 1 AI prompt
│
├── Output/ # Timestamped JSON mappings (gitignored)
├── Cache/ # API response cache (gitignored)
└── Logs/ # Daily log files (gitignored)
The workflow uses claude-pricing.json to calculate accurate API costs. The pricing file is automatically checked for staleness (>30 days old).
Pricing File Location: MediaProcessing/Config/claude-pricing.json
Update Pricing:
.\Tools\Setup-Pricing.ps1
The script will:
- Show current pricing and last update date
- Prompt for new pricing values (or press Enter to keep defaults)
- Update the last_updated date to today
- Save the updated configuration
When to Update:
- The pricing file is more than 30 days old (Stage 1 will display a warning)
- Anthropic announces pricing changes for Claude (check https://www.anthropic.com/pricing)
- The pricing file is missing or corrupted
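The 30-day staleness rule can be illustrated with a small check like the one below. It assumes `last_updated` is stored as a `yyyy-MM-dd` string; the real file format is not shown here, and `Test-PricingStale` is a hypothetical name.

```powershell
# Hypothetical helper: is the pricing date older than the allowed age?
function Test-PricingStale {
    param(
        [Parameter(Mandatory)][string]$LastUpdated,   # assumed format: yyyy-MM-dd
        [int]$MaxAgeDays = 30,
        [datetime]$Now = (Get-Date)
    )

    $updated = [datetime]::ParseExact(
        $LastUpdated, 'yyyy-MM-dd', [cultureinfo]::InvariantCulture)

    # Stale means strictly more than MaxAgeDays have passed.
    return (($Now - $updated).TotalDays -gt $MaxAgeDays)
}
```

Passing `-Now` explicitly makes the check easy to exercise with fixed dates.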
Current Pricing (as of Jan 2026):
- Haiku: $0.80/M input, $4.00/M output (fast, simple tasks)
- Sonnet: $3.00/M input, $15.00/M output (balanced, default)
- Opus: $15.00/M input, $75.00/M output (complex reasoning)
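To show how per-million-token rates turn into a cost figure, here is a minimal sketch using the rates listed above. The hashtable layout and the function name `Get-ClaudeCost` are assumptions for illustration; the actual schema of `claude-pricing.json` may differ.

```powershell
# Rates in USD per million tokens, copied from the list above.
$rates = @{
    Haiku  = @{ Input = 0.80;  Output = 4.00 }
    Sonnet = @{ Input = 3.00;  Output = 15.00 }
    Opus   = @{ Input = 15.00; Output = 75.00 }
}

# Hypothetical helper: estimate the cost of one call in USD.
function Get-ClaudeCost {
    param(
        [string]$Model,
        [int]$InputTokens,
        [int]$OutputTokens,
        [hashtable]$Rates
    )

    $rate = $Rates[$Model]
    if (-not $rate) { throw "Unknown model: $Model" }

    $cost = ($InputTokens / 1e6) * $rate.Input + ($OutputTokens / 1e6) * $rate.Output
    return [math]::Round($cost, 4)
}

# 12,000 input tokens and 2,000 output tokens on Sonnet:
# 0.012 * 3.00 + 0.002 * 15.00 = 0.066 USD
Get-ClaudeCost -Model Sonnet -InputTokens 12000 -OutputTokens 2000 -Rates $rates
```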
Purpose: Extract embedded metadata and analyze directory structure to classify content type (TV series, movies, etc.).
Process:
- Scan for embedded metadata using ffprobe (if available)
- Parse TITLE tags to extract episode information
- For files with complete metadata: Create mapping directly (skip Claude AI)
- For files needing inference: Call Claude AI with context from complete files (hybrid mode)
- Generate timestamped JSON mapping file
Input: Root directory containing ripped media files
Output: JSON mapping file with content analysis and metadata sources
Dependencies:
- Optional: FFmpeg (ffprobe) for embedded metadata extraction (recommended)
- Claude Code (only called when metadata is incomplete)
.\Scripts\1-Analyze-RippedMedia.ps1 -SourcePath "C:\DVDFab\DVDFab13\Output\Video"
What to expect:
- Best case: "All files have complete embedded metadata - skipping Claude AI" (0 API calls)
- Hybrid case: "Calling Claude AI for X directories with incomplete files" (reduced prompt)
- Fallback: Full Claude AI analysis (when ffprobe unavailable or no embedded metadata)
Purpose: Conditionally enrich analysis with online metadata for files that need it
Process:
- Skip files with metadata_source: "embedded" (already complete)
- Check local cache for previously fetched API responses
- Query TVMaze/TMDb APIs only for uncached files marked [Needs Lookup]
- Cache API responses for future use (30-day TTL)
- Display efficiency statistics (API calls saved + cache performance)
Input: JSON mapping from Stage 1
Output: Enhanced JSON mapping with metadata
Dependencies: Internet connection, TMDb API key (free - only if TVMaze insufficient)
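The filtering step can be sketched as below. The `metadata_source` value `"embedded"` comes from this document, but the `needs_lookup` property is a hypothetical stand-in for however the mapping records the [Needs Lookup] marker, and `Select-LookupCandidates` is an illustrative name.

```powershell
# Hypothetical helper: keep only the entries that still need an online lookup.
function Select-LookupCandidates {
    param([object[]]$Files)

    $Files | Where-Object {
        # Embedded metadata is already complete, so those files are skipped.
        $_.metadata_source -ne 'embedded' -and $_.needs_lookup
    }
}

# Illustrative mapping entries (field names are assumptions).
$entries = @(
    [pscustomobject]@{ name = 'e01.mkv'; metadata_source = 'embedded'; needs_lookup = $false }
    [pscustomobject]@{ name = 'e02.mkv'; metadata_source = 'embedded'; needs_lookup = $false }
    [pscustomobject]@{ name = 't03.mkv'; metadata_source = 'claude';   needs_lookup = $true }
)

$candidates = @(Select-LookupCandidates -Files $entries)
$saved      = $entries.Count - $candidates.Count   # lookups avoided
```

Here only `t03.mkv` would be looked up, and `$saved` counts the two lookups avoided.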
What to expect:
- Files with embedded metadata are skipped (0 API calls for those files)
- First run: Cache misses, fetches from APIs, caches responses
- Subsequent runs: Cache hits, 90-100% faster (no external API calls)
- Efficiency gains displayed: "API calls saved: X (50%+ reduction)"
- Cache statistics: "Cache hits: X, Hit rate: Y%"
Stage 2 automatically caches API responses to dramatically speed up repeat workflow runs:
How It Works:
- TVMaze and TMDb API responses are cached locally in the Cache/ directory
- Each show/movie has a unique cache key (e.g., tvmaze-Babylon-5, tmdb-tv-Babylon-5-s1)
- Cache entries expire after 30 days (configurable TTL)
- Cache is checked before making any external API calls
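A cache lookup with TTL along these lines might look like the following sketch. The key sanitization, the use of the file's last-write time as the entry's age, and the names `Get-CacheKey` and `Get-CachedResponse` are assumptions; the module's real implementation may store an explicit timestamp instead.

```powershell
# Hypothetical helper: build a file-safe cache key such as "tvmaze-Babylon-5".
function Get-CacheKey {
    param([string]$Provider, [string]$Name)

    # Assumed sanitization: collapse every run of non-alphanumerics into '-'.
    $slug = ($Name -replace '[^A-Za-z0-9]+', '-').Trim('-')
    return "$Provider-$slug"
}

# Hypothetical helper: return the cached object, or $null on a miss or expiry.
function Get-CachedResponse {
    param([string]$CacheDir, [string]$Key, [int]$TtlDays = 30)

    $file = Join-Path $CacheDir "$Key.json"
    if (-not (Test-Path -LiteralPath $file)) { return $null }   # cache miss

    # Assumption: the entry's age is the file's last-write time.
    $age = (Get-Date) - (Get-Item -LiteralPath $file).LastWriteTime
    if ($age.TotalDays -gt $TtlDays) { return $null }           # expired entry

    return (Get-Content -LiteralPath $file -Raw | ConvertFrom-Json)   # cache hit
}
```

A caller would query the external API only when `Get-CachedResponse` returns `$null`, then write the response back under the same key.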
Performance Benefits:
- First run: Normal API calls, responses cached for future use
- Repeat runs: 90-100% faster, zero external API calls for cached shows
- Hit rate tracking: See cache effectiveness in real-time
Cache Management:
# View cache directory
Get-ChildItem .\Cache
# Clear all cache (force fresh lookups)
Remove-Item .\Cache\*.json
# Clear specific show cache
Remove-Item .\Cache\tvmaze-Babylon-5.json
Cache Statistics Example:
Cache Performance:
Cache hits: 12
Cache misses: 0
Hit rate: 100%
External API calls saved by cache: 12
- Get free TMDb API key: https://www.themoviedb.org/settings/api
- Configure it securely (never committed to git):
.\Tools\Setup-ApiKey.ps1 -TMDbApiKey "your-api-key-here"
.\Scripts\2-Lookup-Metadata.ps1 -MappingFile ".\Output\mapping-20260110-153045.json"
How it works:
- Tries TVMaze API first (no key needed, but limited coverage)
- Falls back to TMDb API for comprehensive TV and movie metadata
- API key loaded automatically from config.xml (gitignored)
API Key Priority:
1. -TMDbApiKey parameter (if provided)
2. config.xml file (recommended, auto-loaded)
3. $env:TMDB_API_KEY environment variable
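The priority order amounts to "first non-empty value wins", which can be sketched as below. The function name `Resolve-ApiKey` and its parameter names are hypothetical, and the value read from `config.xml` is passed in as a plain string because the file's format is not described here.

```powershell
# Hypothetical helper: pick the first non-empty key in priority order.
function Resolve-ApiKey {
    param(
        [string]$FromParameter,                        # -TMDbApiKey, if supplied
        [string]$FromConfig,                           # value loaded from config.xml
        [string]$FromEnvironment = $env:TMDB_API_KEY   # environment fallback
    )

    foreach ($candidate in $FromParameter, $FromConfig, $FromEnvironment) {
        if (-not [string]::IsNullOrWhiteSpace($candidate)) { return $candidate }
    }
    return $null   # no key available; only TVMaze can be used
}
```

For instance, `Resolve-ApiKey -FromParameter '' -FromConfig 'cfg-key' -FromEnvironment 'env-key'` returns `cfg-key`.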
Purpose: Review proposed changes and confirm before renaming
Input: Enhanced JSON mapping from Stage 2
Output: User-approved JSON mapping
.\Scripts\3-Review-Metadata.ps1 -MappingFile ".\Output\mapping-20260110-153045.json"
Purpose: Execute file renaming based on approved mapping
Input: Approved JSON mapping from Stage 3
Output: Renamed files
# TV Series
.\Scripts\4-Rename-TVSeries.ps1 -MappingFile ".\Output\mapping-20260110-153045.json"
# Movies
.\Scripts\4-Rename-Movies.ps1 -MappingFile ".\Output\mapping-20260110-153045.json"
Remove mapping files older than specified days (default: 30 days)
.\Tools\Cleanup-OldMappings.ps1 -DaysToKeep 30
See Prompts/analyze-prompt.txt for the expected JSON schema.
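For the mapping cleanup described above, one possible age test reads the timestamp straight from the file name (for example `mapping-20260110-153045.json`). This is only a sketch: `Cleanup-OldMappings.ps1` may well use file modification times instead, and both function names below are hypothetical.

```powershell
# Hypothetical helper: extract the timestamp from "mapping-yyyyMMdd-HHmmss.json".
function Get-MappingTimestamp {
    param([string]$FileName)

    if ($FileName -match '^mapping-(\d{8}-\d{6})\.json$') {
        return [datetime]::ParseExact(
            $Matches[1], 'yyyyMMdd-HHmmss', [cultureinfo]::InvariantCulture)
    }
    return $null   # name does not follow the assumed pattern
}

# Hypothetical helper: is this mapping older than the retention window?
function Test-MappingExpired {
    param([string]$FileName, [int]$DaysToKeep = 30, [datetime]$Now = (Get-Date))

    $stamp = Get-MappingTimestamp $FileName
    if ($null -eq $stamp) { return $false }   # unknown name: leave the file alone
    return ($stamp -lt $Now.AddDays(-$DaysToKeep))
}

# Possible usage (preview only, nothing is deleted with -WhatIf):
# Get-ChildItem .\Output -Filter 'mapping-*.json' |
#     Where-Object { Test-MappingExpired $_.Name } |
#     Remove-Item -WhatIf
```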
Use the orchestration script to run all stages automatically:
# Complete workflow with interactive approval
.\Start-MediaProcessing.ps1 -SourcePath "C:\DVDFab\DVDFab13\Output\Video"
# Preview mode (see what would be renamed without making changes)
.\Start-MediaProcessing.ps1 -SourcePath "C:\DVDFab\DVDFab13\Output\Video" -PreviewOnly
# Skip metadata lookup if files have complete embedded metadata
.\Start-MediaProcessing.ps1 -SourcePath "C:\DVDFab\DVDFab13\Output\Video" -SkipLookup
# Pause between stages for review
.\Start-MediaProcessing.ps1 -SourcePath "C:\DVDFab\DVDFab13\Output\Video" -PauseBetweenStages
# Provide TMDb API key directly
.\Start-MediaProcessing.ps1 -SourcePath "C:\DVDFab\DVDFab13\Output\Video" -TMDbApiKey "your-key"
Features:
- Runs all 4 stages automatically
- Interactive review and approval in Stage 3
- Auto-detects TV series vs. movies and calls appropriate rename scripts
- Comprehensive workflow summary with performance metrics
- Preserves mapping files for reference
- Supports -PreviewOnly for safe preview runs (dry run mode)
- Error handling with graceful workflow abort
When to Use:
- Orchestration script: When you want a streamlined, automated workflow
- Manual stages: When you need fine-grained control or want to troubleshoot individual stages
For fine-grained control, run each stage individually:
# Step 1: Analyze
$mapping = .\Scripts\1-Analyze-RippedMedia.ps1 -SourcePath "C:\Rips\Babylon5"
# Step 2: Lookup metadata
.\Scripts\2-Lookup-Metadata.ps1 -MappingFile $mapping
# Step 3: Review and confirm
.\Scripts\3-Review-Metadata.ps1 -MappingFile $mapping
# Step 4: Rename
.\Scripts\4-Rename-TVSeries.ps1 -MappingFile $mapping # For TV series
.\Scripts\4-Rename-Movies.ps1 -MappingFile $mapping # For movies
- Mapping files are automatically timestamped and stored in Output/
- Use -WhatIf on Stage 4 scripts to preview changes without renaming
- Mapping files older than 30 days can be cleaned up with Tools\Cleanup-OldMappings.ps1