Media Processing Scripts

Automated workflow for analyzing ripped DVD/Blu-ray content, looking up its metadata, and renaming the files.

Quick Start

# Automated workflow (recommended)
.\Start-MediaProcessing.ps1 -SourcePath "C:\DVDFab\DVDFab13\Output\Video"

The orchestration script runs all stages automatically with interactive approval.

Workflow Overview

Stage 1: Analyze     → Stage 2: Lookup    → Stage 3: Review    → Stage 4: Rename
(Metadata-First)       (Conditional APIs)    (User Confirm)       (TV/Movie)

Execution Options:

Start-MediaProcessing.ps1 (recommended): Automated orchestration through all stages
Manual stages (Scripts/1-*.ps1, Scripts/2-*.ps1, etc.): Fine-grained control for troubleshooting

Metadata Extraction Strategy

The workflow uses a metadata-first approach to minimize external API calls and improve performance:

Stage 1: Embedded Metadata Priority

Process:

Scan embedded metadata using ffprobe (FFmpeg tool)
- Extracts TITLE tag from MKV/MP4 files
- DVDFab and similar rippers often embed proper episode info during ripping
- Example embedded TITLE: "Babylon 5 - s02e12 - Acts of Sacrifice"
Parse episode information from TITLE tags
- Matches patterns: "Show - S##E## - Title", "Show s##e## Title", "Show - ##x## - Title"
- Extracts: Series name, season, episode number, episode title
- Validates format using regex patterns
Decision logic:
- All files complete: Create mapping directly from embedded metadata, skip Claude AI entirely
- Mixed metadata: Use hybrid approach (see below)
- No usable metadata: Fall back to full Claude AI analysis
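
The scan and parse steps above can be sketched roughly as follows. This is an illustration only, not the code in 1-Analyze-RippedMedia.ps1: the function names Get-EmbeddedTitle and ConvertFrom-EpisodeTitle and the exact regular expressions are assumptions, written to cover the three TITLE patterns listed above.

```powershell
# Hypothetical helper: read the container-level title tag with ffprobe.
# Returns $null when ffprobe is not on PATH or the file has no title tag.
function Get-EmbeddedTitle {
    param([string]$Path)
    if (-not (Get-Command ffprobe -ErrorAction SilentlyContinue)) { return $null }
    $output = & ffprobe -v error -show_entries format_tags=title `
        -of 'default=noprint_wrappers=1:nokey=1' $Path
    return ($output | Select-Object -First 1)
}

# Hypothetical helper: parse a TITLE string into its parts.
# Patterns are tried in order; -match is case-insensitive by default.
function ConvertFrom-EpisodeTitle {
    param([string]$Title)

    $patterns = @(
        # "Show - S##E## - Title"
        '^(?<series>.+?)\s*-\s*s(?<season>\d{1,2})e(?<episode>\d{1,3})\s*-\s*(?<title>.+)$',
        # "Show - ##x## - Title"
        '^(?<series>.+?)\s*-\s*(?<season>\d{1,2})x(?<episode>\d{1,3})\s*-\s*(?<title>.+)$',
        # "Show s##e## Title"
        '^(?<series>.+?)\s+s(?<season>\d{1,2})e(?<episode>\d{1,3})\s+(?<title>.+)$'
    )

    foreach ($pattern in $patterns) {
        if ($Title -match $pattern) {
            return [pscustomobject]@{
                Series  = $Matches['series'].Trim()
                Season  = [int]$Matches['season']
                Episode = [int]$Matches['episode']
                Title   = $Matches['title'].Trim()
            }
        }
    }
    return $null   # no pattern matched: treat the tag as malformed or missing
}
```

For example, `ConvertFrom-EpisodeTitle 'Babylon 5 - s02e12 - Acts of Sacrifice'` yields Series "Babylon 5", Season 2, Episode 12. A `$null` result would mark the file as needing inference.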

Hybrid Approach (Mixed Metadata)

When some files have embedded metadata and others don't:

Extract metadata from files with valid TITLE tags
Build context: series name, season, episode list from complete files
Send only incomplete files to Claude with context (reduces prompt size by 50-90%)
Merge results: embedded metadata + Claude inference
Mark directories as metadata_source: "hybrid"
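
The decision logic and the context-building step might look roughly like the sketch below. It assumes each file has already been parsed into an object with Series, Season, and Episode properties, or `$null` when its TITLE tag is unusable. The function names and the 'claude' label for the full-fallback case are hypothetical; only "embedded" and "hybrid" appear as metadata_source values in this README.

```powershell
# Classify a directory from its per-file parse results ($null = no usable tag).
function Get-MetadataSource {
    param([object[]]$ParsedTitles)

    $total    = @($ParsedTitles).Count
    $complete = @($ParsedTitles | Where-Object { $null -ne $_ }).Count

    if ($total -gt 0 -and $complete -eq $total) { return 'embedded' }  # skip Claude
    if ($complete -gt 0)                        { return 'hybrid' }    # partial inference
    return 'claude'                                                     # hypothetical label
}

# Build the context sent along with the incomplete files in hybrid mode.
function New-HybridContext {
    param([object[]]$ParsedTitles)

    $known = @($ParsedTitles | Where-Object { $null -ne $_ })
    [pscustomobject]@{
        # Most frequent series name among the complete files
        Series   = ($known | Group-Object Series |
                        Sort-Object Count -Descending |
                        Select-Object -First 1).Name
        Seasons  = @($known | ForEach-Object { $_.Season } | Sort-Object -Unique)
        Episodes = @($known | ForEach-Object { 'S{0:D2}E{1:D2}' -f $_.Season, $_.Episode })
    }
}
```

Passing the known episode list lets the prompt ask only about the gaps, which is where the prompt-size reduction comes from.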

Example (Babylon 5 corpus):

42 files: Complete embedded metadata → processed directly
2 files: Malformed TITLE tags → sent to Claude with context from the 42 complete files
Result: 57% smaller Claude prompt, 95% fewer API calls (42 of 44 files need no lookup)

Stage 2: Conditional API Lookups

Skip API calls for files with complete embedded metadata
Only query TVMaze/TMDb for files marked as [Needs Lookup]
Typical savings: 80-100% reduction in API calls for DVDFab rips

Performance Benefits

For typical DVDFab rips with embedded metadata:

Claude API calls: 0-1 (vs. always 1 previously)
Claude prompt size: Reduced by 50-90% when using hybrid mode
TVMaze/TMDb calls: 0 (vs. 1 per episode previously)
Stage 1 execution: ~2-5 seconds (vs. 10-30 seconds)
Overall workflow: ~15-30 seconds (vs. 1-2 minutes)

Requirements

ffprobe must be in PATH for embedded metadata extraction:

Install FFmpeg: https://ffmpeg.org/download.html
- Windows: Use winget, Chocolatey, or a manual download
- winget install Gyan.FFmpeg (recommended)
Verify installation: ffprobe -version
If not available: Workflow automatically falls back to Claude AI analysis
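
A quick way to check the requirement from a script is sketched below. Test-CommandAvailable is a hypothetical helper name, not a function shipped in Modules/. It relies only on the built-in Get-Command cmdlet.

```powershell
# Returns $true when the named command resolves on PATH (or as a cmdlet/function).
function Test-CommandAvailable {
    param([string]$Name)
    return [bool](Get-Command $Name -ErrorAction SilentlyContinue)
}

if (Test-CommandAvailable 'ffprobe') {
    Write-Host 'ffprobe found: embedded metadata extraction is available.'
} else {
    Write-Host 'ffprobe not found: expect the Claude AI fallback.'
}
```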

Troubleshooting:

| Issue | Solution |
|-------|----------|
| `ffprobe not found in PATH` | Add FFmpeg bin directory to system PATH, restart terminal |
| `No embedded metadata found` | DVDFab didn't embed TITLE tags - workflow uses Claude AI fallback |
| `Malformed TITLE tags` | Hybrid mode sends these to Claude with context from complete files |

Directory Structure

MediaProcessing/
├── Start-MediaProcessing.ps1           # Main orchestration script (runs all stages)
├── Test-FullWorkflow.ps1               # Testing script (dry-run mode)
├── README.md                           # This file
│
├── Config/                             # Configuration files
│   ├── claude-pricing.json            # Claude API pricing configuration
│   └── config.xml                     # TMDb API key (gitignored, created by Setup-ApiKey.ps1)
│
├── Modules/                            # PowerShell modules
│   ├── MediaProcessing-Common.psm1    # Shared functions (API calls, file ops, etc.)
│   └── MediaProcessing-Logging.psm1   # Logging infrastructure
│
├── Scripts/                            # Workflow stage scripts
│   ├── 1-Analyze-RippedMedia.ps1      # Stage 1: AI-powered content analysis
│   ├── 2-Lookup-Metadata.ps1          # Stage 2: Online metadata lookup
│   ├── 3-Review-Metadata.ps1          # Stage 3: User review interface
│   ├── 4-Rename-TVSeries.ps1          # Stage 4a: TV series renaming
│   └── 4-Rename-Movies.ps1            # Stage 4b: Movie renaming
│
├── Tools/                              # Setup & maintenance utilities
│   ├── Setup-ApiKey.ps1               # Configure TMDb API key
│   ├── Setup-Pricing.ps1              # Update Claude pricing configuration
│   └── Cleanup-OldMappings.ps1        # Remove old mapping files
│
├── Docs/                               # Supporting documentation
│   ├── FUTURE-ENHANCEMENTS.md         # Enhancement tracking
│   └── SESSION-SUMMARY-*.md           # Session notes
│
├── Prompts/                            # Claude Code prompt templates
│   └── analyze-prompt.txt             # Stage 1 AI prompt
│
├── Output/                             # Timestamped JSON mappings (gitignored)
├── Cache/                              # API response cache (gitignored)
└── Logs/                               # Daily log files (gitignored)

Configuration

Claude API Pricing

The workflow uses claude-pricing.json to calculate accurate API costs. The pricing file is automatically checked for staleness (>30 days old).

Pricing File Location: MediaProcessing/Config/claude-pricing.json

Update Pricing:

.\Tools\Setup-Pricing.ps1

The script will:

Show current pricing and last update date
Prompt for new pricing values (or press Enter to keep defaults)
Update the last_updated date to today
Save the updated configuration

When to Update:

The pricing file is more than 30 days old (Stage 1 will display warning)
Anthropic announces pricing changes (check https://www.anthropic.com/pricing)
The pricing file is missing or corrupted
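
The 30-day staleness check can be sketched as below. Only the last_updated field is mentioned in this README; the rest of the file's layout is not shown here, so the sketch takes the date as a parameter. Test-PricingStale is a hypothetical name, and the -Now parameter exists only to make the check easy to exercise.

```powershell
# $true when the pricing data is older than the allowed age.
function Test-PricingStale {
    param(
        [datetime]$LastUpdated,
        [int]$MaxAgeDays = 30,
        [datetime]$Now = (Get-Date)
    )
    return (($Now - $LastUpdated).TotalDays -gt $MaxAgeDays)
}

# Possible usage, assuming last_updated is a top-level field in the JSON:
#   $pricing = Get-Content .\Config\claude-pricing.json -Raw | ConvertFrom-Json
#   if (Test-PricingStale -LastUpdated ([datetime]$pricing.last_updated)) {
#       Write-Warning 'Pricing file is more than 30 days old. Run .\Tools\Setup-Pricing.ps1'
#   }
```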

Current Pricing (as of Jan 2026):

Haiku: $0.80/M input, $4.00/M output (fast, simple tasks)
Sonnet: $3.00/M input, $15.00/M output (balanced, default)
Opus: $15.00/M input, $75.00/M output (complex reasoning)
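
As a worked example of how per-million-token rates turn into a cost figure, here is a minimal sketch. Get-ClaudeCostEstimate is a hypothetical function, and the token counts are made up for illustration. The actual calculation in the scripts may differ.

```powershell
# cost = input_tokens / 1,000,000 * input_rate + output_tokens / 1,000,000 * output_rate
function Get-ClaudeCostEstimate {
    param(
        [double]$InputTokens,
        [double]$OutputTokens,
        [double]$InputPerMillion,
        [double]$OutputPerMillion
    )
    $cost = ($InputTokens / 1e6) * $InputPerMillion +
            ($OutputTokens / 1e6) * $OutputPerMillion
    return [math]::Round($cost, 4)
}

# Sonnet rates from the list above, with illustrative token counts:
# 12,000 input tokens  -> 0.012 * 3.00  = 0.036
#  2,000 output tokens -> 0.002 * 15.00 = 0.030
Get-ClaudeCostEstimate -InputTokens 12000 -OutputTokens 2000 `
    -InputPerMillion 3.00 -OutputPerMillion 15.00
```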

Stage 1: Analyze Ripped Media

Purpose: Extract embedded metadata and analyze directory structure to classify content type (TV series, movies, etc.).

Process:

Scan for embedded metadata using ffprobe (if available)
Parse TITLE tags to extract episode information
For files with complete metadata: Create mapping directly (skip Claude AI)
For files needing inference: Call Claude AI with context from complete files (hybrid mode)
Generate timestamped JSON mapping file

Input: Root directory containing ripped media files
Output: JSON mapping file with content analysis and metadata sources
Dependencies:

Optional: FFmpeg (ffprobe) for embedded metadata extraction (recommended)
Claude Code (only called when metadata is incomplete)

.\Scripts\1-Analyze-RippedMedia.ps1 -SourcePath "C:\DVDFab\DVDFab13\Output\Video"

What to expect:

Best case: "All files have complete embedded metadata - skipping Claude AI" (0 API calls)
Hybrid case: "Calling Claude AI for X directories with incomplete files" (reduced prompt)
Fallback: Full Claude AI analysis (when ffprobe unavailable or no embedded metadata)

Stage 2: Lookup Metadata

Purpose: Conditionally enrich analysis with online metadata for files that need it

Process:

Skip files with metadata_source: "embedded" (already complete)
Check local cache for previously fetched API responses
Query TVMaze/TMDb APIs only for uncached files marked [Needs Lookup]
Cache API responses for future use (30-day TTL)
Display efficiency statistics (API calls saved + cache performance)
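
The skip step can be pictured as a simple filter over the mapping entries. The sketch below assumes each entry carries a metadata_source property, as described above. The file property and the function name Select-LookupCandidate are hypothetical, and the real mapping schema (see Prompts/analyze-prompt.txt) may be shaped differently.

```powershell
# Keep only entries that still need an online lookup.
function Select-LookupCandidate {
    param([object[]]$Entries)
    $Entries | Where-Object { $_.metadata_source -ne 'embedded' }
}

$entries = @(
    [pscustomobject]@{ file = 'title_t00.mkv'; metadata_source = 'embedded' }
    [pscustomobject]@{ file = 'title_t01.mkv'; metadata_source = 'embedded' }
    [pscustomobject]@{ file = 'title_t02.mkv'; metadata_source = 'hybrid' }
)

$needLookup = @(Select-LookupCandidate -Entries $entries)
$saved      = $entries.Count - $needLookup.Count
Write-Host ("API calls saved: {0} of {1}" -f $saved, $entries.Count)
```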

Input: JSON mapping from Stage 1
Output: Enhanced JSON mapping with metadata
Dependencies: Internet connection, TMDb API key (free; only needed if TVMaze coverage is insufficient)

What to expect:

Files with embedded metadata are skipped (0 API calls for those files)
First run: Cache misses, fetches from APIs, caches responses
Subsequent runs: Cache hits, 90-100% faster (no external API calls)
Efficiency gains displayed: "API calls saved: X (50%+ reduction)"
Cache statistics: "Cache hits: X, Hit rate: Y%"

API Response Caching

Stage 2 automatically caches API responses to dramatically speed up repeat workflow runs:

How It Works:

TVMaze and TMDb API responses are cached locally in Cache/ directory
Each show/movie has a unique cache key (e.g., tvmaze-Babylon-5, tmdb-tv-Babylon-5-s1)
Cache entries expire after 30 days (configurable TTL)
Cache is checked before making any external API calls
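
A minimal sketch of the key and expiry logic follows. The slug rule (non-alphanumeric runs become a hyphen) is inferred from the example keys above, and using the file's last write time as the entry age is an assumption. The actual cache code in Modules/ may store a timestamp inside the JSON instead. Both function names are hypothetical.

```powershell
# Build a cache key such as "tvmaze-Babylon-5" from a provider prefix and a name.
function Get-CacheKey {
    param([string]$Provider, [string]$Name)
    $slug = ($Name -replace '[^A-Za-z0-9]+', '-').Trim('-')
    return ('{0}-{1}' -f $Provider, $slug)
}

# $true when the cache file exists and is younger than the TTL.
function Test-CacheEntryFresh {
    param([string]$Path, [int]$TtlDays = 30)
    if (-not (Test-Path -LiteralPath $Path)) { return $false }
    $age = (Get-Date) - (Get-Item -LiteralPath $Path).LastWriteTime
    return ($age.TotalDays -lt $TtlDays)
}

# Possible usage: consult the cache before calling an API.
#   $file = Join-Path '.\Cache' ((Get-CacheKey 'tvmaze' 'Babylon 5') + '.json')
#   if (Test-CacheEntryFresh $file) { Get-Content $file -Raw | ConvertFrom-Json }
```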

Performance Benefits:

First run: Normal API calls, responses cached for future use
Repeat runs: 90-100% faster, zero external API calls for cached shows
Hit rate tracking: See cache effectiveness in real-time

Cache Management:

# View cache directory
Get-ChildItem .\Cache

# Clear all cache (force fresh lookups)
Remove-Item .\Cache\*.json

# Clear specific show cache
Remove-Item .\Cache\tvmaze-Babylon-5.json

Cache Statistics Example:

Cache Performance:
  Cache hits: 12
  Cache misses: 0
  Hit rate: 100%
  External API calls saved by cache: 12

Setup (One-time)

Get free TMDb API key: https://www.themoviedb.org/settings/api
Configure it securely (never committed to git):

.\Tools\Setup-ApiKey.ps1 -TMDbApiKey "your-api-key-here"

Usage

.\Scripts\2-Lookup-Metadata.ps1 -MappingFile ".\Output\mapping-20260110-153045.json"

How it works:

Tries TVMaze API first (no key needed, but limited coverage)
Falls back to TMDb API for comprehensive TV and movie metadata
API key loaded automatically from config.xml (gitignored)

API Key Priority:

-TMDbApiKey parameter (if provided)
config.xml file (recommended, auto-loaded)
$env:TMDB_API_KEY environment variable
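
The priority order above amounts to "first non-empty value wins". A minimal sketch, assuming the three candidate values have already been read (how config.xml is loaded is not shown in this README, so `$configKey` below is a placeholder), with Resolve-ApiKey as a hypothetical name:

```powershell
# Return the first candidate that is not null, empty, or whitespace.
function Resolve-ApiKey {
    param([string[]]$Candidates)
    foreach ($candidate in $Candidates) {
        if (-not [string]::IsNullOrWhiteSpace($candidate)) { return $candidate }
    }
    return $null
}

# Order matters: parameter, then config.xml value, then environment variable.
#   $key = Resolve-ApiKey -Candidates @($TMDbApiKey, $configKey, $env:TMDB_API_KEY)
```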

Stage 3: Review Metadata

Purpose: Review proposed changes and confirm before renaming

Input: Enhanced JSON mapping from Stage 2
Output: User-approved JSON mapping

.\Scripts\3-Review-Metadata.ps1 -MappingFile ".\Output\mapping-20260110-153045.json"

Stage 4: Rename Files

Purpose: Execute file renaming based on approved mapping

Input: Approved JSON mapping from Stage 3
Output: Renamed files

# TV Series
.\Scripts\4-Rename-TVSeries.ps1 -MappingFile ".\Output\mapping-20260110-153045.json"

# Movies
.\Scripts\4-Rename-Movies.ps1 -MappingFile ".\Output\mapping-20260110-153045.json"

Cleanup Utility

Remove mapping files older than a specified number of days (default: 30 days)

.\Tools\Cleanup-OldMappings.ps1 -DaysToKeep 30
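
To see which files such a cleanup would touch before deleting anything, something like the sketch below can help. It is not the implementation of Cleanup-OldMappings.ps1: the function name, the mapping-*.json filter (inferred from the example file names in this README), and the use of last write time as the age are all assumptions.

```powershell
# List mapping files whose last write time is older than the retention window.
function Get-ExpiredMappingFile {
    param(
        [string]$Path = '.\Output',
        [int]$DaysToKeep = 30
    )
    $cutoff = (Get-Date).AddDays(-$DaysToKeep)
    Get-ChildItem -LiteralPath $Path -Filter 'mapping-*.json' -File |
        Where-Object { $_.LastWriteTime -lt $cutoff }
}

# Preview only; nothing is deleted while -WhatIf is present:
#   Get-ExpiredMappingFile -Path .\Output -DaysToKeep 30 | Remove-Item -WhatIf
```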

JSON Mapping Format

See Prompts/analyze-prompt.txt for the expected JSON schema.

Common Workflows

Automated Orchestration (Recommended)

Use the orchestration script to run all stages automatically:

# Complete workflow with interactive approval
.\Start-MediaProcessing.ps1 -SourcePath "C:\DVDFab\DVDFab13\Output\Video"

# Preview mode (see what would be renamed without making changes)
.\Start-MediaProcessing.ps1 -SourcePath "C:\DVDFab\DVDFab13\Output\Video" -PreviewOnly

# Skip metadata lookup if files have complete embedded metadata
.\Start-MediaProcessing.ps1 -SourcePath "C:\DVDFab\DVDFab13\Output\Video" -SkipLookup

# Pause between stages for review
.\Start-MediaProcessing.ps1 -SourcePath "C:\DVDFab\DVDFab13\Output\Video" -PauseBetweenStages

# Provide TMDb API key directly
.\Start-MediaProcessing.ps1 -SourcePath "C:\DVDFab\DVDFab13\Output\Video" -TMDbApiKey "your-key"

Features:

Runs all 4 stages automatically
Interactive review and approval in Stage 3
Auto-detects TV series vs. movies and calls appropriate rename scripts
Comprehensive workflow summary with performance metrics
Preserves mapping files for reference
Supports -PreviewOnly for safe preview runs (dry run mode)
Error handling with graceful workflow abort

When to Use:

Orchestration script: When you want a streamlined, automated workflow
Manual stages: When you need fine-grained control or want to troubleshoot individual stages

Manual Stage-by-Stage Workflow

For fine-grained control, run each stage individually:

# Step 1: Analyze
$mapping = .\Scripts\1-Analyze-RippedMedia.ps1 -SourcePath "C:\Rips\Babylon5"

# Step 2: Lookup metadata
.\Scripts\2-Lookup-Metadata.ps1 -MappingFile $mapping

# Step 3: Review and confirm
.\Scripts\3-Review-Metadata.ps1 -MappingFile $mapping

# Step 4: Rename
.\Scripts\4-Rename-TVSeries.ps1 -MappingFile $mapping  # For TV series
.\Scripts\4-Rename-Movies.ps1 -MappingFile $mapping    # For movies

Notes

Mapping files are automatically timestamped and stored in Output/
Use -WhatIf on Stage 4 scripts to preview changes without renaming
Mapping files older than 30 days can be cleaned up with Tools\Cleanup-OldMappings.ps1
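
The timestamp in names like mapping-20260110-153045.json appears to follow a yyyyMMdd-HHmmss layout. A sketch of producing such a name is shown below. New-MappingFileName is a hypothetical helper and the format is inferred from the examples in this README, not taken from the scripts.

```powershell
# Build a mapping file name from a timestamp (defaults to the current time).
function New-MappingFileName {
    param([datetime]$Timestamp = (Get-Date))
    return ('mapping-{0:yyyyMMdd-HHmmss}.json' -f $Timestamp)
}

New-MappingFileName -Timestamp ([datetime]'2026-01-10 15:30:45')
```

Because the fields run from most to least significant, sorting these names alphabetically also sorts them chronologically, which makes it easy to pick the latest mapping in Output/.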
