AI-powered pipeline for generating YouTube Shorts and Instagram Reels from a text summary.
You provide a 2-3 sentence description of the video topic. The pipeline:
- Researches the topic via Perplexity API for current facts and data
- Writes a script using a local LLM (Ollama) — produces scene-by-scene visual descriptions and narration text
- Generates speech (Kokoro TTS) and video (Wan 2.1, currently stubbed) in parallel
- Assembles everything into a final 9:16 MP4 with synced audio
"AI is transforming coding in 2026"
│
▼
┌──────────────┐
│ Research │ → facts, stats, sources
└──────┬───────┘
▼
┌──────────────┐
│ Script │ → 4 scenes with narration + visual prompts
└──────┬───────┘
┌────┴────┐
▼ ▼
┌─────┐ ┌───────┐
│ TTS │ │ Video │ (parallel)
└──┬──┘ └──┬────┘
└────┬────┘
▼
┌──────────────┐
│ Assembly │ → output/20260201_143022_ai_is.../final.mp4
└──────────────┘
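TTS and video generation are independent once the script exists, so the orchestrator runs them concurrently. A minimal sketch of that fan-out in `pipeline.py`, assuming async providers with hypothetical `synthesize`/`generate` method names (the real signatures live in each stage's Protocol):

```python
import asyncio

# Hypothetical sketch of the parallel step; method names on the providers
# are assumptions -- the real signatures live in each stage's Protocol.
async def run_media_stages(script, tts_engine, video_generator):
    audio, clips = await asyncio.gather(
        tts_engine.synthesize(script),     # narration audio for every scene
        video_generator.generate(script),  # one clip per scene (stubbed today)
    )
    return audio, clips
```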
Quick start:

```bash
# Install
python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"

# Configure
cp config.example.toml config.toml
# Edit config.toml: add your Perplexity API key, adjust Ollama model, etc.

# Ensure Ollama is running
ollama pull mistral

# Generate (dry run: research + script only, no media)
shortgen generate --dry-run "AI is transforming how developers write code in 2026"

# Generate (full pipeline with stub video)
shortgen generate "AI is transforming how developers write code in 2026"
```

Requirements:

- Python 3.11+
- FFmpeg: `brew install ffmpeg`
- Ollama: running locally with a model pulled (e.g., `ollama pull mistral`)
- Perplexity API key: set in `config.toml` or via the `SHORTGEN_PERPLEXITY_API_KEY` env var
For TTS (optional, needed for audio generation):
pip install -e ".[local]" # installs Kokoro TTSRun each pipeline stage individually. After the first command, subsequent stages auto-pick the most recent job — no need to pass paths:
```bash
shortgen script "AI tools for developers in 2026"  # research + write script
shortgen tts                                       # generate audio
shortgen video                                     # generate scene videos
shortgen assemble                                  # combine into final.mp4
```
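Stage commands can resolve "the most recent job" cheaply because job directories carry a sortable `YYYYMMDD_HHMMSS` prefix. A hypothetical version of that lookup (the project's actual helper may differ):

```python
from pathlib import Path

def latest_job_dir(output_root: str = "output") -> Path:
    """Hypothetical helper: the YYYYMMDD_HHMMSS_slug naming convention makes
    lexicographic order match chronological order."""
    jobs = sorted(p for p in Path(output_root).iterdir() if p.is_dir())
    if not jobs:
        raise FileNotFoundError(f"no job directories under {output_root}/")
    return jobs[-1]
```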
You can also point at a specific job directory:

```bash
shortgen tts output/20260201_143022_ai_tools_for_2026/
```

Or run everything end to end:

```bash
shortgen generate "your topic summary"  # runs all stages end-to-end
shortgen generate --dry-run "summary"   # research + script only
shortgen generate --verbose "summary"   # with debug logging
shortgen config                         # show resolved configuration
shortgen --version                      # show version
```

Copy `config.example.toml` to `config.toml` and edit. Key sections:
```toml
[research]
provider = "perplexity"        # research provider

[scriptwriter]
provider = "ollama"            # "ollama" for local LLM
target_scene_count = 4         # scenes per video
target_duration_seconds = 45   # target video length

[tts]
provider = "kokoro"            # local TTS on CPU

[video]
provider = "stub"              # "stub" until Wan 2.1 is implemented
```

Environment variables override config file values:

- `SHORTGEN_PERPLEXITY_API_KEY`
- `ANTHROPIC_API_KEY` (for a future Claude scriptwriter)
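Prefix-based overrides like this are commonly wired with pydantic-settings; the following is a sketch under that assumption, not a description of what `config.py` actually does:

```python
# Assumption: a pydantic-settings wiring like this would give the SHORTGEN_
# prefix behavior; config.py may handle file-vs-env precedence differently.
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_prefix="SHORTGEN_")

    # SHORTGEN_PERPLEXITY_API_KEY in the environment fills this field,
    # taking precedence over the default below.
    perplexity_api_key: str = ""
```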
Each run creates a job directory under output/:
```
output/20260201_143022_ai_is_transforming/
├── job.json         # job metadata + stages_completed tracker
├── research.json    # research findings, sources, raw text
├── script.json      # generated script with scenes
├── tts.json         # TTS metadata (scene timings)
├── audio.wav        # TTS audio
├── scenes/          # per-scene video clips
│   ├── scene_000.mp4
│   ├── scene_001.mp4
│   └── ...
└── final.mp4        # assembled output
```
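The `stages_completed` tracker in `job.json` is what lets the stage commands resume a job where it left off. A quick way to inspect it; only the `stages_completed` key is documented above, so nothing else about the schema is assumed here:

```python
import json
from pathlib import Path

# Inspect a job's progress via its stages_completed tracker.
job_dir = Path("output/20260201_143022_ai_is_transforming")
job = json.loads((job_dir / "job.json").read_text())
print(job["stages_completed"])  # e.g. ["research", "script", "tts"]
```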
Source layout:

```
src/shortgen/
├── cli.py            # Click CLI entry point
├── config.py         # Pydantic config + component factory
├── pipeline.py       # Async orchestrator
├── models.py         # Data models (Scene, Script, TTSResult, etc.)
├── log.py            # Logging setup
├── research/         # Web research providers
│   ├── base.py       # Researcher Protocol
│   └── perplexity.py # Perplexity API
├── scriptwriter/     # Script generation providers
│   ├── base.py       # ScriptWriter Protocol
│   └── ollama.py     # Ollama (local LLM)
├── tts/              # Text-to-speech providers
│   ├── base.py       # TTSEngine Protocol
│   └── kokoro.py     # Kokoro TTS (local, CPU)
├── video/            # Video generation providers
│   ├── base.py       # VideoGenerator Protocol
│   └── stub.py       # Placeholder (colored rectangles)
└── assembly/         # Video assembly providers
    ├── base.py       # Assembler Protocol
    └── ffmpeg.py     # MoviePy + FFmpeg
```
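For a sense of what `assembly/ffmpeg.py` does, here is a stripped-down sketch of the core concatenate-and-mux step using MoviePy 1.x; the real assembler presumably also handles scene timing and 9:16 scaling, so treat this as illustrative:

```python
# Illustrative only: concatenate scene clips and mux in narration audio.
from moviepy.editor import AudioFileClip, VideoFileClip, concatenate_videoclips

def assemble(scene_paths: list[str], audio_path: str, out_path: str) -> None:
    clips = [VideoFileClip(p) for p in scene_paths]
    video = concatenate_videoclips(clips)               # scenes back to back
    video = video.set_audio(AudioFileClip(audio_path))  # overlay narration
    video.write_videofile(out_path, codec="libx264", audio_codec="aac")
```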
Each pipeline stage uses a Protocol interface. To add a new provider:
- Create a new file in the stage's directory (e.g., `tts/elevenlabs.py`)
- Implement the Protocol (e.g., `TTSEngine`; it just needs a matching method signature)
- Add a config model in `config.py`
- Register it in the factory function
- Add a config section to `config.example.toml`
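As a rough illustration, a new provider skeleton might look like the following; the `synthesize` method name and signature are assumptions, so mirror whatever `tts/base.py` actually declares:

```python
# tts/elevenlabs.py -- hypothetical skeleton; method name and signature
# below are assumptions, so match the real TTSEngine Protocol.
from pathlib import Path

class ElevenLabsTTS:
    """Satisfies the TTSEngine Protocol structurally: no inheritance needed,
    just a method whose signature matches the Protocol's."""

    def __init__(self, api_key: str) -> None:
        self.api_key = api_key

    def synthesize(self, text: str, out_path: Path) -> Path:
        # Call the ElevenLabs API here and write the audio to out_path.
        raise NotImplementedError
```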
See docs/extending.md for a step-by-step example.
pip install -e ".[dev]"
pytest # run tests (15 tests)
ruff check . # lint
mypy src/ # type check- Full pipeline scaffold with async orchestration
- Working providers: Perplexity research, Ollama scriptwriter, Kokoro TTS, FFmpeg assembly
- Video generation is stubbed (placeholder colored rectangles with text)
- 30 tests passing, lint clean
See docs/TODO.md for planned work and docs/sessions/ for session history.