image-prompt-expander

View Live Demo: see example outputs from the generator

A procedural image prompt generator that creates varied, high-quality prompts for FLUX.2 image models, with optional local image generation using mflux.

User prompt → LLM generates Tracery grammar → Tracery produces N prompts → (optional) mflux generates images

Requirements

System:

Python 3.10+
macOS with Apple Silicon (M1/M2/M3/M4) for image generation
LM Studio running locally

Recommended Hardware:

M4 Max with 36GB+ unified memory for concurrent generation + enhancement
M1/M2/M3 with 16GB+ works but may require --enhance-after for large batches
Generation speed: ~2-4 images/minute (z-image-turbo, 864x1152)

Python Dependencies:

openai - LM Studio API client
click - CLI framework
tracery - Grammar expansion
mflux - Image generation (optional, Apple Silicon only)
fastapi + uvicorn + sse-starlette - Web UI server

Installation

git clone https://github.com/fabian20ro/image-prompt-expander.git
cd image-prompt-expander

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Important: Activate the virtual environment in each new terminal session:

source venv/bin/activate

Then install LM Studio, download a model (e.g., Qwen 2.5 7B, Llama 3.1 8B), and start the local server.

Usage

Web UI (Recommended)

Start the interactive web interface:

python src/cli.py --serve

This opens http://localhost:8000 with:

New Generation Form: Queue grammar + prompt generation (prompt, prefix, count, model, temperature, cache, tiled VAE)
Gallery Browser: View and manage all existing galleries
Live Progress + Logs: Real-time status and worker log streaming via SSE
Queue Management: Queue multiple operations, clear pending queue, kill running tasks
Gallery Deletion: Queue deletion for active galleries from the index page
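The live progress stream relies on Server-Sent Events, where each message is plain text made of `event:` and `data:` fields followed by a blank line. As a rough illustration only, the sketch below encodes one such message with the standard library. The event name `progress` and the payload fields are hypothetical; the actual events emitted by the server may differ.

```python
import json


def format_sse(data, event=None, event_id=None):
    """Encode one Server-Sent Events message as text.

    The payload is serialized as JSON; the event name and payload shape
    used by the real server are not documented here, so treat any values
    passed in as illustrative.
    """
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # Every line of the payload needs its own "data:" field.
    for line in json.dumps(data).splitlines():
        lines.append(f"data: {line}")
    # A blank line terminates the message.
    return "\n".join(lines) + "\n\n"


if __name__ == "__main__":
    # Hypothetical progress update for a running task.
    print(format_sse({"status": "running", "done": 3, "total": 10}, event="progress"), end="")
```

A browser client would typically consume such a stream with `EventSource` and parse each `data` field as JSON.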

Image generation and enhancement settings are configured per-gallery (not from the index form).

Gallery pages include:

Edit Grammar: Modify Tracery grammar and regenerate prompts
Generate Images: Queue individual or all images for generation
Enhance Images: Apply SeedVR2 enhancement to individual or all images
Save to Archive: Manually backup the current gallery state
Kill/Clear: Stop current task or clear pending queue
Back to Index: Navigate back to the master index

Auto-Backup: The system automatically creates backups before destructive operations (regenerating prompts when images exist, enhancing all images). Archives are saved as flat PNG files in generated/saved/ with metadata embedded in PNG text chunks (prompt, model, settings). Archives appear as image grids in the "Archived Images" section on the index.
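Because archive metadata lives inside the PNG itself, it can be read back without any sidecar file. The following is a minimal standard-library sketch of how text chunks are laid out and parsed. It assumes the metadata is stored in uncompressed `tEXt` chunks and that the keywords are `prompt` and `model`; the project may use different keywords or chunk types.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _chunk(kind, data):
    """Serialize one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def make_png_with_text(metadata):
    """Build a 1x1 grayscale PNG carrying the metadata as tEXt chunks."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)  # 1x1, 8-bit gray
    pixels = zlib.compress(b"\x00\x00")  # filter byte + one black pixel
    body = _chunk(b"IHDR", ihdr)
    for key, value in metadata.items():
        payload = key.encode("latin-1") + b"\x00" + value.encode("latin-1")
        body += _chunk(b"tEXt", payload)
    body += _chunk(b"IDAT", pixels) + _chunk(b"IEND", b"")
    return PNG_SIGNATURE + body


def read_text_chunks(png_bytes):
    """Return {keyword: text} for every tEXt chunk in a PNG byte string."""
    if not png_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    result, pos = {}, len(PNG_SIGNATURE)
    while pos < len(png_bytes):
        length, kind = struct.unpack(">I4s", png_bytes[pos:pos + 8])
        data = png_bytes[pos + 8:pos + 8 + length]
        if kind == b"tEXt":
            key, _, value = data.partition(b"\x00")
            result[key.decode("latin-1")] = value.decode("latin-1")
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
    return result


if __name__ == "__main__":
    # Hypothetical keywords; the real archive may name them differently.
    png = make_png_with_text({"prompt": "a red dragon", "model": "flux2-klein-4b"})
    print(read_text_chunks(png))
```

To inspect a real archived file, pass the result of `Path(...).read_bytes()` to `read_text_chunks`.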

CLI: Basic (Text Prompts Only)

# Generate 50 prompt variations (default)
python src/cli.py -p "a dragon flying over mountains"

# Set the number of variations explicitly with -n
python src/cli.py -p "a cat sleeping on a bookshelf" -n 50

# Preview grammar without creating files
python src/cli.py -p "a cyberpunk city at night" --dry-run

With Image Generation

# Generate prompts AND images (Apple Silicon only)
python src/cli.py -p "a dragon flying over mountains" -n 5 \
    --generate-images \
    --prefix dragon

# Multiple images per prompt
python src/cli.py -p "a mystical forest" -n 10 \
    --generate-images \
    --images-per-prompt 3 \
    --prefix forest

# Limit how many prompts get rendered
python src/cli.py -p "abstract art" -n 100 \
    --generate-images \
    --max-prompts 10 \
    --prefix abstract

Custom Image Settings

# Different model (auto-selects optimized prompt structure)
python src/cli.py -p "portrait of a wizard" -n 5 -i \
    --model flux2-klein-4b \
    --prefix wizard

# Custom resolution and steps
python src/cli.py -p "landscape painting" -n 5 -i \
    --width 1024 --height 768 --steps 8 \
    --prefix landscape

# Reproducible with seed
python src/cli.py -p "abstract pattern" -n 3 -i \
    --seed 42 \
    --prefix pattern

Image Enhancement (SeedVR2)

Enhance generated images with 2x upscaling using SeedVR2. Enhanced images replace the originals:

# Generate images with automatic 2x enhancement
python src/cli.py -p "a cat sleeping" -n 3 -i \
    --enhance \
    --prefix cat

# Adjust enhancement softness (0.0-1.0, default: 0.5)
python src/cli.py -p "portrait" -n 1 -i \
    --enhance --enhance-softness 0.3 \
    --prefix portrait

# Memory-efficient batch enhancement (for large batches)
# Defers enhancement until after all images are generated
python src/cli.py -p "a cat sleeping" -n 50 -i \
    --enhance --enhance-after \
    --prefix cat

Standalone Enhancement

Enhance existing images in-place (replaces originals):

# Enhance a single image
python src/cli.py --enhance-images path/to/image.png

# Enhance all images in a folder
python src/cli.py --enhance-images generated/prompts/myrun/

# Enhance using glob pattern
python src/cli.py --enhance-images "generated/prompts/*/cat_*.png"

# With custom softness
python src/cli.py --enhance-images folder/ --enhance-softness 0.7
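The `--enhance-images` argument accepts a file, a folder, or a glob pattern. One plausible way to normalize those three forms into a list of PNG files is sketched below. The function name `resolve_image_paths` is hypothetical and the CLI's actual resolution logic may differ, for example in whether it descends into subfolders.

```python
import glob
from pathlib import Path


def resolve_image_paths(target):
    """Expand a file, folder, or glob pattern into a sorted list of paths.

    Hypothetical helper: folders are scanned non-recursively for *.png,
    and glob matches are filtered to PNG files.
    """
    path = Path(target)
    if path.is_dir():
        return sorted(path.glob("*.png"))
    if path.is_file():
        return [path]
    # Anything else is treated as a glob pattern.
    matches = glob.glob(target)
    return sorted(Path(p) for p in matches if p.lower().endswith(".png"))
```

Quoting the pattern on the command line, as in the glob example above, keeps the shell from expanding it before the program sees it.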

Resume from Intermediate Steps

# Resume from cached grammar (skip LLM generation)
python src/cli.py --from-grammar generated/grammars/abc123.tracery.json \
    -n 100 --prefix dragon2

# Resume from existing prompts (generate images only)
python src/cli.py --from-prompts generated/prompts/abc123_20260124_122208 \
    --generate-images --images-per-prompt 2

Cleanup

python src/cli.py --clean

CLI Options

Option	Description
`-p, --prompt TEXT`	Image description to generate variations for
`-n, --count INT`	Number of variations (default: 50)
`-o, --output PATH`	Custom output directory
`--prefix TEXT`	Output file prefix (default: "image")
`--dry-run`	Preview grammar only
`--no-cache`	Force regenerate grammar
`--clean`	Remove all generated files
`--base-url TEXT`	LM Studio URL (default: http://localhost:1234/v1)
`--temperature FLOAT`	LLM temperature (default: 0.7)
`--from-grammar PATH`	Resume from existing grammar file (skip LLM generation)
`--from-prompts PATH`	Resume from existing prompts directory (images only)
`-i, --generate-images`	Enable mflux image generation
`--images-per-prompt INT`	Images per prompt (default: 1)
`--max-prompts INT`	Limit prompts to render
`-m, --model`	`z-image-turbo`, `flux2-klein-4b`, `flux2-klein-9b` (default: `flux2-klein-4b`)
`--steps INT`	Inference steps
`--width INT`	Image width (default: 864)
`--height INT`	Image height (default: 1152)
`-q, --quantize`	Quantization: 3, 4, 5, 6, or 8
`--seed INT`	Random seed
`--enhance`	Enable SeedVR2 2x enhancement (replaces original)
`--enhance-softness FLOAT`	Enhancement softness (0.0-1.0, default: 0.5)
`--enhance-after`	Defer enhancement to after all images generated (saves memory)
`--enhance-images PATH`	Enhance existing images in-place (file, folder, or glob)
`--resume`	Skip already-generated images when resuming interrupted runs
`--no-tiled-vae`	Disable tiled VAE decoding (uses more memory, may be faster)
`--serve`	Start interactive web UI at http://localhost:8000
`--port INT`	Port for web UI server (default: 8000)
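The `--resume` option skips images that already exist. Assuming images follow the `{prefix}_{promptIdx}_{imgIdx}.png` naming seen in the output directory (for example `dragon_0_1.png`), the remaining work could be computed roughly as follows. The function name and parameters are hypothetical, not the project's API.

```python
from pathlib import Path


def pending_images(run_dir, prefix, prompt_count, images_per_prompt):
    """List image paths that do not exist yet, in generation order.

    Assumes the naming scheme {prefix}_{promptIdx}_{imgIdx}.png.
    """
    run_dir = Path(run_dir)
    pending = []
    for prompt_idx in range(prompt_count):
        for image_idx in range(images_per_prompt):
            path = run_dir / f"{prefix}_{prompt_idx}_{image_idx}.png"
            if not path.exists():
                pending.append(path)
    return pending
```

An interrupted run would then only need to render the returned paths instead of starting over.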

Output Structure

generated/
├── index.html                # Master index linking all galleries
├── queue.json                # Task queue persistence for web UI
├── grammars/                 # Cached grammars (by prompt hash)
├── prompts/{timestamp}_{hash}/    # Active generation runs
│   ├── dragon_0.txt          # First prompt
│   ├── dragon_0_0.png        # First image (enhanced in-place if --enhance)
│   ├── dragon_0_1.png        # Second image (if --images-per-prompt 2)
│   ├── dragon_1.txt          # Second prompt
│   ├── dragon_1_0.png
│   ├── ...
│   ├── dragon_gallery.html   # Gallery generated dynamically via --serve
│   ├── dragon_grammar.json   # Tracery grammar used
│   └── dragon_metadata.json  # Generation settings
└── saved/                    # Flat archived images
    ├── dragon_20260126_143052_0_0.png  # {prefix}_{timestamp}_{promptIdx}_{imgIdx}.png
    ├── dragon_20260126_143052_1_0.png  # Metadata embedded in PNG text chunks
    └── ...

The master index at generated/index.html provides a unified entry point to browse all generation runs with thumbnails and metadata. Archives appear as image grids in the "Archived Images" section, grouped by prefix and timestamp. Archive metadata (prompt, model, settings) is embedded directly in PNG text chunks for self-contained files. Grammars are cached and reused for identical prompts.
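Since archived files follow the `{prefix}_{timestamp}_{promptIdx}_{imgIdx}.png` pattern, the grouping by prefix and timestamp can be recovered from the filename alone. Here is a small sketch of such a parser; the `ArchivedImage` type and the function name are illustrative, and the timestamp format is inferred from the example names in the tree above.

```python
import re
from datetime import datetime
from typing import NamedTuple


class ArchivedImage(NamedTuple):
    prefix: str
    timestamp: datetime
    prompt_index: int
    image_index: int


# The prefix may itself contain underscores, so the fixed-width timestamp
# and the two trailing indices are matched from the right.
_ARCHIVE_NAME = re.compile(
    r"^(?P<prefix>.+)_(?P<stamp>\d{8}_\d{6})_(?P<prompt>\d+)_(?P<image>\d+)\.png$"
)


def parse_archive_name(filename):
    """Split an archive filename into its parts, or return None if it does not match."""
    match = _ARCHIVE_NAME.match(filename)
    if match is None:
        return None
    return ArchivedImage(
        prefix=match["prefix"],
        timestamp=datetime.strptime(match["stamp"], "%Y%m%d_%H%M%S"),
        prompt_index=int(match["prompt"]),
        image_index=int(match["image"]),
    )


if __name__ == "__main__":
    print(parse_archive_name("dragon_20260126_143052_1_0.png"))
```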

How It Works

Grammar Generation - Your prompt is sent to a local LLM (recommended: GLM-4.7-Flash via LM Studio) with model-specific instructions to create a Tracery grammar. The grammar locks elements you specified and varies everything else. Different models use different prompt structures (camera-first for z-image-turbo, prose-based for flux2-klein).
Prompt Expansion - The grammar is expanded N times, randomly selecting from options to create diverse but coherent prompts.
Image Generation (optional) - Each prompt is rendered using mflux on Apple Silicon.
Image Enhancement (optional) - Images are enhanced in-place with SeedVR2 2x upscaling, replacing the originals with higher quality versions.
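To make the expansion step concrete, here is a toy expander written with the standard library only. It imitates the `#symbol#` substitution style of Tracery but is a simplified stand-in, not the `tracery` package and not the project's `tracery_runner.py`. The grammar rules are invented for illustration: the subject ("a red dragon") is locked in the origin rule while action, place, and lighting vary.

```python
import random
import re

# Toy grammar in Tracery's JSON shape: each symbol maps to a list of options,
# and "#symbol#" inside an option is replaced by a random expansion of it.
# Rule names and texts are invented for this example.
GRAMMAR = {
    "origin": ["a red dragon #action# over #place#, #lighting#"],
    "action": ["flying", "gliding", "circling"],
    "place": ["snowy mountains", "a misty valley"],
    "lighting": ["golden hour", "stormy twilight"],
}

_SYMBOL = re.compile(r"#(\w+)#")


def expand(grammar, text, rng):
    """Recursively replace every #symbol# in text with a random option."""
    def pick(match):
        option = rng.choice(grammar[match.group(1)])
        return expand(grammar, option, rng)
    return _SYMBOL.sub(pick, text)


def generate_prompts(grammar, count, seed=None):
    """Expand the "origin" rule `count` times using a seeded random generator."""
    rng = random.Random(seed)
    return [expand(grammar, "#origin#", rng) for _ in range(count)]


if __name__ == "__main__":
    for prompt in generate_prompts(GRAMMAR, 3, seed=42):
        print(prompt)
```

Every generated prompt starts with the locked subject, and passing the same seed reproduces the same list, which mirrors the "diverse but coherent" goal described in the steps above.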

Supported Models

Model	Parameters	Default Steps	Notes
z-image-turbo	6B	9	Fast, good quality
flux2-klein-4b	4B	4	Very fast, lighter (CLI/API default)
flux2-klein-9b	9B	4	Best quality

Pre-quantized 4-bit versions are used automatically when available.
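The table above can be read as a lookup from model name to default step count, with `--steps` acting as an override. A minimal sketch of that lookup follows; the dictionary and function names are hypothetical, and only the values come from the table.

```python
# Default inference steps per model, copied from the Supported Models table.
MODEL_DEFAULT_STEPS = {
    "z-image-turbo": 9,
    "flux2-klein-4b": 4,
    "flux2-klein-9b": 4,
}
DEFAULT_MODEL = "flux2-klein-4b"  # documented CLI/API default


def resolve_steps(model=DEFAULT_MODEL, steps=None):
    """Return the explicit step count if given, else the model's default."""
    if model not in MODEL_DEFAULT_STEPS:
        raise ValueError(f"unknown model: {model!r}")
    return MODEL_DEFAULT_STEPS[model] if steps is None else steps
```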

Prompt Tips

Be specific about constants: "a RED dragon with GOLDEN eyes"
Describe scene structure: "a warrior standing on a cliff overlooking a battlefield"
Suggest variation dimensions: "a cat in various cozy indoor settings"
Use FLUX-friendly language: lighting ("golden hour"), atmosphere ("epic", "serene")
Front-load important elements (FLUX prioritizes earlier content)

Configuration

Settings can be overridden via environment variables with PROMPT_GEN_ prefix:

# Use different LM Studio instance
export PROMPT_GEN_LM_STUDIO_URL="http://192.168.1.100:1234/v1"

# Change default image dimensions
export PROMPT_GEN_DEFAULT_WIDTH=1024
export PROMPT_GEN_DEFAULT_HEIGHT=768
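As an illustration of how such overrides are commonly resolved, the sketch below reads the three variables shown above and falls back to the defaults listed under CLI Options. The `Settings` class and `load_settings` function are hypothetical and are not taken from the project's `config.py`.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    # Defaults taken from the CLI Options table.
    lm_studio_url: str = "http://localhost:1234/v1"
    default_width: int = 864
    default_height: int = 1152


def load_settings(env=None):
    """Build Settings from PROMPT_GEN_* variables, using defaults when unset."""
    env = os.environ if env is None else env
    defaults = Settings()
    return Settings(
        lm_studio_url=env.get("PROMPT_GEN_LM_STUDIO_URL", defaults.lm_studio_url),
        default_width=int(env.get("PROMPT_GEN_DEFAULT_WIDTH", defaults.default_width)),
        default_height=int(env.get("PROMPT_GEN_DEFAULT_HEIGHT", defaults.default_height)),
    )


if __name__ == "__main__":
    print(load_settings())
```

Accepting the environment as a parameter keeps the function easy to test without mutating `os.environ`.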

Troubleshooting

"Connection refused" - Start LM Studio and ensure the server is running.
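A quick way to confirm whether anything is listening on the LM Studio port is a plain TCP connection attempt. The helper below is an independent diagnostic sketch using only the standard library; it is not part of the project, and it checks only that the port accepts connections, not that a model is loaded.

```python
import socket
from urllib.parse import urlparse


def host_and_port(base_url):
    """Extract (host, port) from a URL, defaulting the port by scheme."""
    parsed = urlparse(base_url)
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    return parsed.hostname, port


def is_reachable(base_url, timeout=2.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    host, port = host_and_port(base_url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


if __name__ == "__main__":
    url = "http://localhost:1234/v1"  # documented default for --base-url
    print(f"{url} reachable: {is_reachable(url)}")
```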

"mflux is required" - Run pip install mflux (requires Apple Silicon).

"Invalid JSON grammar" - Try --no-cache or use a different LLM model.

Slow generation - The first run downloads model weights. Use --steps 4 or --model flux2-klein-4b for speed.

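Out of memory - Use --enhance-after for batch enhancement so that only one model is loaded at a time. Reduce the resolution or use flux2-klein-4b. Keep tiled VAE decoding enabled: --no-tiled-vae uses more memory in exchange for possible speed gains, so avoid it when memory is tight.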

Development

Setup

git clone https://github.com/fabian20ro/image-prompt-expander.git
cd image-prompt-expander
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
pip install -r requirements-dev.txt  # pytest, pytest-asyncio, pytest-cov

Running Tests

source venv/bin/activate

# Run all tests
pytest -v --tb=short

# Run specific test file
pytest tests/test_pipeline.py -v

# Run with coverage
pytest --cov=src --cov-report=html

The test suite currently collects 271 tests covering:

Pipeline orchestration (test_pipeline.py)
Image generation (test_image_generator.py)
Grammar expansion (test_tracery_runner.py)
API routes (test_routes.py)
Background workers (test_worker.py, test_worker_subprocess.py)
Metadata management (test_metadata_manager.py)
Utility functions (test_utils.py, test_gallery.py)

Architecture

src/
├── cli.py                 # CLI entry point
├── pipeline.py            # Core orchestration (PipelineExecutor)
├── metadata_manager.py    # Centralized metadata operations
├── grammar_generator.py   # LLM-based grammar generation
├── tracery_runner.py      # Grammar expansion
├── image_generator.py     # mflux image generation
├── image_enhancer.py      # SeedVR2 enhancement
├── gallery.py             # Gallery HTML generation
├── gallery_index.py       # Master index generation
├── utils.py               # Shared utilities
├── config.py              # Path configuration
└── server/
    ├── app.py             # FastAPI application
    ├── routes.py          # API endpoints
    ├── models.py          # Pydantic models
    ├── worker.py          # Background task processor
    ├── worker_subprocess.py  # Isolated task execution
    └── queue_manager.py   # Task queue management

Key patterns:

MetadataManager: Use for all run metadata operations instead of raw JSON
PipelineConfig: Dataclass for grouping pipeline parameters
FastAPI DI: Routes use Depends() with lru_cache for service singletons
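The dataclass and singleton patterns listed above can be sketched without the web framework. In the example below, the `PipelineConfig` fields are guesses that mirror CLI options and their documented defaults, and `QueueManager` is a hypothetical stand-in for a shared service; neither reflects the project's actual class definitions.

```python
from dataclasses import dataclass
from functools import lru_cache
from typing import Optional


@dataclass
class PipelineConfig:
    """Groups pipeline parameters; fields here are illustrative guesses."""
    prompt: str
    count: int = 50
    prefix: str = "image"
    model: str = "flux2-klein-4b"
    width: int = 864
    height: int = 1152
    seed: Optional[int] = None
    generate_images: bool = False


class QueueManager:
    """Hypothetical stand-in for a shared service such as the task queue."""

    def __init__(self):
        self.pending = []

    def enqueue(self, config):
        """Add a job and return the new queue length."""
        self.pending.append(config)
        return len(self.pending)


@lru_cache(maxsize=None)
def get_queue_manager():
    """Provider that builds the service once and returns the same instance afterwards."""
    return QueueManager()


if __name__ == "__main__":
    config = PipelineConfig(prompt="a dragon flying over mountains", prefix="dragon")
    print(get_queue_manager().enqueue(config))
    print(get_queue_manager() is get_queue_manager())
```

In a FastAPI route, a provider like this would be passed as `Depends(get_queue_manager)`, so every request receives the same cached instance.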

Contributing

Fork the repository
Create a feature branch
Add tests for new functionality
Ensure all tests pass: pytest -v --tb=short
Submit a pull request

Credits

Fifty Shades Generator by Lisa Wray - original inspiration
Tracery by Kate Compton - grammar expansion library
mflux by Filip Strand - MLX-based image generation for Apple Silicon, including pre-quantized model weights
Z-Image-Turbo by Tongyi-MAI - supported image model (6B parameters)
FLUX models by Black Forest Labs - flux2-klein image models

License

See LICENSE file.

Roadmap

LoRA support for custom styles (if requested)
