Generate videos, images, and music with Google's AI models using simple text prompts
Professional Multi-Modal AI Media Generation Tool with an intuitive terminal interface
# Start the beautiful interactive experience
ai-studio interactive

Interactive Features:
- Smart Media Selection - Choose between Video, Image, or Music generation
- AI Model Showcase - Compare capabilities of different models
- Visual Parameter Config - Real-time validation and previews
- Prompt Writing Guide - Built-in tips for each media type
- Generation Preview - See what you're about to create
- Intelligent Recommendations - AI-powered suggestions
- Automatic Organization - Smart file management by media type

- Video Generation with Google's latest Veo models
- Image Creation with Imagen (coming soon)
- Music Composition with MusicLM (planned)
- Beautiful Interactive Mode for easy media generation
- Smart Organization - Auto-sorts all your generated media
- Fast Generation - Optimized for quick results
- Professional Quality - Using Google's best AI models
- Developer Friendly - Clean CLI with intuitive commands
- Extensible - Easy to add new models and features
# Clone the repository
git clone https://github.com/Abdulrahman-Elsmmany/ai-media-studio-cli.git
cd ai-media-studio-cli
# Install dependencies with UV (recommended)
uv sync
# Alternative: Install with pip
pip install -e .

Create your .env file with Google AI credentials:
# Google AI API Configuration
GOOGLE_CLOUD_PROJECT=your-project-id
GOOGLE_CLOUD_LOCATION=us-central1
GOOGLE_GENAI_USE_VERTEXAI=True
GOOGLE_API_KEY=your-google-api-key
# Google Cloud Storage Configuration
GOOGLE_CLOUD_STORAGE_BUCKET=your-bucket-name
GOOGLE_CLOUD_STORAGE_PATH=videos

# Generate your first video
ai-studio generate video -p "a cinematic sunset over mountains"
# Or use interactive mode (recommended)
ai-studio interactive
# Simple video generation
ai-studio generate video -p "a majestic eagle soaring over mountains"
# With specific model and settings
ai-studio generate video \
--prompt "cinematic drone shot of ocean waves at sunset" \
--model veo3-001 \
--aspect-ratio 16:9

# All generated media is auto-organized:
downloaded_media/
├── videos/ # .mp4, .avi, .mov files
├── images/ # .jpg, .png, .gif files
├── audios/ # .mp3, .wav, .flac files
└── unknown/ # Other file types

# Cinematic landscape video
ai-studio generate video \
--prompt "golden hour cinematic shot of a serene lake with mountains reflected in still water" \
--model veo3-001 \
--aspect-ratio 16:9 \
--resolution 1080 \
--videos 2 \
--duration 8
# Social media vertical video
ai-studio generate video \
--prompt "trendy coffee shop aesthetic with latte art being created" \
--model veo2-001 \
--aspect-ratio 9:16 \
--resolution 720 \
--duration 6

# Extend video from Google Cloud Storage
ai-studio generate video \
--prompt "the butterfly gracefully lands on a blooming flower petal" \
--model veo2-001 \
--extend-video "gs://your-bucket/nature-scene.mp4"
# Extend local video file
ai-studio generate video \
--prompt "the sunset transforms into a starry night sky" \
--model veo2-001 \
--extend-video "./videos/sunset-base.mp4"

# High-resolution artwork
ai-studio generate image \
--prompt "abstract digital art with vibrant colors and geometric patterns" \
--model imagen-3-ultra \
--resolution 2048x2048 \
--style artistic
# Professional photography
ai-studio generate image \
--prompt "modern office interior with natural lighting" \
--model imagen-3-001 \
--resolution 1920x1080 \
--style photorealistic

| Model | Best For | Videos | Duration | Special Features |
|---|---|---|---|---|
| veo2-001 | Creative & Flexible | 4 | 5-8s | Video Extension, Image-to-Video |
| veo3-001 | Professional & Stable | 4 | 8s | AI Prompt Enhancement |
| veo3-preview | Latest Features | 4 | 8s | Image-to-Video, Beta Features |
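
As a rough illustration of how the video table could be checked before a request is sent, the sketch below encodes its rows in Python. It reads the Videos column as a per-request maximum and the Duration column as an allowed range in seconds, both of which are assumptions. `VideoModel`, `VIDEO_MODELS`, and `check_video_request` are hypothetical names and are not taken from the project's `models_config.py`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoModel:
    name: str
    max_videos: int    # assumed meaning of the "Videos" column
    min_duration: int  # seconds
    max_duration: int  # seconds


# Values copied from the table; the structure itself is illustrative only.
VIDEO_MODELS = {
    "veo2-001": VideoModel("veo2-001", max_videos=4, min_duration=5, max_duration=8),
    "veo3-001": VideoModel("veo3-001", max_videos=4, min_duration=8, max_duration=8),
    "veo3-preview": VideoModel("veo3-preview", max_videos=4, min_duration=8, max_duration=8),
}


def check_video_request(model: str, videos: int, duration: int) -> list[str]:
    """Return a list of problems; an empty list means the request fits the table."""
    spec = VIDEO_MODELS.get(model)
    if spec is None:
        return [f"unknown model: {model}"]
    problems = []
    if not 1 <= videos <= spec.max_videos:
        problems.append(
            f"{model} allows 1-{spec.max_videos} videos per request, got {videos}"
        )
    if not spec.min_duration <= duration <= spec.max_duration:
        problems.append(
            f"{model} allows {spec.min_duration}-{spec.max_duration}s clips, got {duration}s"
        )
    return problems
```

For example, `check_video_request("veo2-001", videos=1, duration=6)` returns an empty list, while asking `veo3-001` for a 6-second clip reports a duration problem.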

| Model | Best For | Images | Resolution | Special Features |
|---|---|---|---|---|
| imagen-3-ultra | Ultra High Quality | 12 | Up to 4K | Style Control, Fast Generation |
| imagen-3-001 | Photorealistic | 8 | Up to 2K | Photo-realistic, Face Generation |

| Model | Best For | Length | Quality | Special Features |
|---|---|---|---|---|
| musiclm-v2 | Composition | 30-120s | Hi-Fi | Instrument Control, Genre Specific |

[STYLE] + [SUBJECT] + [ACTION] + [SETTING] + [TECHNICAL] + [MOOD]
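
One way to apply this formula consistently is to assemble prompts from named parts. The helper below is a minimal sketch, not part of the CLI: `build_prompt` is a hypothetical name, and joining the parts with commas is an assumption about what reads well to the model.

```python
def build_prompt(style="", subject="", action="", setting="", technical="", mood=""):
    """Join the non-empty parts in the order STYLE, SUBJECT, ACTION, SETTING, TECHNICAL, MOOD."""
    parts = [style, subject, action, setting, technical, mood]
    # Skip parts that are empty or whitespace-only so no stray commas appear.
    return ", ".join(p.strip() for p in parts if p and p.strip())


prompt = build_prompt(
    style="Cinematic wide shot",
    subject="a lone figure",
    action="walking through a misty forest path",
    setting="golden morning light filtering through ancient trees",
    technical="slow dolly forward",
    mood="mysterious atmosphere",
)
print(prompt)
```

The resulting string can be passed to `ai-studio generate video -p`. Leaving a part empty simply drops it from the prompt.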
# Cinematic
"Cinematic wide shot of a lone figure walking through misty forest path, golden morning light filtering through ancient trees, slow dolly forward, mysterious atmosphere"
# Documentary
"Documentary-style close-up of artisan hands crafting pottery on spinning wheel, natural lighting, steady camera, focused concentration"

# Artistic
"Abstract expressionist painting with bold brushstrokes, vibrant blues and oranges, dynamic composition, oil on canvas texture"
# Photographic
"Professional headshot of businesswoman in modern office, soft natural lighting, shallow depth of field, confident expression"

# Instrumental
"Uplifting piano melody with string accompaniment, major key, 120 BPM, inspiring and motivational mood"
# Ambient
"Ethereal ambient soundscape with nature sounds, gentle synthesizer pads, relaxing meditation music"

❌ "make video" # Too vague, no media type specified
❌ "cool image of stuff" # Lacks specific details
❌ "amazing epic best music" # Over-hyped without substance

ai-media-studio-cli/
├── ai_media_studio_cli/
│   ├── main.py # Unified CLI application
│   ├── ui_components.py # Beautiful UI components
│   ├── models_config.py # Multi-modal AI configurations
│   ├── model_manager.py # Dynamic model handling
│   ├── download.py # Smart media download & organization
│   ├── animations.py # Progress & loading animations
│   └── generators/
│       ├── video.py # Video generation logic
│       ├── image.py # Image generation (coming soon)
│       └── music.py # Music generation (planned)
├── docs/
│   ├── ADDING_NEW_MODELS.md # Developer guide
│   ├── VIDEO_GENERATION.md # Video-specific docs
│   └── ROADMAP.md # Future feature roadmap
├── tests/ # Comprehensive test suite
│   ├── test_download.py # Download functionality tests
│   └── test_models.py # Model integration tests
├── pyproject.toml # Modern Python packaging
└── README.md # This documentation

- Modular architecture for easy extension to new AI models
- Async processing for all media types and downloads
- Smart caching to reduce API costs
- Batch processing for efficient generation workflows
- Memory optimization for large media files
- Plugin system for third-party model integration
- Concurrent downloads with progress tracking
- Automatic file organization by media type
- GCS cleanup to minimize storage costs
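
To make the "concurrent downloads with progress tracking" item concrete, here is a minimal sketch built on the standard library's thread pool. It uses threads rather than asyncio, which may differ from how `download.py` works. `download_all`, `fetch`, and `on_progress` are hypothetical names, and `fetch` stands in for whatever function actually transfers one file.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def download_all(sources, fetch, max_workers=4, on_progress=None):
    """Run fetch(source) for every source in parallel and report progress.

    Returns a dict mapping each source to the value fetch returned for it.
    """
    sources = list(sources)
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its source so results can be labelled.
        futures = {pool.submit(fetch, src): src for src in sources}
        for done, future in enumerate(as_completed(futures), start=1):
            src = futures[future]
            results[src] = future.result()  # re-raises any exception from fetch
            if on_progress is not None:
                on_progress(done, len(sources))
    return results


if __name__ == "__main__":
    # A stand-in fetch that only pretends to download.
    fake_fetch = lambda src: f"saved {src}"
    download_all(
        ["clip-1.mp4", "clip-2.mp4", "cover.png"],
        fake_fetch,
        on_progress=lambda done, total: print(f"{done}/{total} finished"),
    )
```

The progress callback runs in the calling thread as each transfer completes, so it is a natural place to update a progress bar.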

- ✅ Google Veo 2.0 & 3.0 integration
- ✅ Video extension capabilities
- ✅ Professional CLI interface
- ✅ Smart media download & organization
- ✅ Automatic folder structure (videos/, images/, audios/)
- ✅ Concurrent downloads with progress tracking

- Google Imagen integration
- Multiple resolution support
- Style control and customization
- Batch image processing

- Google MusicLM integration
- Genre and style control
- Custom length generation
- Audio format optimization

- Multi-modal workflows (video + music)
- Template system for common use cases
- Cloud storage integration (AWS, Azure)
- API rate limiting and optimization
- Advanced prompt engineering tools

# Required - Google AI Configuration
GOOGLE_CLOUD_PROJECT=your-project-id
GOOGLE_CLOUD_LOCATION=us-central1
GOOGLE_GENAI_USE_VERTEXAI=True
GOOGLE_API_KEY=your-google-api-key
# Required - Google Cloud Storage
GOOGLE_CLOUD_STORAGE_BUCKET=your-bucket-name
GOOGLE_CLOUD_STORAGE_PATH=videos

- Get your Google AI API Key:
  - Visit Google AI Studio
  - Create a new API key for your project
  - Add it to your .env file as GOOGLE_API_KEY
- Configure Google Cloud Project:
  - GOOGLE_CLOUD_PROJECT: Your Google Cloud project ID
  - GOOGLE_CLOUD_LOCATION: Recommended: us-central1
  - GOOGLE_GENAI_USE_VERTEXAI: Set to True for production use

The tool requires a GCS bucket for temporary video storage during generation:
- Create a GCS bucket in your Google Cloud project
- Set environment variables:
  - GOOGLE_CLOUD_STORAGE_BUCKET: Your bucket name (e.g., my-ai-videos)
  - GOOGLE_CLOUD_STORAGE_PATH: Path within bucket (optional, defaults to videos)
- Ensure permissions: Your service account needs the Storage Object Admin role
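
Before the first run it can help to confirm that these settings are actually present. The snippet below is a hedged sketch, not the project's own loader: `load_settings` and `REQUIRED` are hypothetical names, and it assumes the values from `.env` have already been exported into the process environment, since Python does not read `.env` files by itself.

```python
import os

# Variables this README lists as required. GOOGLE_CLOUD_STORAGE_PATH is
# treated as optional, with "videos" as the documented default.
REQUIRED = (
    "GOOGLE_CLOUD_PROJECT",
    "GOOGLE_CLOUD_LOCATION",
    "GOOGLE_GENAI_USE_VERTEXAI",
    "GOOGLE_API_KEY",
    "GOOGLE_CLOUD_STORAGE_BUCKET",
)


def load_settings(env=None):
    """Validate the configuration and return it as a plain dict."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    return {
        "project": env["GOOGLE_CLOUD_PROJECT"],
        "location": env["GOOGLE_CLOUD_LOCATION"],
        # Accept "True", "true", and similar spellings.
        "use_vertexai": env["GOOGLE_GENAI_USE_VERTEXAI"].strip().lower() == "true",
        "api_key": env["GOOGLE_API_KEY"],
        "bucket": env["GOOGLE_CLOUD_STORAGE_BUCKET"],
        "storage_path": env.get("GOOGLE_CLOUD_STORAGE_PATH") or "videos",
    }
```

Calling `load_settings()` with no arguments checks the real environment and raises a `RuntimeError` naming every variable that is missing or empty.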
Generated content is automatically:
- Organized by media type (videos/, images/, audios/)
- Tagged with generation metadata
- Cleaned up from cloud storage (optional)
- Tracked with detailed analytics
- Versioned for iterative workflows
- Downloaded concurrently with progress tracking
- Sorted by file extension into appropriate folders
- Supports 20+ media formats (MP4, JPG, MP3, etc.)

The CLI features an intelligent download system that automatically organizes your generated content:
# Downloads are automatically organized by media type
downloaded_media/
├── videos/ # .mp4, .avi, .mov, .mkv, .wmv, .flv, .webm, .m4v, .3gp
├── images/ # .jpg, .jpeg, .png, .gif, .bmp, .tiff, .svg, .webp, .ico
├── audios/ # .mp3, .wav, .flac, .aac, .ogg, .wma, .m4a, .opus
└── unknown/ # Unrecognized file types

- Concurrent Downloads: Multiple files downloaded simultaneously
- Progress Tracking: Real-time progress bars with ETA
- Resume Support: Automatic retry on network interruptions
- GCS Cleanup: Optional cloud storage cleanup after download
- Memory Efficient: Streaming downloads for large files
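
The extension-to-folder rule shown in the tree above can be sketched in a few lines of Python. This is an illustration of the idea rather than the code in `download.py`: `EXTENSIONS`, `folder_for`, and `organize` are hypothetical names, and matching extensions case-insensitively is an assumption.

```python
import shutil
from pathlib import Path

# Extension lists copied from the folder layout described above.
EXTENSIONS = {
    "videos": {".mp4", ".avi", ".mov", ".mkv", ".wmv", ".flv", ".webm", ".m4v", ".3gp"},
    "images": {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".tiff", ".svg", ".webp", ".ico"},
    "audios": {".mp3", ".wav", ".flac", ".aac", ".ogg", ".wma", ".m4a", ".opus"},
}


def folder_for(filename):
    """Pick the target folder name from the file's final extension."""
    suffix = Path(filename).suffix.lower()
    for folder, extensions in EXTENSIONS.items():
        if suffix in extensions:
            return folder
    return "unknown"


def organize(file_path, root="downloaded_media"):
    """Move one file into root/<folder>/ and return its new path."""
    src = Path(file_path)
    dest_dir = Path(root) / folder_for(src.name)
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / src.name
    shutil.move(str(src), str(dest))
    return dest
```

Only the final suffix is inspected, so a name such as `archive.tar.gz` would land in `unknown/` under this sketch.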
# Disable automatic organization
ai-studio generate video --no-organize
# Custom download directory
ai-studio generate video --output-dir "my-custom-folder"
# Keep files in cloud storage (no cleanup)
ai-studio generate video --keep-cloud-files

We welcome contributions that push the boundaries of AI media generation:
- Video Generation: New models, effects, transitions
- Image Creation: Style transfer, artistic filters
- Music Composition: Instrument separation, rhythm generation
- User Experience: Interface improvements, workflow optimization
- Technical: Performance, architecture, new integrations

- Type hints for all functions across all modules
- Comprehensive docstrings with examples
- Unit tests with >95% coverage for new features
- Integration tests for AI model endpoints
- Performance benchmarks for generation workflows
This project is licensed under the MIT License - see the LICENSE file for complete details.
Third-party acknowledgments:
- Google AI for Veo, Imagen, and MusicLM model access
- Rich for beautiful terminal UI
- Typer for modern CLI framework
- UV for fast Python package management

The future of AI media generation in your terminal
Created with ❤️ by Abdulrahman Elsmmany
Star this repository if it helped you create amazing AI content!
Let's build the future of AI media generation together
