A CLI tool that automatically generates high-quality MP4 demo videos of websites, using AI-driven decision making to choose how to interact with each page.
- 🤖 AI-Driven Interactions: Uses OpenAI GPT-4+ to intelligently decide website interactions
- 🎬 High-Quality Video: Records 1080p 30fps MP4 videos optimized for web streaming
- 🗣️ Natural Narration: Generates speech using ElevenLabs for professional voiceovers
- 🔐 Automatic Authentication: Handles login forms automatically
- 📝 Transcription Export: Provides text transcripts of all narration
- 🛠️ Robust Error Handling: Comprehensive logging and error management
- Node.js 20 or newer
- FFmpeg (for video processing)
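Both requirements can be checked before installing. The following is a minimal sketch, not part of mkdemo itself: the function names are hypothetical, and it assumes FFmpeg is invoked as `ffmpeg` on your PATH.

```javascript
// Hypothetical preflight helpers; these names are not taken from the mkdemo source.
const { spawnSync } = require('node:child_process');

// Returns true when a version string such as "20.11.1" meets the minimum major version.
function meetsNodeRequirement(version, minMajor = 20) {
  const major = Number.parseInt(String(version).split('.')[0], 10);
  return Number.isInteger(major) && major >= minMajor;
}

// Returns true when `<command> -version` can be spawned and exits with status 0.
function hasFfmpeg(command = 'ffmpeg') {
  const result = spawnSync(command, ['-version'], { stdio: 'ignore' });
  return !result.error && result.status === 0;
}

// Collects human-readable problems instead of exiting, so callers decide what to do.
function checkPrerequisites() {
  const problems = [];
  if (!meetsNodeRequirement(process.versions.node)) {
    problems.push(`Node.js 20 or newer is required (found ${process.versions.node})`);
  }
  if (!hasFfmpeg()) {
    problems.push('FFmpeg was not found in PATH');
  }
  return problems;
}
```

Calling `checkPrerequisites()` returns an empty array when both requirements are met.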
# Using pnpm (recommended)
pnpm install
# Or using npm
npm install
- Copy the environment template:
cp .env.example .env
- Configure your API keys in .env:
# OpenAI Configuration
OPENAI_API_KEY=your_openai_api_key_here
OPENAI_MODEL=gpt-4-turbo-preview
# ElevenLabs Configuration
ELEVENLABS_API_KEY=your_elevenlabs_api_key_here
ELEVENLABS_VOICE_ID=your_preferred_voice_id_here
mkdemo create --user <email> --password <password> --url <website-url>
mkdemo create --user [email protected] --password mypassword --url https://propozio.com
- --user, -u <email>: User email for authentication (required)
- --password, -p <password>: User password for authentication (required)
- --url <url>: Website URL to create demo from (required)
- --output, -o <directory>: Output directory for generated files (default: ./output)
- --verbose, -v: Enable verbose logging
- --max-interactions <number>: Maximum number of interactions (default: 10)
- --headless: Run browser in headless mode (default: true)
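To illustrate how the documented options and defaults fit together, here is a rough sketch using Node's built-in `util.parseArgs`. It is an assumption for illustration only: mkdemo's actual parser in src/cli/ may use a different library, and `parseCreateArgs` is a hypothetical name. The sketch expects the arguments that follow the `create` subcommand.

```javascript
// Illustrative only: mirrors the documented options, not mkdemo's real parser.
const { parseArgs } = require('node:util');

function parseCreateArgs(argv) {
  const { values } = parseArgs({
    // Treat a bare --headless as --headless=true so both documented forms parse.
    args: argv.map((arg) => (arg === '--headless' ? '--headless=true' : arg)),
    options: {
      user: { type: 'string', short: 'u' },
      password: { type: 'string', short: 'p' },
      url: { type: 'string' },
      output: { type: 'string', short: 'o', default: './output' },
      verbose: { type: 'boolean', short: 'v', default: false },
      'max-interactions': { type: 'string', default: '10' },
      headless: { type: 'string', default: 'true' },
    },
  });

  // The three required options have no defaults, so check them explicitly.
  for (const name of ['user', 'password', 'url']) {
    if (!values[name]) throw new Error(`Missing required option --${name}`);
  }
  new URL(values.url); // throws a TypeError when the URL is malformed

  const maxInteractions = Number.parseInt(values['max-interactions'], 10);
  if (!Number.isInteger(maxInteractions) || maxInteractions < 1) {
    throw new Error('--max-interactions must be a positive integer');
  }

  return {
    user: values.user,
    password: values.password,
    url: values.url,
    output: values.output,
    verbose: values.verbose,
    maxInteractions,
    headless: values.headless !== 'false',
  };
}
```

For example, `parseCreateArgs(process.argv.slice(3))` would cover an invocation such as `mkdemo create --user <email> --password <password> --url <url>`, assuming `create` is the first argument after the script name.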
Upon successful completion, the CLI generates:
- demo_<timestamp>.mp4: Main video file
- transcription_<timestamp>.txt: Narration transcript
- mkdemo_<timestamp>.log: Detailed execution log
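The three files share one timestamp, which can be sketched as below. The timestamp format is an assumption (an ISO 8601 string with `:` and `.` replaced so it is filename-safe); the actual format mkdemo uses is not documented here, and `buildOutputPaths` is a hypothetical helper.

```javascript
// Hypothetical helper showing how the three output paths relate to one timestamp.
const path = require('node:path');

function buildOutputPaths(outputDir, date = new Date()) {
  // Assumed format: ISO 8601 with ':' and '.' replaced by '-'.
  const timestamp = date.toISOString().replace(/[:.]/g, '-');
  return {
    video: path.join(outputDir, `demo_${timestamp}.mp4`),
    transcription: path.join(outputDir, `transcription_${timestamp}.txt`),
    log: path.join(outputDir, `mkdemo_${timestamp}.log`),
  };
}
```

Deriving all three names from a single `Date` keeps the video, transcript, and log of one run easy to match up.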
mkdemo/
├── src/
│ ├── cli/ # CLI command handling
│ ├── auth/ # Authentication logic
│ ├── browser/ # Browser automation
│ ├── ai/ # OpenAI integration
│ ├── audio/ # ElevenLabs integration
│ ├── video/ # FFmpeg video processing
│ ├── utils/ # Utilities (logging, filesystem)
│ └── index.js # Main orchestration
├── test/ # Test files
├── output/ # Generated output files
└── README.md
# Run all tests
pnpm test
# Run tests with coverage
pnpm run test:coverage
# Run tests in watch mode
pnpm run test:watch
# Lint code
pnpm run lint
# Fix linting issues
pnpm run lint:fix
# Format code
pnpm run format
# Check formatting
pnpm run format:check
# Run with file watching
pnpm run dev
- CLI Parser (src/cli/): Handles command-line argument parsing and validation
- Browser Manager (src/browser/): Puppeteer-based browser automation
- Authentication Handler (src/auth/): Automatic login form detection and filling
- AI Decision Maker (src/ai/): OpenAI integration for intelligent interactions
- Audio Generator (src/audio/): ElevenLabs speech synthesis
- Video Processor (src/video/): FFmpeg-based video recording and processing
- Utilities (src/utils/): Logging, filesystem operations, and helpers
- Parse CLI arguments and validate configuration
- Initialize browser and navigate to target URL
- Detect and handle authentication if required
- Capture initial page state for AI analysis
- Generate interaction plan using OpenAI
- Execute interactions while recording video
- Generate narration audio for each interaction
- Combine video and audio into final MP4
- Export transcription and logs
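The workflow above can be sketched as a single orchestration function. This is not the code in src/index.js: every step function below is a hypothetical placeholder injected by the caller, so the sketch shows only the ordering and the cleanup logic.

```javascript
// Sketch of the workflow order. All `steps.*` functions are hypothetical
// placeholders supplied by the caller; none are taken from the mkdemo source.
async function runDemoPipeline(options, steps) {
  const browser = await steps.launchBrowser(options);
  try {
    await steps.navigate(browser, options.url);

    // Log in only when the page actually asks for it.
    if (await steps.needsLogin(browser)) {
      await steps.login(browser, options.user, options.password);
    }

    const pageState = await steps.capturePageState(browser);
    const plan = await steps.planInteractions(pageState, options.maxInteractions);

    await steps.startRecording(browser);
    const narration = [];
    // Cap the plan defensively in case the planner returns more steps than allowed.
    for (const interaction of plan.slice(0, options.maxInteractions)) {
      await steps.execute(browser, interaction);
      narration.push(await steps.narrate(interaction));
    }
    const rawVideo = await steps.stopRecording(browser);

    const video = await steps.combine(rawVideo, narration);
    const transcript = narration.map((entry) => entry.text).join('\n');
    return { video, transcript };
  } finally {
    // Always release the browser, even when a step throws.
    await steps.closeBrowser(browser);
  }
}
```

Passing the steps in as an object keeps the ordering testable with stubs, without a real browser or API keys.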
- OpenAI model: GPT-4 Turbo or newer recommended
- OpenAI usage: Interaction planning and narration generation
- OpenAI rate limits: Consider your plan's token limits
- ElevenLabs voice: Choose a professional voice ID
- ElevenLabs usage: Speech synthesis for narration
- ElevenLabs rate limits: Consider your plan's character limits
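Missing or placeholder API settings are easiest to catch at startup. The sketch below reads the variable names from the .env template; `loadApiConfig` is a hypothetical helper, and falling back to `gpt-4-turbo-preview` when `OPENAI_MODEL` is unset is an assumption based on the template value, not confirmed mkdemo behavior.

```javascript
// Hypothetical startup check for the variables listed in the .env template.
function loadApiConfig(env = process.env) {
  const required = ['OPENAI_API_KEY', 'ELEVENLABS_API_KEY', 'ELEVENLABS_VOICE_ID'];

  // Treat unset, blank, and untouched template values ("your_..._here") as missing.
  const missing = required.filter((name) => {
    const value = (env[name] || '').trim();
    return value === '' || /^your_.*_here$/.test(value);
  });
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }

  return {
    openaiApiKey: env.OPENAI_API_KEY.trim(),
    // Assumed default, taken from the .env template.
    openaiModel: (env.OPENAI_MODEL || 'gpt-4-turbo-preview').trim(),
    elevenLabsApiKey: env.ELEVENLABS_API_KEY.trim(),
    elevenLabsVoiceId: env.ELEVENLABS_VOICE_ID.trim(),
  };
}
```

Reporting every missing variable in one error avoids fixing them one failed run at a time.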
- Browser Launch Fails
  - Ensure you have sufficient system resources
  - Try running with --headless=false for debugging
- Authentication Fails
  - Verify credentials are correct
  - Check if the website has CAPTCHA or 2FA
  - Review logs for specific error messages
- Video Processing Fails
  - Ensure FFmpeg is installed and in PATH
  - Check available disk space
  - Verify output directory permissions
- API Errors
  - Verify API keys are correct and active
  - Check API rate limits and quotas
  - Review network connectivity
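For the output directory check, a small script can confirm that the directory exists and is writable. This is a hedged sketch with a hypothetical helper name, independent of mkdemo's own filesystem utilities.

```javascript
// Hypothetical helper: creates the directory if needed and verifies write access.
const fs = require('node:fs');
const path = require('node:path');

function ensureWritableDir(dir) {
  fs.mkdirSync(dir, { recursive: true }); // no-op when the directory already exists
  fs.accessSync(dir, fs.constants.W_OK); // throws when the directory is not writable

  // Write and remove a probe file to catch read-only mounts that pass the access check.
  const probe = path.join(dir, `.write-test-${process.pid}`);
  fs.writeFileSync(probe, '');
  fs.unlinkSync(probe);
  return path.resolve(dir);
}
```

Running `ensureWritableDir('./output')` either returns the absolute path or throws an error naming the failing filesystem operation.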
Enable verbose logging for detailed troubleshooting:
mkdemo create --verbose --user <email> --password <password> --url <url>
- Fork the repository
- Create a feature branch
- Write tests for new functionality
- Ensure all tests pass
- Submit a pull request
MIT License - see LICENSE file for details.