Translate your ideas β across text, images, and files β into high-quality, designer-grade prompts for any AI vibe-coding platform.
This tool helps users who work visually or conceptually β not just through words β create clearer, more effective prompts for AI design and coding workflows (e.g., V0, Magic Pattern, or similar vibe coding tools).
It automatically:
- πΈ Extracts key insights from images, UI screenshots, or files
- π Refines messy or incomplete text input
- π§© Merges all input into a structured, professional-quality prompt
- π‘ Optimizes for better AI output with fewer tokens
Not every designer, developer, or maker is a prompt expert. Many users express their intent through images, sketches, Figma files, or rough notes β but today's AI tools often require precise, text-only input to perform well.
This project bridges that gap.
While we're not yet doing full context engineering, this project is a step in that direction. We aim to:
Make it easy for anyone to express intent in a multimodal way β and automatically translate that into agent-friendly context.
This aligns with emerging trends in AI research around prompt structuring, context-aware agents, and efficient multimodal design workflows.
- π― Context-Aware Analysis Profiles: Specialized workflows for different vibe-coding scenarios
- π Modular Profile System: Create, edit, and manage custom analysis profiles as JSON files
- π Dynamic Prompt Generation: Automatically structure insights into AI-ready prompts
- π Professional Report Templates: Generate structured outputs optimized for downstream AI tools
- π οΈ Interactive Workflow: User-friendly menu system for multimodal input processing
- π Multiple Input Types: Support for local files, URLs, and various media formats
- π§© Template Customization: Each profile can define its own prompt structure and analysis steps
-
Install Dependencies:
pip install -r requirements.txt
-
Set Up API Key:
# Create .env file echo "OPENAI_API_KEY=your_key_here" > .env
-
Start Enhancing Your Prompts:
# Quick demo with UI screenshot analysis python quick_start.py # Full multimodal prompt enrichment workflow python vibe_mind.py # Direct context extraction (for developers) python chat_agent_with_image_analysis.py
-
Create Custom Workflows:
- Run
python vibe_mind.py - Choose "Manage Profiles" to create vibe-coding specific workflows
- Customize analysis steps for your target AI platform (V0, Magic Pattern, etc.)
- Generate optimized prompts for better AI output
- Run
from chat_agent_with_image_analysis import ImageAnalysisChatAgent
agent = ImageAnalysisChatAgent()
result = agent.analyze_image("image.jpg", "What's in this image?")
print(result)from vibe_mind import StructuredImageAnalyzer
# Initialize the multimodal context enhancer
enhancer = StructuredImageAnalyzer()
# Interactive workflow for vibe-coding platforms
# - Process images, files, and text input
# - Select target AI platform (V0, Magic Pattern, etc.)
# - Generate optimized, structured prompts# Create a V0.dev-optimized workflow
v0_workflow = {
'name': 'V0 Component Generator',
'target_platform': 'v0.dev',
'description': 'Convert UI concepts to V0-ready component prompts',
'extraction_steps': [
'Identify component structure and hierarchy',
'Extract design tokens and styling patterns',
'Map interactive elements and state management',
'Determine responsive behavior and props'
],
'prompt_template': '''Create a React component with the following specifications:
## Component Structure
{structure_analysis}
## Styling & Design
{styling_analysis}
## Functionality
{interaction_analysis}
## Props & State
{props_analysis}''',
'optimization_rules': {
'max_tokens': 1000,
'focus_areas': ['component_architecture', 'styling_specificity', 'responsive_design']
}
}
enhancer.create_profile('V0 Component Generator', v0_workflow)
# Process multimodal input β AI-ready prompt
result = enhancer.analyze_with_profile(
image_url='ui_screenshot.png',
profile=v0_workflow,
custom_text='Create a dashboard card component with hover effects'
)Vibe Mind uses OpenAI's GPT-4O Mini for multimodal analysis. Configure your API access:
# .env file
OPENAI_API_KEY=your_openai_api_key_here
# Optional: Customize for your preferred AI platform
TARGET_PLATFORM=v0.dev # or magic_pattern, lovable, etc.
MAX_PROMPT_TOKENS=1000
OPTIMIZATION_LEVEL=balanced # or speed, qualityCreate workflows optimized for your specific AI vibe-coding platform:
- Target Platform Setup: Define your AI tool (V0, Magic Pattern, Lovable, etc.)
- Context Extraction Rules: Specify what insights to extract from multimodal input
- Prompt Structure: Design the optimal format for your target AI
- Token Optimization: Configure efficiency rules for better performance
Each workflow profile contains:
- Target Platform: Specific AI tool or platform (V0, Magic Pattern, etc.)
- Extraction Steps: Multimodal analysis pipeline
- Prompt Template: AI platform-optimized format
- Optimization Rules: Token efficiency and clarity guidelines
- Output Format: Structured, AI-ready prompt
// V0.dev Example
{
"name": "V0 Component Builder",
"target_platform": "v0.dev",
"description": "Convert UI screenshots to V0-optimized component prompts",
"extraction_steps": [
"Identify component structure and hierarchy",
"Extract styling and design tokens",
"Determine interactive elements and states",
"Map data flow and props"
],
"prompt_template": "Create a React component...{structured_analysis}",
"optimization_rules": {
"max_tokens": 1000,
"focus_areas": ["functionality", "styling", "responsiveness"]
}
}
// Magic Pattern Example
{
"name": "Magic Pattern UI Generator",
"target_platform": "magic_pattern",
"description": "Convert design concepts to Magic Pattern-optimized prompts",
"extraction_steps": [
"Analyze visual hierarchy and layout patterns",
"Extract color schemes and typography",
"Identify interactive components and behaviors",
"Map responsive design requirements"
],
"prompt_template": "Generate a UI pattern with these specifications...{structured_analysis}",
"optimization_rules": {
"max_tokens": 800,
"focus_areas": ["design_patterns", "visual_consistency", "user_experience"]
}
}- π API for Programmatic Access: Integrate multimodal prompt enrichment into your existing workflows
- π― V0.dev Integration Demo: Direct integration with popular vibe-coding platforms
- π€ Plugin Support: Drag-and-drop for image/file upload and processing
- π§ Advanced Context Models: More sophisticated visual insight extraction
Have ideas? Want to collaborate or experiment?
Open an issue or submit a pull request β we're open to co-building.
# Input: Dashboard screenshot + "Create a data visualization component"
# Output: Optimized V0 prompt with component structure, styling, and props# Input: Figma export + interaction notes
# Output: Structured prompt for responsive component generation# Input: Hand-drawn wireframe + feature requirements
# Output: Detailed technical prompt for AI coding assistant# Input: App concept sketches + user flow descriptions
# Output: Structured prompt for full-stack app generationBuilt with CAMEL Framework β’ Ready for Demo
