VisionaryAI (formerly GeminiFusion)

VisionaryAI is a Streamlit web application that uses Gemini Flash, DALL-E 3, and Stable Diffusion XL to provide three features: chatbot interaction, image captioning, and text-to-image generation.

Features

  • ChatBot: Engage in real-time conversations with the AI, powered by the Gemini Flash model.
  • Image Captioning: Generate descriptive captions for your images using the Gemini Flash model.
  • Text to Image: Generate images using either DALL-E 3 or Stable Diffusion XL.

Installation

  1. Clone the repository:

    git clone https://github.com/Abhrankan-Chakrabarti/GeminiFusion.git
    cd GeminiFusion
  2. Create a virtual environment (optional but recommended):

    python -m venv venv
    source venv/bin/activate  # On Windows use `venv\Scripts\activate`
  3. Install dependencies:

    pip install -r requirements.txt
  4. Set up environment variables:

    • Create a .env file in the root directory.
    • Add your Google API key:
      api_key=YOUR_GOOGLE_API_KEY
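
To illustrate what step 4 sets up, here is a minimal sketch of reading the `api_key` entry from a `.env` file using only the Python standard library. The helper names `load_env_file` and `get_api_key` are hypothetical and are not taken from this repository; the application itself may load the key differently (for example through a dotenv package).

```python
from pathlib import Path


def load_env_file(path=".env"):
    """Parse KEY=VALUE lines from a .env-style file into a dict (hypothetical helper)."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and any surrounding quote characters.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def get_api_key(path=".env"):
    """Return the api_key entry, or raise a clear error if it is missing."""
    key = load_env_file(path).get("api_key", "")
    if not key or key == "YOUR_GOOGLE_API_KEY":
        raise RuntimeError("api_key is missing or still set to the placeholder")
    return key
```

Calling `get_api_key()` from the project root before launching the app is a quick way to confirm the `.env` file is in place and no longer holds the placeholder value.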
      

Usage

  1. Run the application:

    streamlit run app.py
  2. Choose a feature:

    • ChatBot: Navigate to the ChatBot section to start a conversation with the AI.
    • Image Captioning: Upload an image and enter a prompt to generate a caption.
    • Text to Image: Enter a text prompt to generate images using either DALL-E 3 or Stable Diffusion XL.
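
The feature-to-model pairing listed above can be summarized as a small lookup. This is only an illustrative sketch: `FEATURE_MODELS` and `resolve_model` are hypothetical names, and defaulting to the first listed model is an assumption of this example, not documented behavior of the app.

```python
# Feature-to-model pairing as described in this README.
FEATURE_MODELS = {
    "ChatBot": ["Gemini Flash"],
    "Image Captioning": ["Gemini Flash"],
    "Text to Image": ["DALL-E 3", "Stable Diffusion XL"],
}


def resolve_model(feature, choice=None):
    """Return the model for a feature, validating an optional user choice."""
    if feature not in FEATURE_MODELS:
        raise ValueError(f"Unknown feature: {feature!r}")
    models = FEATURE_MODELS[feature]
    if choice is None:
        # Assumption for this sketch: default to the first listed model.
        return models[0]
    if choice not in models:
        raise ValueError(f"{choice!r} is not available for {feature!r}")
    return choice
```

For example, `resolve_model("Text to Image", "Stable Diffusion XL")` returns the chosen model, while asking for DALL-E 3 under ChatBot raises a `ValueError`.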

Technology Stack

  • Python
  • Streamlit
  • Google Gemini Flash
  • DALL-E 3
  • Stable Diffusion XL

Contributing

We welcome contributions! Please see our contribution guidelines for more information.

License

This project is licensed under the MIT License. See the LICENSE file for details.