Skip to content

amerob/VisionLane

Repository files navigation

VisionLane: Diagram Analysis Chatbot 🌐

A Streamlit-based chatbot frontend designed to analyze diagrams and flowcharts using advanced vision models. Supports both local Ollama models and Google's Gemini API for flexible deployment. 🚀

Overview 📖

VisionLane provides an intuitive interface for analyzing diagrams and flowcharts through a chat-based system. It leverages cutting-edge vision models to interpret visual data, offering seamless integration with both local and cloud-based models. 🖼️

Features ✨

  • Image Upload: Supports PNG and JPG formats for diagram analysis. 📤
  • Interactive Chat Interface: Clearly displays user and AI roles for smooth interaction. 💬
  • Persistent Chat Sessions: Maintains conversation history across sessions. 📜
  • Export Functionality: Export chat history as JSON for easy record-keeping. 💾
  • Minimalistic UI: Clean design with intuitive sidebar navigation. 🖥️
  • Model Flexibility: Supports local (Ollama) and cloud-based (Gemini) vision models. 🌍
  • Diagram Parsing: Automatically extracts structured JSON from diagrams. 📊

Requirements 🛠️

  • Python 3.8 or higher
  • Streamlit
  • Pillow
  • google-generativeai (for Gemini models)
  • Ollama (for local models)

Install dependencies using:

pip install -r requirements.txt

Setup Instructions 🔧

1. Local Model Setup (Ollama) 🖥️

  • Install Ollama from ollama.ai.
  • Pull the LLaMA 3.2 Vision model:
    ollama pull llama3.2-vision
    
  • Run the model in the background:
    ollama run llama3.2-vision
    

2. Online Model Setup (Gemini) ☁️

  • Obtain an API key from Google AI Studio.
  • Enable the "Use Online Models" option in VisionLane.
  • Configure your API key via the "Configure API Key" button.

3. Running VisionLane 🚀

Launch the Streamlit app with:

streamlit run app.py

Usage Instructions 📋

  1. Select the model (Ollama or Gemini) using the sidebar checkbox. ✅
  2. For Gemini, configure your API key as prompted. 🔑
  3. Upload a flowchart or diagram (PNG/JPG) using the file uploader. 🖼️
  4. Enter your question about the diagram in the chat input field. ❓
  5. Review the AI's response in the chat interface. 📝
  6. Continue the conversation or start a new session using sidebar controls. 🔄
  7. Export chat history as a JSON file for record-keeping. 💾
  8. Use the "Parse Diagram" button to generate structured JSON output from diagrams. 📈

Model Evaluation 📊

The repository includes a Jupyter notebook (model_evaluation.ipynb) for assessing vision model performance on diagram analysis. It provides:

  • Automated testing across multiple models. 🧪
  • Response time metrics. ⏱️
  • Visualization tools for performance comparison. 📉
  • Manual scoring framework for response quality. ✅

References 📚

Models

Additional Resources

About

A streamlit-based chatbot for analyzing diagrams and flowcharts, using Ollama (LLaMA 3.2 Vision, LLaVA) and Gemini 1.5 Flash models, supporting image upload, persistent chat with chat history retention, JSON export, and automated diagram parsing into structured json..

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors