
VT.ai

Minimal multimodal AI chat app with dynamic conversation routing


🚀 Features

Multi-Provider AI Orchestration

✅ Supported Model Providers:

  • DeepSeek
  • OpenAI
  • Anthropic
  • Google Gemini
  • Local Models via Ollama (Llama3, Phi-3, Mistral, etc.)
  • Cohere
  • OpenRouter

✨ Core Capabilities:

  • Dynamic conversation routing with SemanticRouter
  • Multi-modal interactions (Text/Image/Audio)
  • Assistant framework with code interpretation
  • Real-time response streaming
  • Cross-provider model switching (sketched below)
  • Local model support with Ollama integration
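
Cross-provider switching is possible because every backend is addressed through one completion interface and selected by model name. Below is a minimal sketch of that idea using LiteLLM as the unified client; this is an assumption for illustration, not necessarily how VT.ai wires its providers, and the model names are only examples.

import litellm  # uv pip install litellm

def ask(model: str, prompt: str) -> str:
    """Send the same prompt to any provider by changing only the model name."""
    response = litellm.completion(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Same call shape, different providers (needs the matching API keys / a running Ollama)
print(ask("gpt-4o", "Summarize semantic routing in one sentence."))
print(ask("claude-3-5-sonnet-20240620", "Summarize semantic routing in one sentence."))
print(ask("ollama/llama3", "Summarize semantic routing in one sentence."))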

📦 Installation

Prerequisites

  • Python 3.11+ (specified in .python-version)
  • uv (for managing virtual environments and dependencies)
  • Ollama (for local models, optional)

Install uv if not already installed

pip install uv

Clone the repository

git clone https://github.com/vinhnx/VT.ai.git
cd VT.ai

Create a virtual environment, install dependencies, and copy the environment template

uv venv
source .venv/bin/activate  # Linux/Mac
# .venv\Scripts\activate   # Windows
uv pip install -r requirements.txt
cp .env.example .env

Note: uv provides a fast and efficient way to manage Python environments and dependencies. Activate the virtual environment before running the app.

🔧 Configuration

Populate the .env file with your API keys. Depending on the models and features you want to use, set the corresponding keys. For example:

  • OPENAI_API_KEY: Required for OpenAI models, assistant mode, TTS, and image generation.
  • GEMINI_API_KEY: Required for Google Gemini models and vision capabilities.
  • ANTHROPIC_API_KEY: Required for Anthropic models.
  • GROQ_API_KEY: Required for Groq models.
  • COHERE_API_KEY: Required for Cohere models.
  • OPENROUTER_API_KEY: Required for OpenRouter models.
  • MISTRAL_API_KEY: Required for Mistral models.
  • HUGGINGFACE_API_KEY: Required for Hugging Face models (if applicable).

Refer to .env.example for the full list. Example:

OPENAI_API_KEY=sk-your-key
GEMINI_API_KEY=your-gemini-key
COHERE_API_KEY=your-cohere-key
ANTHROPIC_API_KEY=your-claude-key
HUGGINGFACE_API_KEY=your-huggingface-key
GROQ_API_KEY=your-groq-key
OPENROUTER_API_KEY=your-openrouter-key
MISTRAL_API_KEY=your-mistral-key

# For local models via Ollama
OLLAMA_HOST=http://localhost:11434

Note: If OPENAI_API_KEY or GEMINI_API_KEY is missing, the app will prompt you to enter them at startup.
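
The keys are ordinary environment variables, so you can sanity-check your .env before launching the app. The snippet below is a small, optional check script assuming python-dotenv; the app itself may load configuration differently.

import os
from dotenv import load_dotenv  # uv pip install python-dotenv

load_dotenv()  # reads .env from the current directory

# Report which provider keys are present without printing their values
for key in ("OPENAI_API_KEY", "GEMINI_API_KEY", "ANTHROPIC_API_KEY", "OLLAMA_HOST"):
    print(f"{key}: {'set' if os.getenv(key) else 'missing'}")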

🖥️ Usage

Start Application

Activate the virtual environment

source .venv/bin/activate  # Linux/Mac
# .venv\Scripts\activate     # Windows

(Optional) Train semantic router for customization

Requires OPENAI_API_KEY

python src/router/trainer.py

Launch interface

chainlit run src/app.py -w

Note: Training the semantic router is optional. Use the provided layers.json for default routing or train your own for better performance/customization. Training requires an OpenAI API key.
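
For intuition, semantic routing embeds the incoming message and compares it against embeddings of example utterances for each route, then dispatches to the best match. The sketch below illustrates that idea with the OpenAI embeddings API; it is not the project's trainer, and the route names and example phrases are invented for illustration.

import numpy as np
from openai import OpenAI  # requires OPENAI_API_KEY in the environment

client = OpenAI()

# Hypothetical routes, each described by a few example utterances
ROUTES = {
    "image-generation": ["draw a cat", "generate a logo", "make an illustration"],
    "casual-chat": ["how are you", "tell me a joke", "what's the weather like"],
}

def embed(texts):
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

# "Training": pre-compute one centroid embedding per route
centroids = {name: embed(examples).mean(axis=0) for name, examples in ROUTES.items()}

def route(message: str) -> str:
    """Return the route whose centroid is most similar to the message embedding."""
    v = embed([message])[0]
    scores = {
        name: float(np.dot(v, c) / (np.linalg.norm(v) * np.linalg.norm(c)))
        for name, c in centroids.items()
    }
    return max(scores, key=scores.get)

print(route("please sketch a futuristic city"))  # -> image-generation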

Key Commands

Shortcut   Action
Ctrl+/     Switch model provider
Ctrl+,     Open settings
Ctrl+L     Clear conversation history

🧩 Chat Profiles

Standard Chat Mode

  • Multi-LLM conversations
  • Dynamic model switching
  • Image generation & analysis
  • Audio transcription

Assistant Mode (Beta)

Example assistant capabilities

# MinoAssistant is implemented under src/assistants/ (exact import path may differ)
from assistants.mino import MinoAssistant

async def solve_math_problem(problem: str):
    assistant = MinoAssistant()
    return await assistant.solve(problem)

  • Code interpreter for complex calculations
  • File attachments (PDF/CSV/Images)
  • Persistent conversation threads
  • Custom tool integrations
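
Because solve_math_problem above is an async coroutine, calling it from a plain script looks roughly like this (illustrative only):

import asyncio

result = asyncio.run(solve_math_problem("Integrate x^2 from 0 to 3"))
print(result)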

πŸ—οΈ Project Structure

VT.ai/
β”œβ”€β”€ src/
β”‚   β”œβ”€β”€ assistants/       # Custom AI assistant implementations
β”‚   β”œβ”€β”€ router/           # Semantic routing configuration
β”‚   β”œβ”€β”€ utils/            # Helper functions & configs
β”‚   └── app.py            # Main application entrypoint
β”œβ”€β”€ public/               # Static assets
β”œβ”€β”€ requirements.txt      # Python dependencies
└── .env.example          # Environment template

🌐 Supported Models

Category    Models
Chat        GPT-4o, Claude 3.5, Gemini 1.5, Llama3-70B, Mixtral 8x7B
Vision      GPT-4o, Gemini 1.5 Pro, Llama3.2 Vision
Image Gen   DALL-E 3
TTS         OpenAI TTS-1, TTS-1-HD
Local       Llama3, Phi-3, Mistral, Deepseek R1 series
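
The local models in the table are served through Ollama. As a rough illustration, assuming the official ollama Python client (VT.ai's own integration may instead go through its provider layer), a locally pulled model can be queried like this:

import ollama  # uv pip install ollama; requires a running Ollama server

# Query a model that has been pulled locally (e.g. `ollama pull llama3` beforehand)
response = ollama.chat(
    model="llama3",
    messages=[{"role": "user", "content": "Explain mixture-of-experts briefly."}],
)
print(response["message"]["content"])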

🤝 Contributing

Development Setup

# Activate the virtual environment
source .venv/bin/activate  # Linux/Mac
# .venv\Scripts\activate   # Windows

# Install development tools (e.g., pytest for testing, black for formatting)
uv pip install pytest black

# Run tests (once tests are added)
pytest tests/

# Format code
black .

Contribution Guidelines

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Add type hints for new functions
  4. Update documentation
  5. Open a Pull Request

Note: Development dependencies (e.g., pytest, black) are not currently specified in a separate file. Install them manually with uv pip install as needed.

📄 License

MIT License - See LICENSE for full text.

🌟 Acknowledgements