The Vector Search Agent is an enterprise-grade conversational AI platform designed to transform how organizations access and utilize technical automotive documentation. Built for automotive manufacturers, dealership networks, and technical service centers, this solution enables employees at all levels to retrieve precise technical information through natural language queries.
The Business Problem: Automotive technical documentation is vast, complex, and constantly evolving. Service technicians, support staff, and engineers spend significant time searching through manuals, diagnostic codes, and technical specifications. Traditional keyword-based search often fails to understand context, leading to frustration and inefficiency.
Our Solution: This platform uses advanced AI to understand the intent behind questions, search through technical documents intelligently, and provide accurate answers with traceable sources. When a technician asks "What is the torque specification for the cylinder head bolts on a Fiat Linea?", the system understands the technical context, retrieves relevant documentation, and provides a precise answer with source citations.
Key Business Benefits:
- Reduced Resolution Time: Technical staff find answers in seconds rather than minutes or hours
- Improved Accuracy: AI-powered responses include source citations, reducing errors from misinterpreted documentation
- Knowledge Preservation: Institutional knowledge becomes searchable and accessible to all team members
- Scalable Support: Handle increased query volumes without proportional staffing increases
- Audit Trail: All responses are traceable to source documents for compliance and quality assurance
Intended Users:
- Technical Service Staff: Mechanics and technicians seeking repair procedures and specifications
- Support Centers: Call center agents helping customers with vehicle issues
- Training Departments: Creating and validating training materials
- Quality Assurance Teams: Verifying technical information accuracy
Table of Contents:
- Technical Overview
- System Architecture
- Agent Workflow
- Installation and Setup
- Application Usage
- Container Management
- Development Guide
- Monitoring and Diagnostics
- Security Considerations
- Troubleshooting
- Performance Optimization
- Contributing
- License and Acknowledgments
This section provides the technical foundation for IT teams and developers who will implement and maintain the system.
| Capability | Description |
|---|---|
| Specialized Agent | LangGraph-based system with specialized processing nodes for planning, retrieval, evaluation, and formatting |
| Automotive Domain | Knowledge base covering Fiat Linea vehicles, DTC diagnostic codes, technical procedures, and specifications |
| Intelligent Vector Search | Document processing (PDF and JSON) with semantic embedding techniques |
| Scope Validation | Automatic validation ensuring questions fall within the automotive knowledge domain |
| Real-time Streaming | Server-Sent Events (SSE) interface showing agent progress at each processing step |
| Traceable Sources | All responses include detailed references to consulted documents |
AI and Machine Learning:
- Google Gemini (gemini-2.0-flash-exp) for language model processing
- Google Gemini Embeddings (768 dimensions) for semantic document search
- LangChain framework for LLM application integration
- LangGraph for agent workflow orchestration
Backend Infrastructure:
- FastAPI asynchronous web framework with automatic OpenAPI documentation
- ChromaDB vector database for embeddings and metadata storage
- Pydantic for data validation and schema management
- Python 3.12.0 runtime environment
Document Processing:
- PyPDF for PDF text extraction
- JSON loaders for structured data
- Text splitters with configurable chunking and overlap
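The chunking-with-overlap idea can be sketched in a few lines of plain Python. This is an illustration only (the project relies on LangChain's text splitters); the `chunk_size` and `overlap` values here are arbitrary.

```python
def chunk_text(text: str, chunk_size: int = 100, overlap: int = 20) -> list[str]:
    """Split text into fixed-size chunks whose edges overlap."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap  # how far each window advances
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

chunks = chunk_text("A" * 250, chunk_size=100, overlap=20)
print(len(chunks))  # 3 chunks: positions 0-100, 80-180, 160-250
```

Overlap preserves context that would otherwise be cut at chunk boundaries, at the cost of some storage duplication.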
Frontend and Interface:
- Chainlit conversational interface
- Server-Sent Events for real-time streaming
- Command-based knowledge management
Infrastructure:
- Docker Compose for container orchestration
- Persistent volume storage for ChromaDB
- Health check monitoring
- Hot reload for development
The system uses a containerized microservices architecture designed for reliability, scalability, and ease of deployment.
The application runs in two primary Docker containers that communicate over an internal network.
Container 1: FastAPI Backend (Port 8000)
This container houses the core application logic including the REST API, AI agent, and vector database.
- Complete REST API with endpoints for queries, file management, and knowledge base operations
- LangGraph-based agent with structured workflow nodes
- Integrated ChromaDB for persistent vector storage
- Google Gemini API integration for language model and embeddings
- Logging, metrics collection, and performance monitoring
Container 2: Chainlit Frontend (Port 8001)
This container provides the user-facing conversational interface.
- Interactive chat with real-time response streaming
- Visual progress tracking of agent processing steps
- Knowledge base management through chat commands
- HTTP communication with the FastAPI backend
The agent uses a graph-based workflow where specialized nodes handle different aspects of query processing.
```mermaid
graph TD
    A[User Question] --> B[Scope Validation]
    B -->|Out of Scope| C[Guidance Response]
    B -->|In Scope| D[Planner Node]
    D --> E[Retriever Node]
    E --> F[Joiner Node]
    F -->|Insufficient| D
    F -->|Sufficient| G[Output Formatter]
    G --> H[Final Response + Sources]
    subgraph "Automotive Specialization"
        I[Automotive Knowledge]
        J[Technical Specs]
        K[Vehicle Database]
    end
    D -.-> I
    E -.-> J
    G -.-> K
```
The following diagram illustrates how data moves through the system layers.
```mermaid
graph LR
    subgraph "Frontend Layer"
        A[Chainlit UI] --> C[Chat Interface]
        A --> D[Command Interface]
    end
    subgraph "API Layer"
        E[FastAPI Endpoints] --> F[Authentication]
        E --> G[Validation]
        E --> H[Streaming SSE]
    end
    subgraph "Agent Layer"
        I[LangGraph Workflow] --> J[Scope Validation]
        I --> K[Knowledge Retrieval]
        I --> L[Response Generation]
    end
    subgraph "Data Layer"
        M[ChromaDB] --> N[Vector Embeddings]
        M --> O[Document Metadata]
        P[Knowledge Files] --> Q[PDF Documents]
        P --> R[JSON Solutions]
    end
    subgraph "AI Services"
        S[Google Gemini] --> T[LLM Processing]
        S --> U[Embeddings Generation]
    end
    A --> E
    E --> I
    I --> M
    I --> S
    P --> M
```
```text
vector-agent-search/
├── app/                          # FastAPI Main Application
│   ├── agents/                   # LangGraph Agent System
│   │   ├── agent.py              # Main agent class and workflow
│   │   └── nodes.py              # Specialized nodes (Planner, Retriever, etc.)
│   ├── api/                      # REST APIs and Endpoints
│   │   └── endpoints.py          # Complete routes (query, upload, knowledge)
│   ├── core/                     # Core Configurations
│   │   └── config.py             # Application and environment settings
│   ├── models/                   # Models and Schemas
│   │   ├── schemas.py            # Pydantic models (State, Request/Response)
│   │   ├── prompts.py            # Specialized prompt templates
│   │   ├── automotive_knowledge.py # Automotive knowledge base
│   │   └── scope_prompts.py      # Scope validation prompts
│   ├── services/                 # Core Application Services
│   │   ├── llm_service.py        # Google Gemini (LLM + Embeddings)
│   │   ├── vector_service.py     # ChromaDB and vector operations
│   │   ├── scope_service.py      # Question scope validation
│   │   └── intelligent_scope_service.py # LLM-based intelligent validation
│   ├── evaluation/               # Evaluation System
│   │   ├── metrics.py            # Quality metrics
│   │   ├── response_evaluator.py # Response evaluation
│   │   └── retrieval_evaluator.py # Retrieval evaluation
│   ├── monitoring/               # Monitoring and Observability
│   │   ├── logger.py             # Structured logging system
│   │   ├── metrics_tracker.py    # Metrics tracking
│   │   └── performance_monitor.py # Performance monitoring
│   ├── utils/                    # Utilities and Helpers
│   │   ├── file_utils.py         # File manipulation and extraction
│   │   ├── response_utils.py     # Response formatting
│   │   └── prompt_enhancer.py    # Prompt improvement
│   ├── config/                   # Data Configurations
│   │   └── escopo.json           # Automotive scope definitions
│   └── main.py                   # FastAPI main application
├── frontend/                     # Chainlit Interface
│   └── app.py                    # Chainlit application with streaming
├── docker/                       # Docker Configurations
│   ├── Dockerfile.fastapi        # FastAPI + Agent container
│   └── Dockerfile.chainlit       # Chainlit Frontend container
├── scripts/                      # Automation Scripts
│   ├── setup_gemini.md           # Google Gemini configuration guide
│   ├── reset_chromadb.bat        # ChromaDB reset (Windows)
│   ├── reset_chromadb.sh         # ChromaDB reset (Linux/Mac)
│   └── chainlit_autoreload.py    # Development auto-reload
├── tests/                        # Automated Tests
│   ├── test_agent.py             # Main agent tests
│   ├── test_api.py               # API tests
│   ├── test_vector_service.py    # Vector service tests
│   ├── test_automotive_cases.py  # Automotive test cases
│   ├── test_evaluation.py        # Evaluation system tests
│   ├── test_monitoring.py        # Monitoring tests
│   ├── conftest.py               # Pytest configurations
│   └── run_full_validation.py    # Complete system validation
├── knowledge/                    # Knowledge Base
│   ├── files_id/                 # Technical PDF documents
│   └── direct_hits_id/           # Direct solutions (JSON)
├── Guides/                       # Usage Guides
│   ├── DOCKER_TESTING_GUIDE.md   # Docker testing guide
│   └── RESET_CHROMADB_GUIDE.md   # Vector database reset guide
├── docker-compose.yml            # Complete container orchestration
├── requirements.txt              # FastAPI dependencies
├── requirements.chainlit.txt     # Chainlit dependencies
├── start.bat / start.sh          # Startup scripts
└── README.md                     # Main documentation
```
The agent workflow represents the intelligent decision-making process that transforms user questions into accurate, sourced responses. This section explains how the system processes queries.
When a user submits a question, it passes through a series of specialized processing nodes, each designed to handle a specific aspect of the query.
```mermaid
graph TD
    A[User Question] --> B[Scope Validator]
    B -->|Out of Scope| C[Guidance Response]
    B -->|In Scope| D[Planner - Strategy]
    D --> E[Retriever - Vector Search]
    E --> F[Joiner - Evaluation]
    F -->|Insufficient| D
    F -->|Sufficient| G[Output Formatter]
    G --> H[Final Response + Sources]
    subgraph "Automotive Specialization"
        I[Automotive Knowledge]
        J[Technical Specs DB]
        K[DTC Codes]
        L[Vehicle Context]
    end
    D -.-> I
    E -.-> J
    F -.-> K
    G -.-> L
```
Scope Validator
The Scope Validator determines whether a question falls within the automotive domain that the system is designed to answer.
Technical Details:
- Uses LLM combined with a scope knowledge base for validation
- Outputs include: `in_scope` (boolean), `category` (topic area), `confidence` (certainty level), `reason` (explanation)
- Specialized recognition for Fiat Linea context, DTC codes, and technical procedures
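The validator's output contract can be pictured with a small dataclass and a toy keyword check standing in for the real LLM call. The term list and scoring below are invented purely for illustration; they are not the project's actual logic.

```python
from dataclasses import dataclass

@dataclass
class ScopeResult:
    in_scope: bool
    category: str
    confidence: float
    reason: str

# Invented term list; the real system consults an LLM plus a scope knowledge base.
AUTOMOTIVE_TERMS = {"torque", "dtc", "engine", "fiat", "linea", "cylinder", "injector"}

def validate_scope(question: str) -> ScopeResult:
    """Toy keyword-based stand-in for the LLM-backed scope check."""
    words = {w.strip("?.,!").lower() for w in question.split()}
    hits = sorted(words & AUTOMOTIVE_TERMS)
    if hits:
        return ScopeResult(True, "automotive", min(1.0, 0.5 + 0.2 * len(hits)),
                           f"matched terms: {hits}")
    return ScopeResult(False, "out_of_domain", 0.9, "no automotive terms found")

print(validate_scope("What is the torque for the cylinder head bolts?"))
```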
Planner
The Planner analyzes validated questions and creates an optimized search strategy.
Technical Details:
- Identifies relevant automotive systems (engine, electrical, fuel)
- Optimizes search by document type (procedure, specification, diagnostic)
- Outputs a structured plan with specific technical terms for retrieval
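The shape of such a plan can be sketched with a toy keyword-to-system mapping. The real Planner delegates this analysis to an LLM; the keyword table and field names below are illustrative assumptions.

```python
# Illustrative keyword table only; the real Planner uses an LLM.
SYSTEM_KEYWORDS = {
    "engine": ["torque", "cylinder", "piston", "oil"],
    "electrical": ["battery", "fuse", "relay", "sensor"],
    "fuel": ["injector", "pump", "fuel pressure"],
}

def build_search_plan(question: str) -> dict:
    """Derive target systems, search terms, and preferred document types."""
    q = question.lower()
    systems = [name for name, kws in SYSTEM_KEYWORDS.items()
               if any(kw in q for kw in kws)]
    doc_types = ["specification"] if any(t in q for t in ("torque", "specification")) \
        else ["procedure"]
    return {
        "systems": systems or ["general"],
        "search_terms": [w.strip("?.,") for w in q.split() if len(w) > 3],
        "doc_types": doc_types,
    }

plan = build_search_plan("What is the torque specification for the cylinder head bolts?")
print(plan["systems"], plan["doc_types"])  # ['engine'] ['specification']
```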
Retriever
The Retriever performs semantic search across the vectorized document database.
Technical Details:
- Uses ChromaDB with Google Gemini Embeddings (768 dimensions)
- Applies specialized filters by document type and automotive relevance
- Implements diversification to avoid redundant chunks from the same source
- Extracts metadata including technical specifications (torques, pressures, codes)
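The diversification step might look like the following sketch. The function name, chunk shape, and `max_per_source` limit are assumptions for illustration, not the project's actual code.

```python
from collections import defaultdict

def diversify(results: list[dict], max_per_source: int = 2) -> list[dict]:
    """Keep at most max_per_source chunks per document, preserving rank order."""
    counts: dict[str, int] = defaultdict(int)
    kept = []
    for chunk in results:  # assumed already sorted by similarity score
        if counts[chunk["source"]] < max_per_source:
            kept.append(chunk)
            counts[chunk["source"]] += 1
    return kept

hits = [
    {"source": "manual.pdf", "score": 0.91},
    {"source": "manual.pdf", "score": 0.89},
    {"source": "manual.pdf", "score": 0.85},
    {"source": "dtc_codes.json", "score": 0.80},
]
print([h["source"] for h in diversify(hits)])
# ['manual.pdf', 'manual.pdf', 'dtc_codes.json']
```

Capping chunks per source trades a little raw similarity for broader coverage of the corpus.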
Joiner
The Joiner evaluates whether the retrieved documents contain sufficient information for a complete response.
Technical Details:
- Uses AI to determine if more information or replanning is needed
- Implements a confidence scoring system for response quality
- Includes loop prevention logic to avoid infinite retrieval cycles
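The loop-prevention idea reduces to a simple predicate: replan only while results are weak and an iteration budget remains. The threshold and budget values below are illustrative, not the system's configured defaults.

```python
MAX_ITERATIONS = 3         # illustrative budget, not the real configured value
CONFIDENCE_THRESHOLD = 0.7

def should_replan(confidence: float, iteration: int) -> bool:
    """Replan only while results are weak AND the iteration budget remains."""
    return confidence < CONFIDENCE_THRESHOLD and iteration < MAX_ITERATIONS

print(should_replan(0.4, 1), should_replan(0.9, 1), should_replan(0.4, 3))
# True False False
```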
Output Formatter
The Output Formatter assembles the final response with automotive expertise and source citations.
Technical Details:
- Formats responses with proper technical structure (procedures, specifications, diagnostics)
- Creates traceable source list with file names and relevant excerpts
- Includes relevant technical metadata
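As a rough illustration of assembling an answer with a traceable source list (the `file` and `excerpt` field names are assumed for the example, not taken from the project's schemas):

```python
def format_response(answer: str, sources: list[dict]) -> str:
    """Append a numbered, traceable source list to the answer text."""
    lines = [answer, "", "Sources:"]
    for i, src in enumerate(sources, start=1):
        lines.append(f'{i}. {src["file"]}: "{src["excerpt"]}"')
    return "\n".join(lines)

print(format_response(
    "Tighten the cylinder head bolts to 40 Nm.",
    [{"file": "manual.pdf", "excerpt": "head bolt torque: 40 Nm"}],
))
```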
The system provides real-time feedback on processing progress through Server-Sent Events.
```mermaid
sequenceDiagram
    participant U as User
    participant C as Chainlit
    participant A as Agent
    participant L as LLM
    participant V as VectorDB
    U->>C: Question
    C->>A: Stream Request
    A->>C: Validating scope...
    A->>L: Scope validation
    A->>C: Creating plan...
    A->>L: Planning
    A->>C: Searching documents...
    A->>V: Vector search
    A->>C: Evaluating results...
    A->>L: Evaluation
    A->>C: Formatting response...
    A->>L: Final formatting
    A->>C: Complete response
    C->>U: Final result
```
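On the wire, each progress update is delivered as one Server-Sent Events message. A minimal sketch of the framing (the `step`/`message` payload shape is an assumption about this project's events):

```python
import json

def sse_event(step: str, message: str) -> str:
    """Frame one progress update in Server-Sent Events wire format."""
    payload = json.dumps({"step": step, "message": message})
    return f"data: {payload}\n\n"  # blank line terminates the event

print(sse_event("planner", "Creating plan..."), end="")
# data: {"step": "planner", "message": "Creating plan..."}
```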
The system collects detailed metrics for each query to enable performance analysis and quality assurance.
Example metrics collected:
```json
{
  "query_type": "automotive",
  "processing_time": "2.3s",
  "documents_retrieved": 6,
  "confidence_level": "high",
  "automotive_terms_matched": 8,
  "technical_specs_found": true,
  "sources_with_downloads": 4
}
```

This section provides step-by-step instructions for deploying the Vector Search Agent.
Before installation, ensure the following are available:
- Docker and Docker Compose
- Git
- Python 3.12.0 (for local development only)
- Google Gemini API key
```bash
git clone <repository-url>
cd vector-agent-search
```

Copy the example configuration file and customize as needed:

```bash
cp .env.example .env
```

Edit the `.env` file to set your Google Gemini API key and other configuration options. Default settings work for the Docker environment.
Obtain an API key from Google AI Studio:
- Visit https://aistudio.google.com/app/apikey
- Create a new API key
- Add the key to your `.env` file:

```bash
echo "GOOGLE_API_KEY=your_api_key_here" >> .env
```

- Verify configuration:

```bash
python tests/check_gemini_config.py
```

```bash
# Start all services
docker-compose up -d

# Check logs to verify startup
docker-compose logs -f
```

After containers are running, load the knowledge base:
```bash
curl -X POST "http://localhost:8000/api/v1/knowledge/load"
```

This application uses Google Gemini embeddings with 768 dimensions for optimal vector quality.
If migrating from a previous embedding configuration (such as 384 dimensions), reset ChromaDB:
```bash
# Stop containers and reset volume
docker-compose down
docker volume rm vector-agent-search_chroma_data
docker-compose up -d

# Or use the migration script
docker exec -it vector-search-fastapi bash -c "cd /app && python reset_for_gemini_embeddings.py"
```

All tests should be executed inside Docker containers to ensure consistency with the production environment.
Automated Testing (Recommended):
```bash
# Windows
run_tests_docker.bat

# Linux/Mac
./run_tests_docker.sh
```

Manual Testing:
```bash
# Test Gemini embeddings
docker exec -it vector-search-fastapi bash -c "cd /app && python test_gemini_embeddings.py"

# Check configuration
docker exec -it vector-search-fastapi bash -c "cd /app && python tests/check_gemini_config.py"

# Run all tests
docker exec -it vector-search-fastapi bash -c "cd /app && python -m pytest tests/ -v"
```

This section covers day-to-day usage of the Vector Search Agent through both the web interface and REST API.
The web interface provides an intuitive chat-based experience for interacting with the knowledge base.
Access the interface at: http://localhost:8001
Loading Documents:
- Use the `/carregar` command to load the existing knowledge base
- Documents can be added via API endpoints or by placing files in the knowledge directory
Asking Questions:
- Type questions about the loaded documents in natural language
- The agent processes queries in real time with visible progress indicators
- View the search plan, retrieved documents, and final response
The REST API enables programmatic integration with other systems and applications.
Access API documentation at: http://localhost:8000/docs
Simple Query:
```bash
curl -X POST "http://localhost:8000/api/v1/query" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "What is the main content of the documents?",
    "use_confidence": true
  }'
```

Streaming Query:
```bash
curl -N -X POST "http://localhost:8000/api/v1/query/stream" \
  -H "Content-Type: application/json" \
  -H "Accept: text/event-stream" \
  -d '{
    "question": "Explain the main points of the documents",
    "use_confidence": true
  }'
```

File Upload:
```bash
curl -X POST "http://localhost:8000/api/v1/files/upload" \
  -F "files=@document.pdf" \
  -F "files=@data.json"
```

Scope Validation:
```bash
curl -X POST "http://localhost:8000/api/v1/scope/check" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "Question about the documents"
  }'
```

Knowledge Base Operations:
```bash
# Load knowledge base
curl -X POST "http://localhost:8000/api/v1/knowledge/load"

# Vectorize documents
curl -X POST "http://localhost:8000/api/v1/knowledge/vectorize"

# Check available files
curl http://localhost:8000/api/v1/files/knowledge

# Reset ChromaDB
curl -X DELETE "http://localhost:8000/api/v1/knowledge/reset"
```

Health Check:
```bash
curl http://localhost:8000/health
```

This section provides commands for managing the Docker containers that run the application.
```bash
# Start all services
docker-compose up -d

# Stop all services
docker-compose down

# Restart all containers
docker-compose restart

# Check container status
docker-compose ps

# View logs from all services
docker-compose logs -f
```

```bash
# Complete rebuild with clean cache
docker-compose down
docker-compose build --no-cache --pull
docker-compose up -d

# Rebuild specific service
docker-compose up -d --build --force-recreate fastapi-agent

# Rebuild frontend only
docker-compose up -d --build --force-recreate chainlit-frontend

# Restart without rebuild
docker-compose restart fastapi-agent
docker-compose restart chainlit-frontend
```

```bash
# Access main container shell
docker exec -it vector-search-fastapi bash

# Access frontend shell
docker exec -it vector-search-chainlit bash

# View specific service logs
docker-compose logs -f fastapi-agent
docker-compose logs -f chainlit-frontend

# Monitor resource usage
docker stats

# Execute commands in container
docker exec -it vector-search-fastapi python -c "print('Hello from container')"
```

```bash
# Basic cleanup (removes stopped containers)
docker-compose down
docker system prune -f

# Complete cleanup (WARNING: deletes volumes and persistent data)
docker-compose down -v --rmi all
docker system prune -a --volumes -f
docker volume prune -f

# Selective image cleanup (quote the pattern so the shell does not expand it)
docker rmi $(docker images -q "vector-agent-search*")
```

Vector Database Reset:
```bash
# Method 1: Automated Scripts (Recommended)
# Windows
scripts\reset_chromadb.bat
# Linux/Mac
./scripts/reset_chromadb.sh

# Method 2: Via REST API
curl -X DELETE "http://localhost:8000/api/v1/knowledge/reset"

# Method 3: Docker Volume Reset
docker-compose down
docker volume rm vector-agent-search_chroma_data
docker-compose up -d

# Method 4: Local Reset (development without Docker)
# Windows PowerShell
Remove-Item -Recurse -Force .\chroma_store
# Linux/Mac
rm -rf ./chroma_store
```

Diagnosis and Verification:
```bash
# Check application and ChromaDB status
curl http://localhost:8000/health

# Verify persisted data
docker exec -it vector-search-fastapi ls -la /app/chroma_store

# Check collection status
docker exec -it vector-search-fastapi python -c "
from app.services.vector_service import vector_service
print('Vector DB ready:', vector_service._is_vectordb_ready())
print('Scope DB ready:', vector_service._is_scope_db_ready())
print('Statistics:', vector_service.get_automotive_statistics())
"

# Check Docker volumes
docker volume ls | grep chroma
docker volume inspect vector-agent-search_chroma_data
```

Data Reloading:
```bash
# Load existing knowledge base
curl -X POST "http://localhost:8000/api/v1/knowledge/load"

# Vectorize loaded documents
curl -X POST "http://localhost:8000/api/v1/knowledge/vectorize"

# Upload new documents
curl -X POST "http://localhost:8000/api/v1/files/upload" \
  -F "files=@document.pdf" \
  -F "files=@data.json"

# Check available files
curl http://localhost:8000/api/v1/files/knowledge
```

| Volume | Purpose |
|---|---|
| `chroma_data` | Persistent ChromaDB database |
| `./app` | Application code (mounted for development) |
| `./knowledge` | Knowledge base documents (mounted as volume) |
This section provides guidance for developers working on the Vector Search Agent codebase.
For development without Docker (requires Python 3.12.0):
Install Dependencies:
```bash
# Verify Python version
python --version  # Should show Python 3.12.0

pip install -r requirements.txt
pip install -r requirements.chainlit.txt
```

Configure Environment:

```bash
export GOOGLE_API_KEY=your_api_key_here
export CHROMA_PERSIST_DIRECTORY=./chroma_store
```

Run Services:

```bash
# FastAPI
uvicorn app.main:app --reload --port 8000

# Chainlit
chainlit run frontend/app.py --port 8001
```

The FastAPI container is configured with a volume bind mount for hot reload during development. Changes to code in `./app/` are reflected automatically without a container restart.
This application uses Python 3.12.0 features for improved performance and code clarity.
Performance Improvements:
- Runtime optimizations for faster code execution
- Better memory management with reduced RAM consumption
- Improved bytecode cache for faster application startup
Modern Syntax:
- Native type hints: `list[str]` instead of `List[str]`
- Union syntax: `str | None` instead of `Union[str, None]`
- Match statements for complex conditional logic
Example of Modern Syntax:
```python
def extract_confidence_from_text(text: str) -> str | None:
    """Extract confidence using modern match syntax"""
    text_lower = text.lower()
    match text_lower:
        case s if "high confidence" in s:
            return "High"
        case s if "medium confidence" in s:
            return "Medium"
        case s if "low confidence" in s:
            return "Low"
        case _:
            return None
```

This section covers monitoring capabilities and diagnostic procedures for maintaining system health.
```bash
# View all logs
docker-compose logs

# View specific service logs
docker-compose logs fastapi-agent
docker-compose logs chainlit-frontend

# Follow logs in real-time
docker-compose logs -f

# Filtered logs
docker-compose logs fastapi-agent | grep -E "(ERROR|WARNING|INFO)"
docker-compose logs fastapi-agent | grep -E "(POST|GET|PUT|DELETE)"
```

| Endpoint | Purpose |
|---|---|
| http://localhost:8000/health | FastAPI service health |
| http://localhost:8000/docs | API documentation |
```bash
# Real-time resource monitoring
docker stats --format "table {{.Container}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.MemPerc}}"

# Check space used by volumes
docker system df
docker volume ls --format "table {{.Name}}\t{{.Driver}}\t{{.Scope}}"

# Container performance analysis
docker exec -it vector-search-fastapi python -c "
import psutil
print(f'CPU Usage: {psutil.cpu_percent()}%')
print(f'Memory Usage: {psutil.virtual_memory().percent}%')
print(f'Disk Usage: {psutil.disk_usage(\"/\").percent}%')
"
```

This section outlines security recommendations for production deployments.
For production environments, implement the following security measures:
| Area | Recommendation |
|---|---|
| CORS | Configure allowed origins in app/main.py |
| Authentication | Implement authentication on all endpoints |
| Rate Limiting | Add request rate limiting to prevent abuse |
| HTTPS | Configure SSL/TLS certificates for encrypted communication |
| Firewall | Restrict access to container ports |
Store all sensitive configuration in environment variables:
- API keys (Google Gemini, etc.)
- Authentication tokens
- Database connection strings
- Any other credentials
Never commit sensitive information to version control.
This section addresses common issues and their solutions.
Error Message:
```
chromadb.errors.InvalidArgumentError: Collection expecting embedding with dimension of 384, got 768
```

Solution:

```bash
docker-compose down
docker volume rm vector-agent-search_chroma_data
docker-compose up -d
```

Diagnosis:

```bash
docker volume ls
docker exec -it vector-search-fastapi ls -la /app/chroma_store
```

Solution: Verify volume mounts are correctly configured in docker-compose.yml.
Diagnosis:
```bash
python tests/check_gemini_config.py
python tests/test_gemini_integration.py
echo $GOOGLE_API_KEY
```

Solution: Verify the API key is correctly set in environment variables.

Diagnosis:

```bash
docker exec -it vector-search-chainlit curl http://fastapi-agent:8000/health
```

Solution: Verify both containers are on the same Docker network and FastAPI is running.
This section provides guidance for optimizing system performance.
| Parameter | Purpose | Recommendation |
|---|---|---|
| Embeddings | Vector dimension affects quality and speed | Use smaller models for high-volume production |
| CHUNK_SIZE | Document chunk size | Adjust based on document characteristics |
| CHUNK_OVERLAP | Overlap between chunks | Balance context preservation with storage |
| GEMINI_TEMPERATURE | Response creativity | Lower for more deterministic responses |
| GEMINI_MAX_TOKENS | Response length limit | Set based on expected response size |
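For reference, a hypothetical `.env` fragment combining the tuning parameters above with the keys introduced during setup. The values are illustrative starting points, not project recommendations.

```env
GOOGLE_API_KEY=your_api_key_here
CHROMA_PERSIST_DIRECTORY=./chroma_store
# Hypothetical tuning values; adjust to your document corpus
CHUNK_SIZE=1000
CHUNK_OVERLAP=200
GEMINI_TEMPERATURE=0.2
GEMINI_MAX_TOKENS=2048
```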
```bash
# Container resource usage
docker stats

# Volume space usage
docker system df

# Cleanup unused resources
docker system prune
```

We welcome contributions to improve the Vector Search Agent.
- Fork the project
- Create a feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
This project is licensed under the MIT License. See the LICENSE file for details.
This project builds upon several excellent open-source technologies:
| Technology | Purpose |
|---|---|
| LangChain | Framework for LLM applications |
| LangGraph | Building agents with graphs |
| FastAPI | Modern web framework for APIs |
| Chainlit | Chat interface for LLM applications |
| ChromaDB | Vector database |
| Google Gemini | Large language model and embeddings |