Antara AI Documentation

Executive Summary

The Antara AI is a sophisticated conversational AI system built on the LangGraph framework, featuring persistent memory capabilities through MongoDB and a modular, extensible architecture. The agent implements four distinct memory types (Episodic, Semantic, Procedural, and Associative) using the LangMem library, enabling it to learn from interactions, remember user preferences, and apply procedural knowledge across conversations.

1. Architecture Overview

1.1 Core Principles

Stateless LLM with Persistent Memory: The underlying language model is stateless, but the agent maintains persistent memories across conversations
Modular Design: Clean separation of concerns with distinct layers for service, core logic, tools, and configuration
Memory-Driven Intelligence: Leverages multiple memory types to provide contextual, personalized responses
Tool-Based Extensibility: Easily extensible through a standardized tool interface

1.2 Technology Stack

Framework: LangGraph for state management and execution flow
Language Models: Support for both Groq (cloud) and Ollama (local) providers
Memory Storage: MongoDB with vector indexing for semantic search
Embeddings: HuggingFace embeddings (all-MiniLM-L6-v2, 384 dimensions)
Tools: Memory management tools and internet search capabilities

2. System Architecture

2.1 Layer Architecture

┌─────────────────────────────────────────────────────┐
│                 User Interface Layer                │
│            (CLI, Streamlit, Web Interface)          │
├─────────────────────────────────────────────────────┤
│                   Service Layer                     │
│                  (LTMService)                       │
├─────────────────────────────────────────────────────┤
│                    Core Layer                       │
│     Agent Logic │ Graph Builder │ State Manager     │
├─────────────────────────────────────────────────────┤
│                  Memory Layer                       │
│    Memory Manager │ MongoDB Store │ Vector Index    │
├─────────────────────────────────────────────────────┤
│                   Tools Layer                       │
│        Memory Tools │ Internet Search              │
├─────────────────────────────────────────────────────┤
│               Configuration Layer                   │
│     App Config │ Prompt Templates │ Environment     │
└─────────────────────────────────────────────────────┘

3. Core Components

3.1 LTMService (Service Layer)

Purpose: Provides a clean, high-level interface between UI components and core functionality.

Key Responsibilities:

Model initialization and configuration
User and thread management
Message processing orchestration
Error handling and logging

Key Methods:

get_model_info(): Returns current model configuration
create_user_id() / create_thread_id(): Generate unique identifiers
process_message(): Main entry point for message processing

3.2 Agent Core (Core Layer)

3.2.1 State Management (`state.py`)

class State(MessagesState):
    recall_memories: List[str]

Extends LangGraph's MessagesState
Maintains conversation messages and recalled memories
Thread-safe and serializable for checkpointing

3.2.2 Agent Logic (`agent.py`)

Core Functions:

agent(): Main processing function that combines messages with recalled memories
load_memories(): Placeholder for automatic memory loading (currently delegated to tools)
route_tools(): Determines whether to execute tools or end conversation

3.2.3 Graph Builder (`graph_builder.py`)

Creates the execution graph with the following flow:

START → load_memories → agent → [route_tools] → [tools → agent] → END

Features:

MongoDB checkpointer for persistent state
Conditional edges for tool execution
Stream-based output for real-time responses

4. Memory System Architecture

4.1 Memory Types and Schemas

4.1.1 Episodic Memory

Purpose: Captures specific learning experiences and interaction patterns

class Episode(BaseModel):
    observation: str    # Context and situation
    thoughts: str      # Agent's reasoning process
    action: str        # What was done and how
    result: str        # Outcome and analysis

Use Cases:

Learning from successful interactions
Recording problem-solving approaches
Building experience base for future reference

4.1.2 Semantic Memory

Purpose: Stores factual information as structured relationships

class Triple(BaseModel):
    subject: str       # Entity being described
    predicate: str     # Relationship or property
    object: str        # Target of relationship
    context: str       # Additional clarification

Use Cases:

User preferences and facts
Relationships between concepts
Structured knowledge representation

4.1.3 Procedural Memory

Purpose: Stores instructions, rules, and repeatable procedures

class Procedural(BaseModel):
    task: str          # Task or process name
    steps: List[str]   # Step-by-step instructions
    conditions: str    # When to apply procedure
    outcome: str       # Expected results

Use Cases:

How-to knowledge
Standard operating procedures
Rule-based decision making

4.1.4 Associative Memory (General)

Purpose: Flexible memory type for associations and mixed content

Use Cases:

Cross-domain associations
Contextual relationships
Mixed-type memory storage

4.2 Memory Storage Architecture

┌─────────────────────────────────────────────────────┐
│                MongoDB Database                     │
│                  (ltm_agent)                       │
├─────────────────────────────────────────────────────┤
│                Collections Structure                │
│                                                     │
│  memories/                                          │
│  ├── {user_id}/                                     │
│  │   ├── episodes/           (Episodic memories)    │
│  │   ├── triples/            (Semantic memories)    │
│  │   ├── procedures/         (Procedural memories)  │
│  │   └── general/            (Associative memories) │
│  └── vector_indexes/         (Vector search indexes)│
└─────────────────────────────────────────────────────┘

Key Features:

User-scoped memory isolation
Vector indexing for semantic search
HuggingFace embeddings (384 dimensions)
Automated index creation and management

4.3 Memory Tools

Each memory type has dedicated management and search tools:

Management Tools

manage_episodic_memory_tool: CRUD operations for episodes
manage_semantic_memory_tool: CRUD operations for triples
manage_procedural_memory_tool: CRUD operations for procedures
manage_general_memory_tool: CRUD operations for general memories

Search Tools

search_episodic_memory_tool: Semantic search through experiences
search_semantic_memory_tool: Search facts and relationships
search_procedural_memory_tool: Search procedures and rules
search_general_memory_tool: General associative search

5. Tools and Extensions

5.1 Available Tools

5.1.1 Internet Search

Tool: SearxSearchResults
Purpose: Web search capabilities for real-time information retrieval
Configuration: Configurable Searx host for privacy-focused search
Integration: Seamlessly integrated with agent decision-making process

5.2 Tool Integration Pattern

All tools follow a standardized integration pattern:

# Tool Registration
all_tools = [
    search_internet_tool,
    *memory_tools,  # 8 memory management tools
]

# Model Binding
model_with_tools = model.bind_tools(all_tools)

6. Configuration Management

6.1 Configuration Structure

CONFIG = {
    # Model Configuration
    "model_provider": "groq|ollama",
    "model_name": "meta-llama/llama-4-maverick-17b-128e-instruct",
    
    # Services
    "searx_host": "http://127.0.0.1:8080",
    "ollama_host": "http://localhost:11434",
    
    # Memory Configuration
    "mongodb_uri": "mongodb://localhost:27017",
    "mongodb_db": "agent-memory",
    "vector_k_results": 3,
    
    # Embedding Configuration
    "memory_index": {
        "dims": "384",
        "embed": "hf:sentence-transformers/all-MiniLM-L6-v2"
    }
}

6.2 Environment Variables

Sensitive Data: Database URIs and service endpoints stored in environment variables
Runtime Configuration: Development vs. production settings
Security: No hardcoded credentials in source code

7. Data Flow and Processing

7.1 Message Processing Flow

Input Reception: User message received through UI layer
Service Orchestration: LTMService coordinates processing
State Initialization: Current state loaded from MongoDB checkpoint
Memory Loading: Relevant memories retrieved (currently tool-driven)
Agent Processing: LLM processes messages with memory context
Tool Execution: If needed, memory tools or internet search are executed for additional functionality
Response Generation: Final response generated and returned
State Persistence: Updated state saved to MongoDB checkpoint

7.2 Memory Operations Flow

Memory Storage

Tool Invocation: Agent calls appropriate memory management tool
Schema Validation: Input validated against memory type schema
Embedding Generation: Content embedded using HuggingFace model
Database Storage: Memory stored in MongoDB with vector index
Confirmation: Success/failure status returned to agent

Memory Retrieval

Search Query: Agent calls appropriate search tool
Vector Search: Query embedded and compared against stored vectors
Similarity Ranking: Results ranked by semantic similarity
Context Assembly: Retrieved memories formatted for agent context
Response Integration: Memories integrated into agent response

8. Prompt Engineering and Behavior

8.1 System Prompt Structure

The agent's behavior is guided by a comprehensive system prompt that includes:

Identity and Capabilities: Clear definition of the agent's role and memory capabilities
Memory Guidelines: Specific instructions for using different memory types
Tool Usage Instructions: Guidance on when and how to use memory tools
Personality and Tone: Jarvis-inspired communication style
Interaction Patterns: Natural conversation flow with seamless memory integration

8.2 Memory Usage Guidelines

Proactive Memory Management: Agent actively stores important information
Context-Aware Retrieval: Search memories before responding
Personalization: Use memories to tailor responses to user preferences
Learning from Experience: Build knowledge base through episodic memory
Adaptive Behavior: Recognize and adapt to changing user needs

9. User Interface Options

9.1 Command Line Interface (CLI)

Features:
- User and thread management
- Real-time conversation
- Model configuration display
- Stream-based output

9.2 Streamlit Web Interface

Features:
- Web-based interaction
- Visual conversation history
- Configuration management
- Multi-user support

9.3 Extensibility

The service layer provides a clean interface for adding new UI implementations:

Desktop applications
Mobile interfaces
API endpoints
Chatbot integrations

10. Security and Privacy

10.1 Data Isolation

User Scoping: All memories are scoped to individual users
Thread Isolation: Conversations are isolated by thread ID
Access Control: No cross-user data access

10.2 Security Measures

Environment Variables: Sensitive data stored securely
Service Authentication: Secure authentication for search services
Input Validation: Schema-based validation for all memory operations
Error Handling: Graceful error handling without information leakage

11. Performance and Scalability

11.1 Performance Optimizations

Vector Indexing: Efficient semantic search through MongoDB vector indexes
Streaming Responses: Real-time output through LangGraph streaming
Connection Pooling: MongoDB connection optimization
Embedding Caching: Efficient embedding generation and storage

11.2 Scalability Considerations

Horizontal Scaling: MongoDB supports clustering and sharding
Model Flexibility: Support for both cloud (Groq) and local (Ollama) models
Memory Management: Configurable memory retention and cleanup
Resource Monitoring: Built-in system resource monitoring

12. Development and Deployment

12.1 Development Setup

Local MongoDB: Development with local MongoDB instance
Environment Configuration: .env file for development settings
Model Options: Support for local and cloud model providers
Core Tools: Memory management and internet search capabilities

12.2 Production Deployment

MongoDB Atlas: Cloud-based MongoDB for production
Environment Management: Production-specific configuration
Service Configuration: Secure endpoint and service management
Monitoring and Logging: Comprehensive logging and error tracking

13. Future Enhancements

13.1 Planned Features

Advanced Memory Analytics: Memory usage and effectiveness metrics
Multi-Modal Memory: Support for image and audio memories
Memory Sharing: Controlled memory sharing between users
Enhanced Search Capabilities: More sophisticated information retrieval

13.2 Architecture Evolution

Microservices: Potential evolution to microservices architecture
Event-Driven Processing: Asynchronous memory processing
Advanced AI Integration: Multi-agent collaboration capabilities
Enhanced Security: Advanced authentication and authorization

Conclusion

The AI Agent represents a sophisticated approach to building conversational AI with persistent memory capabilities. Its modular architecture, comprehensive memory system, and extensible tool framework provide a robust foundation for building intelligent, context-aware applications that can learn and adapt over time.

The combination of multiple memory types, vector-based semantic search, and a clean service-oriented architecture makes this system both powerful and maintainable, suitable for a wide range of applications from personal assistants to specialized domain experts.

Name		Name	Last commit message	Last commit date
Latest commit History 30 Commits
config		config
core		core
utils		utils
.env.example		.env.example
.gitignore		.gitignore
IMPROVEMENTS.md		IMPROVEMENTS.md
README.md		README.md
SECURITY.md		SECURITY.md
api.py		api.py
cli_app.py		cli_app.py
googleLiveSample2.py		googleLiveSample2.py
requirements.txt		requirements.txt
run_api.py		run_api.py
streamlit_app.py		streamlit_app.py

Zenin797/AntaraAI

Folders and files

Latest commit

History

Repository files navigation

Antara AI Documentation

Executive Summary

1. Architecture Overview

1.1 Core Principles

1.2 Technology Stack

2. System Architecture

2.1 Layer Architecture

3. Core Components

3.1 LTMService (Service Layer)

3.2 Agent Core (Core Layer)

3.2.1 State Management (state.py)

3.2.2 Agent Logic (agent.py)

3.2.3 Graph Builder (graph_builder.py)

4. Memory System Architecture

4.1 Memory Types and Schemas

4.1.1 Episodic Memory

4.1.2 Semantic Memory

4.1.3 Procedural Memory

4.1.4 Associative Memory (General)

4.2 Memory Storage Architecture

4.3 Memory Tools

Management Tools

Search Tools

5. Tools and Extensions

5.1 Available Tools

5.1.1 Internet Search

5.2 Tool Integration Pattern

6. Configuration Management

6.1 Configuration Structure

6.2 Environment Variables

7. Data Flow and Processing

7.1 Message Processing Flow

7.2 Memory Operations Flow

Memory Storage

Memory Retrieval

8. Prompt Engineering and Behavior

8.1 System Prompt Structure

8.2 Memory Usage Guidelines

9. User Interface Options

9.1 Command Line Interface (CLI)

9.2 Streamlit Web Interface

9.3 Extensibility

10. Security and Privacy

10.1 Data Isolation

10.2 Security Measures

11. Performance and Scalability

11.1 Performance Optimizations

11.2 Scalability Considerations

12. Development and Deployment

12.1 Development Setup

12.2 Production Deployment

13. Future Enhancements

13.1 Planned Features

13.2 Architecture Evolution

Conclusion

About

Resources

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 3

Uh oh!

Languages

3.2.1 State Management (`state.py`)

3.2.2 Agent Logic (`agent.py`)

3.2.3 Graph Builder (`graph_builder.py`)

Packages