Skip to content

Zenin797/AntaraAI

Repository files navigation

Antara AI Documentation

Executive Summary

The Antara AI is a sophisticated conversational AI system built on the LangGraph framework, featuring persistent memory capabilities through MongoDB and a modular, extensible architecture. The agent implements four distinct memory types (Episodic, Semantic, Procedural, and Associative) using the LangMem library, enabling it to learn from interactions, remember user preferences, and apply procedural knowledge across conversations.

1. Architecture Overview

1.1 Core Principles

  • Stateless LLM with Persistent Memory: The underlying language model is stateless, but the agent maintains persistent memories across conversations
  • Modular Design: Clean separation of concerns with distinct layers for service, core logic, tools, and configuration
  • Memory-Driven Intelligence: Leverages multiple memory types to provide contextual, personalized responses
  • Tool-Based Extensibility: Easily extensible through a standardized tool interface

1.2 Technology Stack

  • Framework: LangGraph for state management and execution flow
  • Language Models: Support for both Groq (cloud) and Ollama (local) providers
  • Memory Storage: MongoDB with vector indexing for semantic search
  • Embeddings: HuggingFace embeddings (all-MiniLM-L6-v2, 384 dimensions)
  • Tools: Memory management tools and internet search capabilities

2. System Architecture

2.1 Layer Architecture

┌─────────────────────────────────────────────────────┐
│                 User Interface Layer                │
│            (CLI, Streamlit, Web Interface)          │
├─────────────────────────────────────────────────────┤
│                   Service Layer                     │
│                  (LTMService)                       │
├─────────────────────────────────────────────────────┤
│                    Core Layer                       │
│     Agent Logic │ Graph Builder │ State Manager     │
├─────────────────────────────────────────────────────┤
│                  Memory Layer                       │
│    Memory Manager │ MongoDB Store │ Vector Index    │
├─────────────────────────────────────────────────────┤
│                   Tools Layer                       │
│        Memory Tools │ Internet Search              │
├─────────────────────────────────────────────────────┤
│               Configuration Layer                   │
│     App Config │ Prompt Templates │ Environment     │
└─────────────────────────────────────────────────────┘

3. Core Components

3.1 LTMService (Service Layer)

Purpose: Provides a clean, high-level interface between UI components and core functionality.

Key Responsibilities:

  • Model initialization and configuration
  • User and thread management
  • Message processing orchestration
  • Error handling and logging

Key Methods:

  • get_model_info(): Returns current model configuration
  • create_user_id() / create_thread_id(): Generate unique identifiers
  • process_message(): Main entry point for message processing

3.2 Agent Core (Core Layer)

3.2.1 State Management (state.py)

class State(MessagesState):
    recall_memories: List[str]
  • Extends LangGraph's MessagesState
  • Maintains conversation messages and recalled memories
  • Thread-safe and serializable for checkpointing

3.2.2 Agent Logic (agent.py)

Core Functions:

  • agent(): Main processing function that combines messages with recalled memories
  • load_memories(): Placeholder for automatic memory loading (currently delegated to tools)
  • route_tools(): Determines whether to execute tools or end conversation

3.2.3 Graph Builder (graph_builder.py)

Creates the execution graph with the following flow:

START → load_memories → agent → [route_tools] → [tools → agent] → END

Features:

  • MongoDB checkpointer for persistent state
  • Conditional edges for tool execution
  • Stream-based output for real-time responses

4. Memory System Architecture

4.1 Memory Types and Schemas

4.1.1 Episodic Memory

Purpose: Captures specific learning experiences and interaction patterns

class Episode(BaseModel):
    observation: str    # Context and situation
    thoughts: str      # Agent's reasoning process
    action: str        # What was done and how
    result: str        # Outcome and analysis

Use Cases:

  • Learning from successful interactions
  • Recording problem-solving approaches
  • Building experience base for future reference

4.1.2 Semantic Memory

Purpose: Stores factual information as structured relationships

class Triple(BaseModel):
    subject: str       # Entity being described
    predicate: str     # Relationship or property
    object: str        # Target of relationship
    context: str       # Additional clarification

Use Cases:

  • User preferences and facts
  • Relationships between concepts
  • Structured knowledge representation

4.1.3 Procedural Memory

Purpose: Stores instructions, rules, and repeatable procedures

class Procedural(BaseModel):
    task: str          # Task or process name
    steps: List[str]   # Step-by-step instructions
    conditions: str    # When to apply procedure
    outcome: str       # Expected results

Use Cases:

  • How-to knowledge
  • Standard operating procedures
  • Rule-based decision making

4.1.4 Associative Memory (General)

Purpose: Flexible memory type for associations and mixed content

Use Cases:

  • Cross-domain associations
  • Contextual relationships
  • Mixed-type memory storage

4.2 Memory Storage Architecture

┌─────────────────────────────────────────────────────┐
│                MongoDB Database                     │
│                  (ltm_agent)                       │
├─────────────────────────────────────────────────────┤
│                Collections Structure                │
│                                                     │
│  memories/                                          │
│  ├── {user_id}/                                     │
│  │   ├── episodes/           (Episodic memories)    │
│  │   ├── triples/            (Semantic memories)    │
│  │   ├── procedures/         (Procedural memories)  │
│  │   └── general/            (Associative memories) │
│  └── vector_indexes/         (Vector search indexes)│
└─────────────────────────────────────────────────────┘

Key Features:

  • User-scoped memory isolation
  • Vector indexing for semantic search
  • HuggingFace embeddings (384 dimensions)
  • Automated index creation and management

4.3 Memory Tools

Each memory type has dedicated management and search tools:

Management Tools

  • manage_episodic_memory_tool: CRUD operations for episodes
  • manage_semantic_memory_tool: CRUD operations for triples
  • manage_procedural_memory_tool: CRUD operations for procedures
  • manage_general_memory_tool: CRUD operations for general memories

Search Tools

  • search_episodic_memory_tool: Semantic search through experiences
  • search_semantic_memory_tool: Search facts and relationships
  • search_procedural_memory_tool: Search procedures and rules
  • search_general_memory_tool: General associative search

5. Tools and Extensions

5.1 Available Tools

5.1.1 Internet Search

  • Tool: SearxSearchResults
  • Purpose: Web search capabilities for real-time information retrieval
  • Configuration: Configurable Searx host for privacy-focused search
  • Integration: Seamlessly integrated with agent decision-making process

5.2 Tool Integration Pattern

All tools follow a standardized integration pattern:

# Tool Registration
all_tools = [
    search_internet_tool,
    *memory_tools,  # 8 memory management tools
]

# Model Binding
model_with_tools = model.bind_tools(all_tools)

6. Configuration Management

6.1 Configuration Structure

CONFIG = {
    # Model Configuration
    "model_provider": "groq|ollama",
    "model_name": "meta-llama/llama-4-maverick-17b-128e-instruct",
    
    # Services
    "searx_host": "http://127.0.0.1:8080",
    "ollama_host": "http://localhost:11434",
    
    # Memory Configuration
    "mongodb_uri": "mongodb://localhost:27017",
    "mongodb_db": "agent-memory",
    "vector_k_results": 3,
    
    # Embedding Configuration
    "memory_index": {
        "dims": "384",
        "embed": "hf:sentence-transformers/all-MiniLM-L6-v2"
    }
}

6.2 Environment Variables

  • Sensitive Data: Database URIs and service endpoints stored in environment variables
  • Runtime Configuration: Development vs. production settings
  • Security: No hardcoded credentials in source code

7. Data Flow and Processing

7.1 Message Processing Flow

  1. Input Reception: User message received through UI layer
  2. Service Orchestration: LTMService coordinates processing
  3. State Initialization: Current state loaded from MongoDB checkpoint
  4. Memory Loading: Relevant memories retrieved (currently tool-driven)
  5. Agent Processing: LLM processes messages with memory context
  6. Tool Execution: If needed, memory tools or internet search are executed for additional functionality
  7. Response Generation: Final response generated and returned
  8. State Persistence: Updated state saved to MongoDB checkpoint

7.2 Memory Operations Flow

Memory Storage

  1. Tool Invocation: Agent calls appropriate memory management tool
  2. Schema Validation: Input validated against memory type schema
  3. Embedding Generation: Content embedded using HuggingFace model
  4. Database Storage: Memory stored in MongoDB with vector index
  5. Confirmation: Success/failure status returned to agent

Memory Retrieval

  1. Search Query: Agent calls appropriate search tool
  2. Vector Search: Query embedded and compared against stored vectors
  3. Similarity Ranking: Results ranked by semantic similarity
  4. Context Assembly: Retrieved memories formatted for agent context
  5. Response Integration: Memories integrated into agent response

8. Prompt Engineering and Behavior

8.1 System Prompt Structure

The agent's behavior is guided by a comprehensive system prompt that includes:

  • Identity and Capabilities: Clear definition of the agent's role and memory capabilities
  • Memory Guidelines: Specific instructions for using different memory types
  • Tool Usage Instructions: Guidance on when and how to use memory tools
  • Personality and Tone: Jarvis-inspired communication style
  • Interaction Patterns: Natural conversation flow with seamless memory integration

8.2 Memory Usage Guidelines

  1. Proactive Memory Management: Agent actively stores important information
  2. Context-Aware Retrieval: Search memories before responding
  3. Personalization: Use memories to tailor responses to user preferences
  4. Learning from Experience: Build knowledge base through episodic memory
  5. Adaptive Behavior: Recognize and adapt to changing user needs

9. User Interface Options

9.1 Command Line Interface (CLI)

  • Features:
    • User and thread management
    • Real-time conversation
    • Model configuration display
    • Stream-based output

9.2 Streamlit Web Interface

  • Features:
    • Web-based interaction
    • Visual conversation history
    • Configuration management
    • Multi-user support

9.3 Extensibility

The service layer provides a clean interface for adding new UI implementations:

  • Desktop applications
  • Mobile interfaces
  • API endpoints
  • Chatbot integrations

10. Security and Privacy

10.1 Data Isolation

  • User Scoping: All memories are scoped to individual users
  • Thread Isolation: Conversations are isolated by thread ID
  • Access Control: No cross-user data access

10.2 Security Measures

  • Environment Variables: Sensitive data stored securely
  • Service Authentication: Secure authentication for search services
  • Input Validation: Schema-based validation for all memory operations
  • Error Handling: Graceful error handling without information leakage

11. Performance and Scalability

11.1 Performance Optimizations

  • Vector Indexing: Efficient semantic search through MongoDB vector indexes
  • Streaming Responses: Real-time output through LangGraph streaming
  • Connection Pooling: MongoDB connection optimization
  • Embedding Caching: Efficient embedding generation and storage

11.2 Scalability Considerations

  • Horizontal Scaling: MongoDB supports clustering and sharding
  • Model Flexibility: Support for both cloud (Groq) and local (Ollama) models
  • Memory Management: Configurable memory retention and cleanup
  • Resource Monitoring: Built-in system resource monitoring

12. Development and Deployment

12.1 Development Setup

  • Local MongoDB: Development with local MongoDB instance
  • Environment Configuration: .env file for development settings
  • Model Options: Support for local and cloud model providers
  • Core Tools: Memory management and internet search capabilities

12.2 Production Deployment

  • MongoDB Atlas: Cloud-based MongoDB for production
  • Environment Management: Production-specific configuration
  • Service Configuration: Secure endpoint and service management
  • Monitoring and Logging: Comprehensive logging and error tracking

13. Future Enhancements

13.1 Planned Features

  • Advanced Memory Analytics: Memory usage and effectiveness metrics
  • Multi-Modal Memory: Support for image and audio memories
  • Memory Sharing: Controlled memory sharing between users
  • Enhanced Search Capabilities: More sophisticated information retrieval

13.2 Architecture Evolution

  • Microservices: Potential evolution to microservices architecture
  • Event-Driven Processing: Asynchronous memory processing
  • Advanced AI Integration: Multi-agent collaboration capabilities
  • Enhanced Security: Advanced authentication and authorization

Conclusion

The AI Agent represents a sophisticated approach to building conversational AI with persistent memory capabilities. Its modular architecture, comprehensive memory system, and extensible tool framework provide a robust foundation for building intelligent, context-aware applications that can learn and adapt over time.

The combination of multiple memory types, vector-based semantic search, and a clean service-oriented architecture makes this system both powerful and maintainable, suitable for a wide range of applications from personal assistants to specialized domain experts.

About

No description, website, or topics provided.

Resources

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 3

  •  
  •  
  •  

Languages