AutoBot - ReAct Agent with Local LLM

Python 3.9+ License: MIT GitHub Issues

AutoBot is a sophisticated ReAct (Reason + Act) agent system powered by local LLMs, designed to provide intelligent, context-aware responses through semantic search and tool integration. It combines reasoning, action execution, and memory management into a cohesive autonomous agent framework.

🚀 Key Features

  • 🧠 ReAct Agent Architecture: Implements Reason + Act pattern with iterative reasoning loops
  • 🔍 Semantic Search: FAISS-based vector similarity search with ChromaDB support
  • 💾 3-Tier Memory System: Short-term, working, and long-term memory with episodic storage
  • 🛠️ Tool Integration: Web search pipeline with extensible tool registry
  • 📚 Document Processing: Multi-format ingestion (PDF, DOCX, HTML, CSV, Code files)
  • 🤖 Local LLM Support: LFM2.5-1.2B-Instruct with GGUF quantization
  • ⚡ Parallel Processing: Multi-threaded document ingestion and processing
  • 🎯 Quality Filtering: Intelligent content scoring and deduplication

🏗️ Architecture Overview

┌─────────────────────────────────────────────────────────────────┐
│                    USER INPUT (Natural Language)                │
└────────────────────┬────────────────────────────────────────────┘
                     │
         ┌───────────▼────────────────┐
         │  ReAct ORCHESTRATOR        │
         │  (Reason + Act Pattern)    │
         └───────────┬────────────────┘
                     │
     ┌───────────────▼─────────────────┐
     │  LLM INTERFACE                  │
     │  (LFM2.5-1.2B-Instruct)         │
     │  - GGUF Support                 │
     │  - Chat Templates               │
     └───────────────┬─────────────────┘
                     │
     ┌───────────────▼─────────────────┐
     │  TOOL REGISTRY                  │
     │  - Web Search                   │
     │  - Extensible Framework         │
     └───────────────┬─────────────────┘
                     │
     ┌───────────────▼─────────────────┐
     │  MEMORY MANAGER                 │
     │  - Short-term (Session)         │
     │  - Working (Shared State)       │
     │  - Long-term (Persistent)       │
     │  - Vector Store (FAISS/Chroma)  │
     └─────────────────────────────────┘

ReAct Flow

  1. User Input → ReAct Orchestrator
  2. First Pass: LLM analyzes query and decides if tools are needed
  3. Tool Execution: If needed, executes web search or other tools
  4. Second Pass: LLM generates final answer grounded in tool results
  5. Memory Storage: Conversation stored in multi-tier memory system
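
The five steps above can be sketched as a small two-pass loop. This is only an illustration, not AutoBot's actual orchestrator: `fake_llm`, `fake_web_search`, and the `TOOL:` reply convention are hypothetical stand-ins for the real model, the real search tool, and whatever tool-call format the project uses.

```python
import asyncio
from typing import Dict, List

# Hypothetical stand-ins: the real project uses a local LLM and a web search tool.
async def fake_llm(prompt: str) -> str:
    """Pretend model: requests a tool on the first pass, answers on the second."""
    if "TOOL RESULT:" in prompt:
        return "Answer grounded in: " + prompt.split("TOOL RESULT:", 1)[1].strip()
    if "latest" in prompt.lower():
        return "TOOL: web_search"
    return "Answer from model knowledge."

async def fake_web_search(query: str) -> str:
    return f"search results for '{query}'"

TOOLS = {"web_search": fake_web_search}

async def handle_input(user_input: str, memory: List[Dict[str, str]]) -> str:
    # First pass: the model decides whether a tool is needed.
    first = await fake_llm(user_input)
    if first.startswith("TOOL:"):
        tool_name = first.split(":", 1)[1].strip()
        # Tool execution.
        result = await TOOLS[tool_name](user_input)
        # Second pass: the model answers with the tool result in its prompt.
        answer = await fake_llm(f"{user_input}\nTOOL RESULT: {result}")
    else:
        answer = first
    # Memory storage.
    memory.append({"user_input": user_input, "response": answer})
    return answer

memory: List[Dict[str, str]] = []
print(asyncio.run(handle_input("What is the latest Python release?", memory)))
print(asyncio.run(handle_input("What is recursion?", memory)))
```

Only the first question triggers the tool branch here; the second is answered in a single pass, which mirrors the "if needed" wording in step 3.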

🛠️ Installation

Prerequisites

  • Python 3.9 or higher
  • 8GB+ RAM (recommended for local LLM)
  • CUDA-compatible GPU (optional, for faster inference)

Step 1: Clone Repository

git clone https://github.com/gajjalaashok75-UI/gakrai.git
cd gakrai

Step 2: Create Virtual Environment

# Create virtual environment
python -m venv venv

# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate

Step 3: Install Dependencies

pip install -r requirements.txt

Step 4: Download Model (Optional)

The model will be downloaded automatically on first run, or you can pre-download:

# The LFM2.5-1.2B-Instruct model will be downloaded to ./models/
# Approximately 1.2GB download

Step 5: Initialize Configuration

# Configuration is automatically loaded from config/settings.yaml
# Customize settings if needed (see Configuration section)

🚀 Quick Start

Basic Usage

import asyncio
from core.react_orchestrator import ReActOrchestrator
import yaml

# Load configuration
with open('config/settings.yaml', 'r') as f:
    config = yaml.safe_load(f)

# Initialize orchestrator
orchestrator = ReActOrchestrator(config)

async def main():
    # Initialize all components
    await orchestrator.initialize()
    
    # Ask a question
    response = await orchestrator.handle_input(
        "What are the latest developments in artificial intelligence?"
    )
    
    print("Response:", response.response)
    print("Steps taken:", response.total_steps)
    print("Execution time:", f"{response.execution_time:.2f}s")

# Run the example
asyncio.run(main())

Interactive CLI

# Run the interactive command-line interface
python main.py

# Example interaction:
> What is machine learning?
[AutoBot analyzes the question, searches for current information, and provides a comprehensive answer]

> history
[Shows conversation history]

> clear
[Clears session memory]

Command Line Options

# Run with ctransformers demo
python main.py --ctransformers-demo "Explain quantum computing"

# Custom configuration
python main.py --config custom_config.yaml

💡 Usage Examples

Example 1: Web Search Integration

# AutoBot automatically determines when to search the web
# (run this inside an async function, as in Basic Usage)
response = await orchestrator.handle_input(
    "What are the current Python 3.12 features?"
)
# AutoBot will:
# 1. Recognize this needs current information
# 2. Execute web search
# 3. Analyze results
# 4. Provide comprehensive answer with sources

Example 2: Document Ingestion and RAG

from memory.ingestion_pipeline import AdvancedIngestionPipeline

# Ingest documents into vector store
pipeline = AdvancedIngestionPipeline(
    store_path="./memory/vector_store",
    min_quality_score=0.4
)

# Process documents
stats = pipeline.ingest(
    input_dirs=["./documents/pdfs", "./documents/code"],
    max_workers=4
)

print(f"Indexed {stats['chunks_indexed']} chunks from {stats['new_files']} files")

# Now AutoBot can answer questions about your documents
# (the await below needs an async function, as in Basic Usage)
response = await orchestrator.handle_input(
    "Based on my documents, explain the main concepts"
)

Example 3: Memory and Context

# AutoBot maintains conversation context
await orchestrator.handle_input("What is Python?")
await orchestrator.handle_input("What are its main advantages?")  # Refers to Python
await orchestrator.handle_input("Show me some code examples")     # Still about Python

# Access conversation history
history = await orchestrator.memory.get_recent_interactions(limit=10)
for interaction in history:
    print(f"Q: {interaction['user_input']}")
    print(f"A: {interaction['response'][:100]}...")

Example 4: Tool Integration

# AutoBot can be extended with custom tools
from tools.tool_registry import ToolRegistry

# The tool registry is extensible - add your own tools
# Tools are automatically detected and used by the ReAct agent

⚙️ Configuration

Main Configuration (config/settings.yaml)

assistant:
  name: "AutoBot"
  version: "0.2.0"

llm:
  intent_model:
    name: "LFM2.5-1.2B-Instruct-Q5_K_M"
    local_path: "./models/LFM2.5-1.2B-Instruct-Q5_K_M.gguf"
    max_tokens: 2048
    temperature: 0.3

memory:
  short_term_limit: 100
  long_term_db: "./memory/long_term.db"
  short_term_db: "./memory/short_term.db"
  vector_store: "./memory/vector_store"

tools:
  enabled:
    - "web_search"

agentic:
  react_max_steps: 8
  max_context_length: 32768
  max_tokens_hard_limit: 4096

debug:
  enabled: true

Environment Variables

# Optional environment variables
export RAG_VECTOR_STORE_PATH="./memory/vector_store"
export RAG_EMBEDDING_MODEL="BAAI/bge-large-en-v1.5"
export RAG_MIN_QUALITY="0.4"
export RAG_WORKERS="4"
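
As a rough sketch of how these variables might be consumed, the snippet below reads them with `os.environ`-style lookups. `RagSettings` and `load_rag_settings` are hypothetical names, and treating the example values above as fallbacks is an assumption, not documented AutoBot behavior.

```python
import os
from dataclasses import dataclass
from typing import Mapping

@dataclass
class RagSettings:
    vector_store_path: str
    embedding_model: str
    min_quality: float
    workers: int

def load_rag_settings(env: Mapping[str, str] = os.environ) -> RagSettings:
    """Read the RAG_* variables; fallbacks are the example values (an assumption)."""
    return RagSettings(
        vector_store_path=env.get("RAG_VECTOR_STORE_PATH", "./memory/vector_store"),
        embedding_model=env.get("RAG_EMBEDDING_MODEL", "BAAI/bge-large-en-v1.5"),
        min_quality=float(env.get("RAG_MIN_QUALITY", "0.4")),
        workers=int(env.get("RAG_WORKERS", "4")),
    )

# Passing a plain dict instead of os.environ keeps the example deterministic.
settings = load_rag_settings({"RAG_WORKERS": "8"})
print(settings.workers, settings.min_quality)
```

Environment values are always strings, so the numeric settings are converted explicitly; a malformed value raises `ValueError` at load time rather than later during ingestion.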

Performance Tuning

# Adjust for your hardware
performance:
  cache_ttl: 300
  batch_size: 5

# Memory settings
memory:
  short_term_limit: 50    # Reduce for lower memory usage
  
# LLM settings
llm:
  intent_model:
    max_tokens: 1024      # Reduce for faster responses
    temperature: 0.1      # Lower for more deterministic responses
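
The README does not say whether these tuning values are edited into `config/settings.yaml` directly or layered on top of it. If you prefer to keep them in a separate override, one possible approach is a recursive merge of the two parsed dictionaries; `deep_merge` below is a hypothetical helper, not part of AutoBot.

```python
import copy
from typing import Any, Dict

def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of base with override applied; nested dicts merge key by key."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged

# Dictionaries as they would look after yaml.safe_load on each fragment.
base = {
    "llm": {"intent_model": {"max_tokens": 2048, "temperature": 0.3}},
    "memory": {"short_term_limit": 100},
}
tuning = {
    "llm": {"intent_model": {"max_tokens": 1024, "temperature": 0.1}},
    "memory": {"short_term_limit": 50},
    "performance": {"cache_ttl": 300, "batch_size": 5},
}
config = deep_merge(base, tuning)
print(config["llm"]["intent_model"])
```

The merge leaves `base` untouched, so the default configuration can still be inspected or reused after the overrides are applied.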

📁 Project Structure

autobot/
├── 📄 main.py                          # Entry point & CLI interface
├── 📁 core/                            # Core agent logic
│   ├── 🧠 react_orchestrator.py        # ReAct agent implementation
│   ├── 🤖 llm_interface.py             # Local LLM management
│   └── 📄 __init__.py
├── 📁 memory/                          # Memory & knowledge management
│   ├── 💾 memory_manager.py            # 3-tier memory system
│   ├── 📚 ingestion_pipeline.py        # Document processing
│   ├── 🔍 rag_pipeline.py              # Retrieval-augmented generation
│   ├── 📖 INGESTION_PIPELINE.md        # Detailed ingestion docs
│   ├── 📖 RAG_PIPELINE.md              # Detailed RAG docs
│   ├── 🗄️ long_term.db                 # Persistent memory
│   ├── 🗄️ short_term.db                # Session memory
│   └── 📁 vector_store/                # FAISS/ChromaDB index
├── 📁 tools/                           # Tool integrations
│   ├── 🔧 tool_registry.py             # Tool management
│   ├── 🔍 tool_detector.py             # Parse tool calls from LLM
│   └── 📁 web_search/                  # Web search implementation
│       ├── 🌐 search.py                # Main search pipeline
│       ├── ⚡ quick_scrape.py          # Search execution
│       └── 🧹 main_content_cleaner.py  # Content extraction
├── 📁 models/                          # LLM model management
│   ├── 📥 load-autobot-instruct.py     # Model loading utilities
│   ├── ⚙️ generate-autobot-instruct.py # Generation logic
│   └── 🤖 LFM2.5-1.2B-Instruct-Q5_K_M.gguf  # Model weights (downloaded)
├── 📁 config/                          # Configuration
│   └── ⚙️ settings.yaml                # Main configuration file
├── 📁 logs/                            # Application logs
│   └── 📄 autobot.log                  # Execution logs
└── 📄 requirements.txt                 # Python dependencies

🧩 Components

1. ReAct Orchestrator

  • Purpose: Implements the Reason + Act pattern for intelligent decision making
  • Features: Multi-step reasoning, tool integration, conversation management
  • Models: Uses LFM2.5-1.2B-Instruct for reasoning and action decisions

2. LLM Interface

  • Purpose: Manages local model loading and inference
  • Support: GGUF via ctransformers (CPU) and transformers (GPU)
  • Features: Chat templates, bfloat16 precision, streaming generation

3. Memory Manager

  • Short-term: Session-level interactions (in-memory + SQLite)
  • Working: Shared state for current reasoning loops
  • Long-term: Persistent SQLite with semantic search via ChromaDB
  • Features: Episodic memory, semantic memory, conversation history
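
To make the tiering concrete, here is a toy sketch covering only the short-term and long-term tiers: a bounded in-memory buffer that can be flushed into SQLite. `TinyMemory` is a hypothetical class; its method names echo the `MemoryManager` API listed later, but it is synchronous and leaves out working memory and semantic search entirely.

```python
import sqlite3
from collections import deque
from typing import Deque, Dict, List

class TinyMemory:
    """Toy two-tier memory: bounded short-term buffer plus SQLite long-term store."""

    def __init__(self, short_term_limit: int = 100, db_path: str = ":memory:"):
        self.short_term: Deque[Dict[str, str]] = deque(maxlen=short_term_limit)
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS interactions (user_input TEXT, response TEXT)"
        )

    def store_interaction(self, user_input: str, response: str) -> None:
        # The oldest entry is dropped automatically once the limit is reached.
        self.short_term.append({"user_input": user_input, "response": response})

    def get_recent_interactions(self, limit: int = 10) -> List[Dict[str, str]]:
        if limit <= 0:
            return []
        return list(self.short_term)[-limit:]

    def flush_short_to_long_term(self) -> Dict[str, int]:
        rows = [(m["user_input"], m["response"]) for m in self.short_term]
        with self.db:  # commits on success, rolls back on error
            self.db.executemany("INSERT INTO interactions VALUES (?, ?)", rows)
        self.short_term.clear()
        return {"flushed": len(rows)}

mem = TinyMemory(short_term_limit=2)
for question in ("What is Python?", "Its advantages?", "Show code"):
    mem.store_interaction(question, "...")
print([m["user_input"] for m in mem.get_recent_interactions()])
print(mem.flush_short_to_long_term())
```

With a limit of 2, the first question has already been evicted by the time the buffer is read, which is the trade-off `short_term_limit` controls.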

4. Tool Registry

  • Current Tools: Web search (DuckDuckGo-based)
  • Architecture: Extensible framework for adding new tools
  • Features: Automatic tool detection, parallel execution, retry logic
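
The retry behavior is not specified further in this README. One common shape for it is exponential backoff around an async tool call, sketched below under that assumption; `execute_with_retry` and `flaky_search` are hypothetical names, and the delays are kept tiny so the example runs quickly.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

async def execute_with_retry(
    tool: Callable[..., Awaitable[Dict[str, Any]]],
    *,
    attempts: int = 3,
    base_delay: float = 0.01,
    **kwargs: Any,
) -> Dict[str, Any]:
    """Call an async tool, retrying with exponential backoff on any exception."""
    for attempt in range(1, attempts + 1):
        try:
            return await tool(**kwargs)
        except Exception as exc:
            if attempt == attempts:
                # Out of retries: report the failure instead of raising.
                return {"result": "error", "error": str(exc), "attempts": attempt}
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")

calls = {"count": 0}

async def flaky_search(query: str) -> Dict[str, Any]:
    # Hypothetical tool that fails twice before succeeding.
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return {"result": "success", "data": f"results for {query}"}

print(asyncio.run(execute_with_retry(flaky_search, query="faiss")))
```

Returning an error dictionary on the final failure keeps the agent loop running, so the model can still produce an answer without the tool result.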

5. Document Ingestion Pipeline

  • Formats: PDF, DOCX, HTML, CSV, Code files (Python, JS, Java), TXT, Markdown
  • Features: Adaptive chunking, quality scoring, deduplication, parallel processing
  • Output: FAISS vector index with metadata for RAG retrieval
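
The three processing features can be illustrated with simplified stand-ins. The sketch below uses fixed-size overlapping windows in place of adaptive chunking, a character-ratio heuristic in place of the real quality score, and hash-based exact-duplicate removal; all function names and thresholds other than `min_quality_score` are hypothetical.

```python
import hashlib
from typing import List

def chunk_text(text: str, size: int = 200, overlap: int = 40) -> List[str]:
    """Fixed-size character windows with overlap (a stand-in for adaptive chunking)."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

def quality_score(chunk: str) -> float:
    """Toy score: the fraction of characters that are letters or digits."""
    if not chunk:
        return 0.0
    return sum(ch.isalnum() for ch in chunk) / len(chunk)

def dedupe(chunks: List[str]) -> List[str]:
    """Drop exact duplicates after whitespace and case normalization."""
    seen, unique = set(), []
    for chunk in chunks:
        normalized = " ".join(chunk.lower().split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(chunk)
    return unique

def prepare(text: str, min_quality_score: float = 0.4) -> List[str]:
    """Chunk, deduplicate, then keep only chunks at or above the threshold."""
    chunks = dedupe(chunk_text(text))
    return [c for c in chunks if quality_score(c) >= min_quality_score]

print(len(prepare("AutoBot ingests documents into a vector index. " * 20)))
```

Deduplicating before scoring avoids scoring the same text twice; the surviving chunks are what would then be embedded and indexed.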

6. RAG Pipeline

  • Search: FAISS-based semantic similarity search
  • Features: Metadata filtering, source deduplication, context building
  • Integration: Works seamlessly with ReAct agent for knowledge retrieval
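
Semantic similarity search ranks stored vectors by how close they are to the query vector. The sketch below shows the idea with brute-force cosine similarity in pure Python; it is not FAISS, and the three-dimensional vectors and document names are made up for illustration (real embeddings have hundreds of dimensions).

```python
import math
from typing import Dict, List, Sequence, Tuple

def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(
    query_vec: Sequence[float],
    index: Dict[str, Sequence[float]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Score every stored vector against the query and keep the k best."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in index.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]

# Hypothetical toy index: document id -> embedding.
index = {
    "python_intro": [0.9, 0.1, 0.0],
    "faiss_guide": [0.1, 0.9, 0.2],
    "cooking": [0.0, 0.1, 0.9],
}
for doc_id, score in top_k([0.2, 0.8, 0.1], index, k=2):
    print(doc_id, round(score, 3))
```

A vector index library does the same ranking far more efficiently, which matters once the store holds more than a few thousand chunks.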

📚 API Reference

ReActOrchestrator

class ReActOrchestrator:
    async def initialize(self) -> bool: ...
    async def handle_input(self, user_input: str) -> ReActResult: ...
    async def get_conversation_history(self) -> List[Dict]: ...
    async def clear_session_memory(self) -> bool: ...

Memory Manager

class MemoryManager:
    async def store_interaction(self, user_input: str, response: str, intent: str): ...
    async def get_recent_interactions(self, limit: int = 10) -> List[Dict]: ...
    async def search_memories(self, query: str, limit: int = 5) -> List[Dict]: ...
    async def flush_short_to_long_term(self) -> Dict: ...

Tool Registry

class ToolRegistry:
    async def execute_tool(self, tool_name: str, **kwargs) -> Dict: ...
    def get_available_tools(self) -> List[Dict]: ...
    def register_tool(self, name: str, function: Callable, schema: Dict): ...

Document Ingestion

class AdvancedIngestionPipeline:
    def ingest(self, input_dirs: List[str], max_workers: int = 4) -> Dict: ...
    def process_file(self, file_path: str) -> int: ...
    def get_stats(self) -> Dict: ...

RAG Pipeline

class RAGPipeline:
    def query(self, query: str, top_k: int = 3, temperature: float = 0.1) -> Dict: ...
    def retrieve_context(self, query: str, top_k: int = 5) -> Tuple[RAGContext, List[RAGResult]]: ...
    def get_stats(self) -> Dict: ...

🚀 Performance

Typical Performance Metrics

  • Model Size: 1.2B parameters (quantized to ~800MB)
  • Inference Speed: 1-5 seconds per response
  • Memory Usage: 4-8 GB (including model and search indices)
  • Search Latency: 100-500ms for semantic search
  • Web Search: 2-10 seconds depending on query complexity

Optimization Tips

  1. GPU Acceleration: Use CUDA for 3-5x faster inference
  2. Memory Management: Adjust short_term_limit for memory constraints
  3. Parallel Processing: Set max_workers between 4 and 8 for document ingestion
  4. Quality Filtering: Set min_quality_score to 0.6 or higher for better results
  5. Context Length: Reduce max_context_length for faster responses
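
For tip 3, the effect of `max_workers` can be pictured as files being handed to a thread pool. The sketch below is not the real pipeline: `process_file` is a hypothetical worker that just counts 100-character chunks, and the returned keys mirror the `stats` dictionary shown in Example 2.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List

def process_file(text: str) -> int:
    """Hypothetical per-file worker: one chunk per 100 characters, rounded up."""
    return (len(text) + 99) // 100

def ingest(files: Dict[str, str], max_workers: int = 4) -> Dict[str, int]:
    names = list(files)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, so counts line up with names.
        counts: List[int] = list(pool.map(process_file, (files[n] for n in names)))
    return {"new_files": len(names), "chunks_indexed": sum(counts)}

print(ingest({"a.txt": "x" * 250, "b.txt": "y" * 90}, max_workers=2))
```

In CPython, threads help most when the per-file work is dominated by I/O such as reading and parsing files; raising `max_workers` beyond the number of files being ingested gives no further benefit.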

Hardware Requirements

Minimum:

  • 8GB RAM
  • 4-core CPU
  • 2GB storage

Recommended:

  • 16GB RAM
  • 8-core CPU
  • NVIDIA GPU with 4GB+ VRAM
  • 10GB storage

Optimal:

  • 32GB RAM
  • 16-core CPU
  • NVIDIA GPU with 8GB+ VRAM
  • SSD storage

🔧 Development

Adding Custom Tools

from typing import Dict

# 1. Create tool function
async def my_custom_tool(param1: str, param2: int) -> Dict:
    # Your tool logic here
    return {"result": "success", "data": "..."}

# 2. Register tool
tool_schema = {
    "name": "my_custom_tool",
    "description": "Description of what the tool does",
    "parameters": {
        "type": "object",
        "properties": {
            "param1": {"type": "string", "description": "Parameter 1"},
            "param2": {"type": "integer", "description": "Parameter 2"}
        },
        "required": ["param1", "param2"]
    }
}

# 3. Add to tool registry
tool_registry.register_tool("my_custom_tool", my_custom_tool, tool_schema)
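
To see how registration and execution fit together without the rest of AutoBot, here is a self-contained stand-in. `MiniToolRegistry` is hypothetical: it only mirrors the three `ToolRegistry` method names from the API Reference, and checking the schema's `required` list before calling the tool is an assumption about sensible behavior, not a description of the real class.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

class MiniToolRegistry:
    """Hypothetical stand-in mirroring the ToolRegistry method names."""

    def __init__(self) -> None:
        self._tools: Dict[str, Dict[str, Any]] = {}

    def register_tool(
        self, name: str, function: Callable[..., Awaitable[Dict]], schema: Dict
    ) -> None:
        self._tools[name] = {"function": function, "schema": schema}

    def get_available_tools(self) -> List[Dict]:
        return [entry["schema"] for entry in self._tools.values()]

    async def execute_tool(self, tool_name: str, **kwargs: Any) -> Dict:
        if tool_name not in self._tools:
            return {"result": "error", "error": f"unknown tool: {tool_name}"}
        entry = self._tools[tool_name]
        # Reject calls that omit parameters the schema marks as required.
        required = entry["schema"].get("parameters", {}).get("required", [])
        missing = [p for p in required if p not in kwargs]
        if missing:
            return {"result": "error", "error": f"missing parameters: {missing}"}
        return await entry["function"](**kwargs)

async def word_count(text: str) -> Dict:
    return {"result": "success", "data": len(text.split())}

registry = MiniToolRegistry()
registry.register_tool("word_count", word_count, {
    "name": "word_count",
    "description": "Count whitespace-separated words",
    "parameters": {
        "type": "object",
        "properties": {"text": {"type": "string", "description": "Input text"}},
        "required": ["text"],
    },
})
print(asyncio.run(registry.execute_tool("word_count", text="reason then act")))
```

Validating against the schema up front turns a malformed tool call from the model into a structured error the agent can react to, rather than a `TypeError` inside the tool.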

Extending Memory System

from typing import Any

# Custom memory backend
class CustomMemoryBackend:
    async def store(self, key: str, value: Any):
        # Custom storage logic
        pass
    
    async def retrieve(self, key: str) -> Any:
        # Custom retrieval logic
        pass

# Integrate with memory manager
memory_manager.add_backend("custom", CustomMemoryBackend())
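
A filled-in version of that skeleton might look like the following. `DictMemoryBackend` is a hypothetical example that keeps everything in a Python dictionary; returning `None` for a missing key is a design choice made here, since the expected behavior is not documented.

```python
import asyncio
from typing import Any, Dict, Optional

class DictMemoryBackend:
    """In-memory backend with the same store/retrieve shape as the skeleton above."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    async def store(self, key: str, value: Any) -> None:
        self._data[key] = value

    async def retrieve(self, key: str) -> Optional[Any]:
        # Missing keys return None instead of raising KeyError.
        return self._data.get(key)

async def demo() -> Optional[Any]:
    backend = DictMemoryBackend()
    await backend.store("last_topic", "Python")
    return await backend.retrieve("last_topic")

print(asyncio.run(demo()))
```

Nothing here actually awaits I/O, but keeping the methods `async` lets the same interface be backed later by a database or network store without changing callers.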

Model Integration

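from typing import Dict, List

# Import path assumed from the Project Structure section (core/llm_interface.py)
from core.llm_interface import LLMInterface

# Support for custom models
class CustomLLMInterface(LLMInterface):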
    def _load_model(self):
        # Load your custom model
        pass
    
    async def generate(self, messages: List[Dict]) -> str:
        # Custom generation logic
        pass

🤝 Contributing

We welcome contributions! Please see our Contributing Guidelines for details.

Development Setup

# Clone repository
git clone https://github.com/gajjalaashok75-UI/gakrai.git
cd gakrai

# Create development environment
python -m venv dev-env
source dev-env/bin/activate  # or dev-env\Scripts\activate on Windows

# Install development dependencies
pip install -r requirements.txt
pip install -r requirements-dev.txt  # If available

# Run tests
python -m pytest tests/

# Run linting
flake8 .
black .

Contribution Areas

  • 🛠️ Tool Development: Add new tools (email, calendar, databases, APIs)
  • 🧠 Model Integration: Support for new LLM models and providers
  • 📚 Document Formats: Add support for new file formats
  • 🔍 Search Improvements: Enhanced semantic search and ranking
  • 🎨 UI/UX: Web interface, mobile app, desktop GUI
  • 📊 Analytics: Usage metrics, performance monitoring
  • 🔒 Security: Authentication, authorization, data privacy

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • LFM2.5: Local language model for reasoning and generation
  • FAISS: Efficient similarity search and clustering
  • ChromaDB: Vector database for semantic search
  • LangChain/LangGraph: Agent orchestration framework
  • Transformers: HuggingFace model integration
  • DuckDuckGo: Privacy-focused web search

🗺️ Roadmap

  • Web Interface: React-based web UI
  • API Server: REST API for external integrations
  • Plugin System: Dynamic tool loading
  • Multi-Modal: Image and audio processing
  • Distributed: Multi-agent collaboration
  • Cloud Integration: AWS/Azure/GCP deployment
  • Mobile App: iOS/Android applications

AutoBot - Intelligent automation through local AI reasoning and action. Built with ❤️ for developers and researchers by the Gakr team.
