Rust Guide

rUv edited this page Jul 31, 2025 · 1 revision

FACT Rust Guide

Crates.io Documentation License: MIT

This guide covers the complete Rust implementation of FACT (Fast Augmented Context Tools), a high-performance context processing engine designed for AI applications.

Table of Contents

  1. Overview
  2. Installation
  3. Quick Start
  4. Core Architecture
  5. Library Usage
  6. Cognitive Templates
  7. Caching System
  8. Performance Features
  9. CLI Usage
  10. Building from Source
  11. Integration Examples
  12. Advanced Configuration
  13. Custom Templates
  14. Performance Optimization

Overview

FACT provides a robust Rust implementation with the following key features:

  • High Performance: Sub-100ms processing with intelligent caching
  • Cognitive Templates: Pre-built templates for common AI patterns
  • Smart Caching: Multi-tier caching with LRU eviction
  • Type Safety: Full Rust type safety with serde integration
  • Async First: Built on Tokio for concurrent processing
  • Memory Efficient: Optimized data structures with automatic cleanup

Key Components

  • fact_tools::Fact - Main entry point for processing
  • fact_tools::FactEngine - Core processing engine
  • fact_tools::Cache - High-performance caching layer
  • fact_tools::TemplateRegistry - Template management system
  • fact_tools::QueryProcessor - Query analysis and routing

Installation

Via Cargo

Add to your Cargo.toml:

[dependencies]
fact-tools = "1.0.0"

CLI Installation

Install the command-line tool:

cargo install fact-tools

Features

Enable optional features as needed:

[dependencies]
fact-tools = { version = "1.0.0", features = ["full"] }

Available features:

  • cli - Command-line interface
  • progress - Progress reporting with indicatif
  • color - Colored output support
  • network - HTTP client for remote data
  • full - All features enabled

Quick Start

Basic Usage

use fact_tools::Fact;
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create FACT instance
    let fact = Fact::new();
    
    // Process data with built-in template
    let data = json!({
        "values": [1, 2, 3, 4, 5],
        "operation": "analyze"
    });
    
    let result = fact.process("analysis-basic", data).await?;
    println!("Result: {}", serde_json::to_string_pretty(&result)?);
    
    Ok(())
}

With Custom Configuration

use fact_tools::{Fact, FactConfig, engine::EngineConfig};
use std::time::Duration;

let config = FactConfig {
    engine_config: EngineConfig {
        timeout: Duration::from_secs(60),
        parallel: true,
        max_concurrent: 8,
        monitoring: true,
    },
    cache_size: 200 * 1024 * 1024, // 200MB
    enable_monitoring: true,
    timeout: Some(Duration::from_secs(30)),
};

let fact = Fact::with_config(config);

Core Architecture

Main Components

pub struct Fact {
    engine: FactEngine,
    cache: Arc<RwLock<Cache>>,
}

pub struct FactEngine {
    config: EngineConfig,
    registry: Arc<TemplateRegistry>,
}

pub struct Cache {
    entries: AHashMap<String, CacheEntry>,
    max_size: usize,
    current_size: usize,
    // Performance counters
    hits: u64,
    misses: u64,
    evictions: u64,
}

Processing Flow

  1. Request: Data sent to fact.process(template_id, context)
  2. Cache Check: System checks for cached results
  3. Template Lookup: Finds template in registry
  4. Processing: Executes template steps sequentially
  5. Caching: Stores result for future requests
  6. Return: Processed data returned to caller
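The flow above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the crate's engine: `MiniEngine`, `StepFn`, the `double-then-sort` template, and the cache key format are all hypothetical, and plain `Vec<f64>` stands in for JSON values.

```rust
use std::collections::HashMap;

/// A step is a plain function over the working data (hypothetical type).
type StepFn = fn(Vec<f64>) -> Vec<f64>;

fn double(values: Vec<f64>) -> Vec<f64> {
    values.into_iter().map(|x| x * 2.0).collect()
}

fn sort_ascending(mut values: Vec<f64>) -> Vec<f64> {
    values.sort_by(|a, b| a.total_cmp(b));
    values
}

/// Hypothetical stand-in for the engine: a template registry plus a result cache.
pub struct MiniEngine {
    templates: HashMap<&'static str, Vec<StepFn>>,
    cache: HashMap<String, Vec<f64>>,
    pub hits: u64,
    pub misses: u64,
}

impl MiniEngine {
    pub fn new() -> Self {
        let mut templates: HashMap<&'static str, Vec<StepFn>> = HashMap::new();
        templates.insert(
            "double-then-sort",
            vec![double as StepFn, sort_ascending as StepFn],
        );
        Self { templates, cache: HashMap::new(), hits: 0, misses: 0 }
    }

    pub fn process(&mut self, template_id: &str, input: Vec<f64>) -> Result<Vec<f64>, String> {
        // Cache check: key on the template id plus the input's debug form.
        let key = format!("{template_id}:{input:?}");
        if let Some(cached) = self.cache.get(&key) {
            self.hits += 1;
            return Ok(cached.clone());
        }
        self.misses += 1;

        // Template lookup.
        let steps = self
            .templates
            .get(template_id)
            .ok_or_else(|| format!("template not found: {template_id}"))?;

        // Processing: run each step in order, feeding its output to the next.
        let mut data = input;
        for step in steps {
            data = step(data);
        }

        // Caching: store the result, then return it to the caller.
        self.cache.insert(key, data.clone());
        Ok(data)
    }
}
```

Calling `process` twice with the same template and input runs the steps once; the second call is served from the cache and counted as a hit.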

Library Usage

Processing with Templates

use fact_tools::Fact;
use serde_json::json;

let fact = Fact::new();

// Analysis template
let analysis_data = json!({
    "values": [10, 25, 30, 45, 50],
    "metadata": {
        "source": "sensor_data",
        "timestamp": "2024-01-01T12:00:00Z"
    }
});

let result = fact.process("analysis-basic", analysis_data).await?;

Pattern Detection

let pattern_data = json!({
    "query": "What patterns exist in the sales data?",
    "data": {
        "monthly_sales": [1000, 1200, 1100, 1400, 1600, 1500],
        "categories": ["electronics", "clothing", "food"]
    }
});

let patterns = fact.process("pattern-detection", pattern_data).await?;

Data Aggregation

let numbers = json!({
    "dataset": [5.5, 10.2, 15.7, 20.1, 25.9, 30.3],
    "operation": "statistical_summary"
});

let aggregated = fact.process("data-aggregation", numbers).await?;
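The guide does not show what a statistical summary contains, so the following is only a sketch of the kind of arithmetic such an operation might perform. `Summary` and `summarize` are hypothetical names, and the choice of population (not sample) standard deviation is an assumption.

```rust
/// Summary statistics for a slice of numbers (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct Summary {
    pub count: usize,
    pub mean: f64,
    /// Population standard deviation (divides by n, not n - 1).
    pub std_dev: f64,
    pub min: f64,
    pub max: f64,
}

/// Returns `None` for an empty slice, where the mean is undefined.
pub fn summarize(values: &[f64]) -> Option<Summary> {
    if values.is_empty() {
        return None;
    }
    let count = values.len();
    let mean = values.iter().sum::<f64>() / count as f64;
    let variance = values
        .iter()
        .map(|v| (v - mean).powi(2))
        .sum::<f64>()
        / count as f64;
    let min = values.iter().copied().fold(f64::INFINITY, f64::min);
    let max = values.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    Some(Summary { count, mean, std_dev: variance.sqrt(), min, max })
}
```

For the dataset in the example above, `summarize` yields a mean of about 17.95, a minimum of 5.5, and a maximum of 30.3.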

Error Handling

use fact_tools::{Result, FactError};

match fact.process("invalid-template", data).await {
    Ok(result) => println!("Success: {}", result),
    Err(FactError::TemplateNotFound(id)) => {
        eprintln!("Template not found: {}", id);
    },
    Err(FactError::ProcessingError(msg)) => {
        eprintln!("Processing failed: {}", msg);
    },
    Err(FactError::Timeout(duration)) => {
        eprintln!("Processing timed out after {:?}", duration);
    },
    Err(e) => eprintln!("Other error: {}", e),
}
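To show how an error type with these variants can work with `?` and `Box<dyn std::error::Error>`, here is a minimal std-only sketch. `PipelineError`, `lookup`, and `describe` are hypothetical; the variant names mirror the ones matched above, but the real `FactError` definition and its messages may differ.

```rust
use std::fmt;
use std::time::Duration;

/// Hypothetical mirror of the three variants matched above.
#[derive(Debug)]
pub enum PipelineError {
    TemplateNotFound(String),
    ProcessingError(String),
    Timeout(Duration),
}

impl fmt::Display for PipelineError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::TemplateNotFound(id) => write!(f, "template not found: {id}"),
            Self::ProcessingError(msg) => write!(f, "processing failed: {msg}"),
            Self::Timeout(after) => write!(f, "timed out after {after:?}"),
        }
    }
}

// Implementing Error lets `?` convert the enum into Box<dyn Error>.
impl std::error::Error for PipelineError {}

/// Hypothetical lookup that knows a single template.
pub fn lookup(template_id: &str) -> Result<&'static str, PipelineError> {
    match template_id {
        "analysis-basic" => Ok("Basic Analysis"),
        other => Err(PipelineError::TemplateNotFound(other.to_string())),
    }
}

pub fn describe(template_id: &str) -> Result<String, Box<dyn std::error::Error>> {
    let name = lookup(template_id)?; // PipelineError is boxed automatically here
    Ok(format!("{template_id}: {name}"))
}
```

Matching on specific variants, as in the example above, is useful when the caller can react differently to each failure; boxing is enough when the error is only logged or propagated.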

Cognitive Templates

Built-in Templates

FACT includes four pre-configured templates:

1. Analysis Basic (analysis-basic)

Statistical and pattern analysis with data expansion.

{
  "id": "analysis-basic",
  "name": "Basic Analysis",
  "steps": [
    {"name": "normalize", "operation": {"type": "transform", "transform": "normalize"}},
    {"name": "analyze", "operation": {"type": "analyze", "analysis": "statistical"}},
    {"name": "expand", "operation": {"type": "transform", "transform": "expand"}}
  ],
  "performance": {
    "avg_execution_time_ms": 50.0,
    "memory_usage_bytes": 1048576,
    "complexity": 3
  }
}

2. Pattern Detection (pattern-detection)

Detects patterns in structured data with semantic enrichment.

{
  "id": "pattern-detection", 
  "name": "Pattern Detection",
  "steps": [
    {"name": "normalize", "operation": {"type": "transform", "transform": "normalize"}},
    {"name": "pattern-analysis", "operation": {"type": "analyze", "analysis": "pattern"}},
    {"name": "semantic-enrichment", "operation": {"type": "analyze", "analysis": "semantic"}}
  ],
  "performance": {
    "avg_execution_time_ms": 75.0,
    "memory_usage_bytes": 2097152,
    "complexity": 5
  }
}

3. Data Aggregation (data-aggregation)

Numerical data aggregation with statistical operations.

4. Quick Transform (quick-transform)

Fast data transformation optimized for caching.

Template Operations

Templates support various operations:

Transform Operations:

  • Expand - Add metadata and timestamps
  • Compress - Remove internal fields
  • Normalize - Standardize data structure

Analysis Operations:

  • Statistical - Compute statistical metrics (mean, std dev, etc.)
  • Pattern - Detect structural patterns
  • Semantic - Extract entities and concepts

Filter Operations:

  • Type(String) - Filter by data type
  • Range{min, max} - Numerical range filtering
  • Custom(String) - Custom filter expressions

Aggregation Operations:

  • Sum - Sum numerical values
  • Average - Calculate averages
  • Count - Count elements
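As a rough illustration of the filter and aggregation operations listed above, the sketch below applies a numeric range filter and then an aggregation to a slice. `Range` and `Aggregation` here are local, hypothetical types; inclusive bounds and the `None` result for an empty average are assumptions, not documented behavior of the crate.

```rust
/// Hypothetical numeric range filter, inclusive on both ends.
pub struct Range {
    pub min: f64,
    pub max: f64,
}

impl Range {
    /// Keeps only the values inside [min, max].
    pub fn apply(&self, values: &[f64]) -> Vec<f64> {
        values
            .iter()
            .copied()
            .filter(|v| *v >= self.min && *v <= self.max)
            .collect()
    }
}

/// Hypothetical aggregation over numeric values.
pub enum Aggregation {
    Sum,
    Average,
    Count,
}

impl Aggregation {
    /// Returns `None` only for the average of an empty slice.
    pub fn apply(&self, values: &[f64]) -> Option<f64> {
        match self {
            Aggregation::Sum => Some(values.iter().sum()),
            Aggregation::Average => {
                if values.is_empty() {
                    None
                } else {
                    Some(values.iter().sum::<f64>() / values.len() as f64)
                }
            }
            Aggregation::Count => Some(values.len() as f64),
        }
    }
}
```

Chaining the two, for example filtering to a valid price range before averaging, keeps out-of-range readings from skewing the result.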

Working with Templates

use fact_tools::templates::TemplateRegistry;

let registry = TemplateRegistry::new();

// List all templates
let template_ids = registry.list();
println!("Available templates: {:?}", template_ids);

// Get specific template
if let Some(template) = registry.get("analysis-basic") {
    println!("Template: {} - {}", template.name, template.description);
    println!("Steps: {}", template.steps.len());
}

// Search by tags
let analysis_templates = registry.search_by_tags(&["analysis".to_string()]);
println!("Found {} analysis templates", analysis_templates.len());

// Get by performance
let fast_templates = registry.get_by_performance(3); // Complexity <= 3
println!("Fast templates: {}", fast_templates.len());
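To make the lookup semantics concrete, here is a small std-only registry sketch. `MiniRegistry` and `TemplateInfo` are hypothetical, the method names simply echo the calls above, and two behaviors are assumptions: a tag search matches when any requested tag is present, and the performance filter keeps templates whose complexity is at most the given value.

```rust
use std::collections::HashMap;

/// Hypothetical template metadata.
#[derive(Debug, Clone)]
pub struct TemplateInfo {
    pub id: String,
    pub tags: Vec<String>,
    pub complexity: u8,
}

/// Hypothetical registry keyed by template id.
#[derive(Default)]
pub struct MiniRegistry {
    templates: HashMap<String, TemplateInfo>,
}

impl MiniRegistry {
    pub fn register(&mut self, template: TemplateInfo) {
        self.templates.insert(template.id.clone(), template);
    }

    /// All template ids, sorted for stable output.
    pub fn list(&self) -> Vec<&str> {
        let mut ids: Vec<&str> = self.templates.keys().map(String::as_str).collect();
        ids.sort_unstable();
        ids
    }

    /// Templates carrying at least one of the requested tags.
    pub fn search_by_tags(&self, tags: &[&str]) -> Vec<&TemplateInfo> {
        let mut found: Vec<&TemplateInfo> = self
            .templates
            .values()
            .filter(|t| t.tags.iter().any(|tag| tags.contains(&tag.as_str())))
            .collect();
        found.sort_by(|a, b| a.id.cmp(&b.id));
        found
    }

    /// Templates whose complexity is at most `max_complexity`.
    pub fn get_by_performance(&self, max_complexity: u8) -> Vec<&TemplateInfo> {
        let mut found: Vec<&TemplateInfo> = self
            .templates
            .values()
            .filter(|t| t.complexity <= max_complexity)
            .collect();
        found.sort_by(|a, b| a.id.cmp(&b.id));
        found
    }
}
```

With the complexity values shown earlier (3 for `analysis-basic`, 5 for `pattern-detection`), a threshold of 3 would return only the first template.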

Caching System

Cache Architecture

The caching system uses a high-performance LRU cache with automatic eviction:

pub struct Cache {
    entries: AHashMap<String, CacheEntry>,
    max_size: usize,
    current_size: usize,
    hits: u64,
    misses: u64,
    evictions: u64,
}

struct CacheEntry {
    value: serde_json::Value,
    size: usize,
    created_at: Instant,
    last_accessed: Instant,
    access_count: u64,
}

Cache Operations

use fact_tools::Cache;

// Create cache with 50MB capacity
let mut cache = Cache::with_capacity(50 * 1024 * 1024);

// Store value
cache.put("key".to_string(), serde_json::json!({"data": "value"}));

// Retrieve value
if let Some(value) = cache.get("key") {
    println!("Cache hit: {}", value);
}

// Get statistics
let stats = cache.stats();
println!("Hit rate: {:.2}%", stats.hit_rate * 100.0);
println!("Entries: {}, Size: {} bytes", stats.entries, stats.size_bytes);

Thread-Safe Cache

For concurrent access, use the thread-safe wrapper:

use fact_tools::cache::ThreadSafeCache;

let cache = ThreadSafeCache::new();
cache.put("key".to_string(), serde_json::json!({"shared": "data"}));

// Safe for concurrent access
let value = cache.get("key");
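The internals of `ThreadSafeCache` are not shown in this guide. One common way to build such a wrapper, consistent with the `Arc<RwLock<Cache>>` field in the `Fact` struct above, is sketched below with the standard library. `SharedCache` and `fill_from_threads` are hypothetical, and a plain `HashMap<String, String>` stands in for the real cache.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical thread-safe wrapper: cloning shares the same underlying map.
#[derive(Clone, Default)]
pub struct SharedCache {
    inner: Arc<RwLock<HashMap<String, String>>>,
}

impl SharedCache {
    pub fn put(&self, key: String, value: String) {
        // Writers take the exclusive lock.
        self.inner.write().unwrap().insert(key, value);
    }

    pub fn get(&self, key: &str) -> Option<String> {
        // Readers share the lock and clone the value out.
        self.inner.read().unwrap().get(key).cloned()
    }

    pub fn len(&self) -> usize {
        self.inner.read().unwrap().len()
    }
}

/// Spawns `workers` threads that each insert one entry, then waits for them.
pub fn fill_from_threads(cache: &SharedCache, workers: usize) {
    let handles: Vec<_> = (0..workers)
        .map(|i| {
            let cache = cache.clone();
            thread::spawn(move || cache.put(format!("key-{i}"), format!("value-{i}")))
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
}
```

Note that an LRU cache usually updates access metadata on every read, so a real `get` may need the write lock; a read-only `get` as in this sketch only works when lookups do not mutate the entry.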

Cache Performance

The cache is optimized for high performance:

  • Fast lookups: O(1) hash table access
  • LRU eviction: Removes least recently used entries
  • Memory tracking: Accurate size estimation
  • Concurrent safe: Optional thread-safe wrapper

Typical performance characteristics:

  • Cache hit latency: < 25ms
  • Cache miss latency: < 100ms
  • Memory overhead: ~50 bytes per entry
  • Eviction performance: O(n) scan for LRU
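The combination described above (O(1) lookups, size tracking, and an O(n) scan to find the least recently used entry) can be sketched as follows. This is not the crate's `Cache`: `LruCache` is hypothetical, it stores strings instead of JSON values, estimates size as key length plus value length, and uses a logical counter instead of `Instant` so the behavior is deterministic.

```rust
use std::collections::HashMap;

struct Entry {
    value: String,
    size: usize,
    last_used: u64,
}

/// Hypothetical byte-budgeted LRU cache with an O(n) eviction scan.
pub struct LruCache {
    entries: HashMap<String, Entry>,
    max_size: usize,
    current_size: usize,
    clock: u64,
    pub hits: u64,
    pub misses: u64,
    pub evictions: u64,
}

impl LruCache {
    pub fn with_capacity(max_size: usize) -> Self {
        Self {
            entries: HashMap::new(),
            max_size,
            current_size: 0,
            clock: 0,
            hits: 0,
            misses: 0,
            evictions: 0,
        }
    }

    pub fn put(&mut self, key: &str, value: &str) {
        let size = key.len() + value.len();
        if size > self.max_size {
            return; // can never fit, so skip it instead of emptying the cache
        }
        // Replacing an existing key frees its old size first.
        if let Some(old) = self.entries.remove(key) {
            self.current_size -= old.size;
        }
        // Evict least recently used entries until the new one fits.
        while self.current_size + size > self.max_size {
            let oldest = self
                .entries
                .iter()
                .min_by_key(|(_, entry)| entry.last_used) // O(n) scan
                .map(|(k, _)| k.clone());
            match oldest {
                Some(k) => {
                    if let Some(removed) = self.entries.remove(&k) {
                        self.current_size -= removed.size;
                        self.evictions += 1;
                    }
                }
                None => break,
            }
        }
        self.clock += 1;
        let entry = Entry { value: value.to_string(), size, last_used: self.clock };
        self.entries.insert(key.to_string(), entry);
        self.current_size += size;
    }

    pub fn get(&mut self, key: &str) -> Option<String> {
        self.clock += 1;
        let now = self.clock;
        match self.entries.get_mut(key) {
            Some(entry) => {
                entry.last_used = now; // a read refreshes recency
                self.hits += 1;
                Some(entry.value.clone())
            }
            None => {
                self.misses += 1;
                None
            }
        }
    }

    /// hits / (hits + misses), or 0.0 before any lookup.
    pub fn hit_rate(&self) -> f64 {
        let total = self.hits + self.misses;
        if total == 0 { 0.0 } else { self.hits as f64 / total as f64 }
    }
}
```

The linear scan keeps the structure simple at the cost of slower evictions on large caches; an intrusive linked list or an ordered index would make eviction O(1) or O(log n) with more bookkeeping.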

Performance Features

Optimized Data Structures

FACT uses high-performance data structures:

# Key dependencies for performance
ahash = "0.8"        # Fast hashing
smallvec = "1.13"    # Stack-allocated vectors
parking_lot = "0.12" # Fast synchronization
dashmap = "5.5"      # Concurrent hash map
rayon = "1.8"        # Data parallelism

Async Processing

Built on Tokio for concurrent operations:

use futures::stream::{self, StreamExt};

// Process multiple items concurrently
let items = vec![data1, data2, data3, data4];
let results: Vec<_> = stream::iter(items)
    .map(|data| fact.process("analysis-basic", data))
    .buffer_unordered(4) // Process 4 items concurrently
    .collect()
    .await;

Memory Management

Efficient memory usage with automatic cleanup:

// Memory is automatically tracked and managed
let stats = cache.stats();
println!("Current memory usage: {} KB", stats.size_bytes / 1024);

// Automatic eviction when memory limit reached
// LRU entries are removed to make space

Benchmarking

For a quick measurement, time a single call with std::time::Instant:

use std::time::Instant;

// Time processing
let start = Instant::now();
let result = fact.process("quick-transform", data).await?;
let duration = start.elapsed();

println!("Processing took: {:?}", duration);
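A single timed call is noisy, so it often helps to average over many iterations. The helper below is a small std-only sketch; `Timing` and `time_iterations` are hypothetical names, not part of the crate.

```rust
use std::time::{Duration, Instant};

/// Result of a simple repeated timing run (hypothetical type).
pub struct Timing {
    pub iterations: u32,
    pub total: Duration,
    pub mean: Duration,
}

/// Runs `work` `iterations` times and reports total and mean wall-clock time.
///
/// Panics if `iterations` is zero, since the mean would be undefined.
pub fn time_iterations<F: FnMut()>(iterations: u32, mut work: F) -> Timing {
    assert!(iterations > 0, "iterations must be at least 1");
    let start = Instant::now();
    for _ in 0..iterations {
        work();
    }
    let total = start.elapsed();
    Timing { iterations, total, mean: total / iterations }
}
```

For statistically sound results (warm-up, outlier detection, confidence intervals), a dedicated harness such as Criterion is the better tool; this helper is only meant for rough comparisons.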

CLI Usage

Basic Commands

# Initialize FACT configuration
fact-tools init

# Process data with template
fact-tools process --template analysis-basic --input data.json

# List available templates
fact-tools templates --detailed

# Show cache statistics
fact-tools cache

# Run performance benchmark
fact-tools benchmark --iterations 1000

Process Command

# Process JSON string directly
fact-tools process --template analysis-basic --input '{"data": [1,2,3,4,5]}'

# Process from file
fact-tools process --template pattern-detection --input data.json --output result.json

# Disable caching
fact-tools process --template quick-transform --input data.json --no-cache

Templates Command

# List all templates
fact-tools templates

# Show detailed template information
fact-tools templates --detailed

# Filter by tag
fact-tools templates --tag analysis --detailed

Benchmark Command

# Benchmark default template (quick-transform)
fact-tools benchmark --iterations 100

# Benchmark specific template
fact-tools benchmark --template analysis-basic --iterations 1000

# Verbose output
fact-tools --verbose benchmark --iterations 500

Configuration

Create configuration file:

fact-tools init

This creates fact.json:

{
  "engine_config": {
    "timeout": "30s",
    "parallel": true,
    "max_concurrent": 8,
    "monitoring": true
  },
  "cache_size": 104857600,
  "enable_monitoring": true,
  "timeout": "30s"
}
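The `"30s"` values suggest that timeouts are stored as human-readable duration strings; how the crate actually parses them is not shown here. As an illustration only, a parser for a small subset of such strings might look like the sketch below. `parse_duration` is hypothetical, and the supported suffixes (`ms`, `s`, `m`) are an assumption.

```rust
use std::time::Duration;

/// Parses strings such as "250ms", "30s", or "2m" (hypothetical helper).
/// Returns `None` for anything else.
pub fn parse_duration(text: &str) -> Option<Duration> {
    let text = text.trim();
    // Check "ms" first, because it also ends in 's'.
    if let Some(number) = text.strip_suffix("ms") {
        number.trim().parse::<u64>().ok().map(Duration::from_millis)
    } else if let Some(number) = text.strip_suffix('s') {
        number.trim().parse::<u64>().ok().map(Duration::from_secs)
    } else if let Some(number) = text.strip_suffix('m') {
        let minutes = number.trim().parse::<u64>().ok()?;
        minutes.checked_mul(60).map(Duration::from_secs)
    } else {
        None
    }
}
```

The `cache_size` value of 104857600 in the same file is 100 * 1024 * 1024 bytes, that is, 100 MiB.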

Use custom configuration:

fact-tools --config custom.json process --template analysis-basic --input data.json

Building from Source

Prerequisites

  • Rust 1.70+ (2021 edition)
  • Cargo

Clone and Build

git clone https://github.com/ruvnet/FACT.git
cd FACT/cargo-crate
cargo build --release

Run Tests

# Run all tests
cargo test

# Run specific module tests
cargo test --lib cache
cargo test --lib templates

# Run with output
cargo test -- --nocapture

Development Build

# Debug build with all features
cargo build --all-features

# Release build optimized
cargo build --release --all-features

Benchmarks

# Run Criterion benchmarks
cargo bench

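# Criterion writes its HTML reports to target/criterion/report/index.html
# (--output-format only switches the terminal output: criterion or bencher)
cargo bench -- --output-format bencher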

Examples

# Run basic example
cargo run --example basic

# Run with features
cargo run --example basic --features full

Integration Examples

Web Server Integration

use std::sync::Arc;

use axum::{
    extract::{Json, State},
    http::StatusCode,
    response::Json as ResponseJson,
    routing::post,
    Router,
};
use fact_tools::Fact;
use serde_json::Value;

#[tokio::main]
async fn main() {
    // Share one Fact instance across handlers; Arc makes the state cheap to clone
    let fact = Arc::new(Fact::new());
    
    let app = Router::new()
        .route("/analyze", post(analyze_handler))
        .with_state(fact);
    
    // axum 0.7+: bind a Tokio listener and pass it to axum::serve
    // (axum::Server was removed in 0.7)
    let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await.unwrap();
    axum::serve(listener, app).await.unwrap();
}

async fn analyze_handler(
    State(fact): State<Arc<Fact>>,
    Json(data): Json<Value>,
) -> Result<ResponseJson<Value>, StatusCode> {
    match fact.process("analysis-basic", data).await {
        Ok(result) => Ok(ResponseJson(result)),
        Err(_) => Err(StatusCode::INTERNAL_SERVER_ERROR),
    }
}

Database Integration

use fact_tools::Fact;
use serde_json::Value;
// Decoding a JSON column into Value requires sqlx's `json` feature
use sqlx::{PgPool, Row};

async fn process_database_records(pool: &PgPool, fact: &Fact) -> Result<Vec<Value>, sqlx::Error> {
    let rows = sqlx::query("SELECT data FROM analytics_data WHERE processed = false")
        .fetch_all(pool)
        .await?;
    
    let mut results = Vec::new();
    
    for row in rows {
        let data: Value = row.get("data");
        if let Ok(processed) = fact.process("analysis-basic", data).await {
            results.push(processed);
        }
    }
    
    Ok(results)
}

Message Queue Integration

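// Assumes the lapin 2.x AMQP client; queue names are illustrative
use fact_tools::Fact;
use futures::StreamExt;
use lapin::{options::*, types::FieldTable, BasicProperties, Connection, ConnectionProperties};

async fn process_queue_messages(fact: Fact) -> Result<(), Box<dyn std::error::Error>> {
    let connection =
        Connection::connect("amqp://localhost", ConnectionProperties::default()).await?;
    let channel = connection.create_channel().await?;
    
    let mut consumer = channel
        .basic_consume(
            "processing_queue",
            "fact-consumer",
            BasicConsumeOptions::default(),
            FieldTable::default(),
        )
        .await?;
    
    while let Some(delivery) = consumer.next().await {
        let delivery = delivery?;
        let data: serde_json::Value = serde_json::from_slice(&delivery.data)?;
        
        match fact.process("pattern-detection", data).await {
            Ok(result) => {
                // Publish to the results queue through the default exchange
                let payload = serde_json::to_vec(&result)?;
                channel
                    .basic_publish(
                        "",
                        "results_queue",
                        BasicPublishOptions::default(),
                        &payload,
                        BasicProperties::default(),
                    )
                    .await?;
                delivery.ack(BasicAckOptions::default()).await?;
            }
            Err(e) => {
                eprintln!("Processing error: {}", e);
                // Default nack options do not requeue the message
                delivery.nack(BasicNackOptions::default()).await?;
            }
        }
    }
    
    Ok(())
}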

AI/ML Pipeline Integration

use candle_core::Device;
use fact_tools::Fact;
use serde_json::Value;

struct MLPipeline {
    fact: Fact,
    device: Device,
}

impl MLPipeline {
    async fn preprocess(&self, raw_data: Value) -> Result<Value, Box<dyn std::error::Error>> {
        // Use FACT for preprocessing
        let preprocessed = self.fact.process("pattern-detection", raw_data).await?;
        Ok(preprocessed)
    }
    
    async fn postprocess(&self, model_output: Value) -> Result<Value, Box<dyn std::error::Error>> {
        // Use FACT for postprocessing
        let postprocessed = self.fact.process("data-aggregation", model_output).await?;
        Ok(postprocessed)
    }
    
    async fn full_pipeline(&self, input: Value) -> Result<Value, Box<dyn std::error::Error>> {
        // Preprocess with FACT
        let preprocessed = self.preprocess(input).await?;
        
        // Run ML model (simplified)
        let model_output = self.run_model(preprocessed).await?;
        
        // Postprocess with FACT
        let final_result = self.postprocess(model_output).await?;
        
        Ok(final_result)
    }
    
    async fn run_model(&self, _input: Value) -> Result<Value, Box<dyn std::error::Error>> {
        // Placeholder for actual ML model inference
        Ok(serde_json::json!({"prediction": "example", "confidence": 0.95}))
    }
}

Advanced Configuration

Custom Engine Configuration

use fact_tools::{Fact, FactConfig, engine::EngineConfig};
use std::time::Duration;

let engine_config = EngineConfig {
    timeout: Duration::from_secs(120),
    parallel: true,
    max_concurrent: 16,
    monitoring: true,
};

let config = FactConfig {
    engine_config,
    cache_size: 500 * 1024 * 1024, // 500MB
    enable_monitoring: true,
    timeout: Some(Duration::from_secs(60)),
};

let fact = Fact::with_config(config);

Processing Options

use fact_tools::engine::{ProcessingOptions, Priority};
use std::time::Duration;

let options = ProcessingOptions {
    timeout: Some(Duration::from_secs(10)),
    no_cache: false,
    priority: Priority::High,
};

let result = fact.process_with_options("analysis-basic", data, options).await?;

Custom Query Processor

use fact_tools::processor::{QueryProcessor, ProcessingStrategy, StrategyHandler};

let mut processor = QueryProcessor::new();

// Register custom strategy
processor.register_strategy(
    "custom-analysis".to_string(),
    ProcessingStrategy {
        name: "custom-analysis".to_string(),
        pattern: r"analyze|examine".to_string(),
        handler: StrategyHandler::Custom(|query| {
            serde_json::json!({
                "query": query,
                "type": "custom",
                "result": "Custom analysis performed"
            })
        }),
    },
);

let result = processor.process("Please analyze this data");
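How the processor matches a query against a strategy's pattern is not documented here. The sketch below shows one simple interpretation: treat the pattern as `|`-separated keywords and pick the first strategy whose keyword appears in the query. `MiniRouter`, `Strategy`, and `custom_analysis` are hypothetical, and plain substring matching is used in place of a regular expression engine.

```rust
/// Hypothetical strategy: a name, a `|`-separated keyword pattern, and a handler.
pub struct Strategy {
    pub name: &'static str,
    pub pattern: &'static str,
    pub handler: fn(&str) -> String,
}

/// Hypothetical router that tries strategies in registration order.
#[derive(Default)]
pub struct MiniRouter {
    strategies: Vec<Strategy>,
}

impl MiniRouter {
    pub fn register(&mut self, strategy: Strategy) {
        self.strategies.push(strategy);
    }

    /// Returns the matching strategy's name and its handler output, or `None`
    /// when no keyword occurs in the query (case-insensitive substring match).
    pub fn process(&self, query: &str) -> Option<(&'static str, String)> {
        let lowered = query.to_lowercase();
        self.strategies
            .iter()
            .find(|s| {
                s.pattern
                    .split('|')
                    .any(|keyword| lowered.contains(keyword.to_lowercase().as_str()))
            })
            .map(|s| (s.name, (s.handler)(query)))
    }
}

/// Example handler used for illustration.
pub fn custom_analysis(query: &str) -> String {
    format!("custom analysis of: {query}")
}
```

Registration order matters in this sketch: when several strategies match, the first one registered wins.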

Custom Templates

Template Builder

use fact_tools::templates::{
    Analysis, Operation, ProcessingStep, TemplateBuilder, TemplateRegistry, Transform,
};

let template = TemplateBuilder::new("custom-analytics")
    .name("Custom Analytics Template")
    .description("Performs custom data analytics")
    .add_tag("analytics")
    .add_tag("custom")
    .add_step(ProcessingStep {
        name: "normalize-data".to_string(),
        operation: Operation::Transform(Transform::Normalize),
    })
    .add_step(ProcessingStep {
        name: "statistical-analysis".to_string(),
        operation: Operation::Analyze(Analysis::Statistical),
    })
    .add_step(ProcessingStep {
        name: "pattern-detection".to_string(),
        operation: Operation::Analyze(Analysis::Pattern),
    })
    .build();

// Register with engine
let registry = TemplateRegistry::new();
registry.register(template);

Complex Template Example

use fact_tools::engine::{Filter, Aggregation};

let advanced_template = TemplateBuilder::new("market-analysis")
    .name("Market Analysis Template")
    .description("Comprehensive market data analysis")
    .add_tag("finance")
    .add_tag("market")
    .add_tag("analytics")
    // Step 1: Filter valid price data
    .add_step(ProcessingStep {
        name: "filter-prices".to_string(),
        operation: Operation::Filter(Filter::Range { min: 0.0, max: 100000.0 }),
    })
    // Step 2: Statistical analysis
    .add_step(ProcessingStep {
        name: "price-statistics".to_string(),
        operation: Operation::Analyze(Analysis::Statistical),
    })
    // Step 3: Pattern detection
    .add_step(ProcessingStep {
        name: "trend-patterns".to_string(),
        operation: Operation::Analyze(Analysis::Pattern),
    })
    // Step 4: Data aggregation
    .add_step(ProcessingStep {
        name: "aggregate-metrics".to_string(),
        operation: Operation::Aggregate(Aggregation::Average),
    })
    // Step 5: Final transformation
    .add_step(ProcessingStep {
        name: "expand-results".to_string(),
        operation: Operation::Transform(Transform::Expand),
    })
    .build();

Template Serialization

Templates can be saved and loaded as JSON:

// Save template to file
let template = TemplateBuilder::new("my-template").build();
let json = serde_json::to_string_pretty(&template)?;
std::fs::write("my_template.json", json)?;

// Load template from file
let json = std::fs::read_to_string("my_template.json")?;
let template: Template = serde_json::from_str(&json)?;

// Register loaded template
registry.register(template);

Performance Optimization

Benchmarking

use criterion::{criterion_group, criterion_main, Criterion};
use fact_tools::Fact;

fn benchmark_processing(c: &mut Criterion) {
    let rt = tokio::runtime::Runtime::new().unwrap();
    let fact = Fact::new();
    let data = serde_json::json!({"values": (0..1000).collect::<Vec<_>>()});
    
    c.bench_function("analysis-basic", |b| {
        b.iter(|| {
            rt.block_on(fact.process("analysis-basic", data.clone()))
        })
    });
}

criterion_group!(benches, benchmark_processing);
criterion_main!(benches);

Memory Optimization

// Configure cache size based on available memory
let available_memory = sys_info::mem_info().unwrap().total * 1024;
let cache_size = (available_memory / 4) as usize; // Use 25% of system memory

let config = FactConfig {
    cache_size,
    ..Default::default()
};

let fact = Fact::with_config(config);
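Taking a fixed share of system memory can produce a cache that is too small on constrained machines or needlessly large on big ones. A bounded variant of the calculation above might look like this; `cache_budget_bytes` and its floor and ceiling parameters are hypothetical.

```rust
/// Computes a cache budget in bytes from total memory reported in KiB.
///
/// `divisor` selects the share (4 means one quarter, as in the example above);
/// the result is clamped to `[floor, ceiling]`.
/// Panics if `floor > ceiling`, which is how `clamp` behaves.
pub fn cache_budget_bytes(total_kib: u64, divisor: u64, floor: u64, ceiling: u64) -> u64 {
    // Saturate instead of overflowing on absurdly large inputs.
    let total_bytes = total_kib.saturating_mul(1024);
    // Treat a zero divisor as 1 to avoid division by zero.
    let share = total_bytes / divisor.max(1);
    share.clamp(floor, ceiling)
}
```

The result is a `u64`; converting it with `usize::try_from` rather than `as usize` avoids silent truncation on 32-bit targets.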

Concurrent Processing

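use rayon::prelude::*;
use tokio::runtime::Handle;

// Capture the runtime handle on a Tokio thread first: Handle::current()
// panics on Rayon worker threads, which run outside the runtime context.
// From async code, run this whole block inside tokio::task::spawn_blocking.
let handle = Handle::current();

// Process multiple items in parallel
let items: Vec<Value> = load_data_items();
let results: Vec<_> = items
    .par_iter()
    .map(|item| {
        // Each item gets its own FACT instance, so results are not shared
        // through a common cache
        let fact = Fact::new();
        handle.block_on(fact.process("analysis-basic", item.clone()))
    })
    .collect();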

Release Profile Optimization

Build with optimizations for production:

[profile.release]
opt-level = 3
lto = "fat"
codegen-units = 1
panic = "abort"
strip = true

Runtime Tuning

// Tune for specific workloads
let config = EngineConfig {
    timeout: Duration::from_secs(5),    // Short timeout for latency
    parallel: true,                     // Enable parallelism
    max_concurrent: num_cpus::get() * 2, // 2x thread pool
    monitoring: false,                  // Disable for production
};
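If adding the `num_cpus` dependency is undesirable, the standard library offers `std::thread::available_parallelism` (stable since Rust 1.59). The helper below is a sketch of deriving a concurrency limit from it; `suggested_max_concurrent` is a hypothetical name.

```rust
use std::thread;

/// Suggests a concurrency limit as `multiplier` times the available parallelism.
///
/// Falls back to 1 when the parallelism cannot be determined, and treats a
/// multiplier of 0 as 1 so the result is never zero.
pub fn suggested_max_concurrent(multiplier: usize) -> usize {
    let parallelism = thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1);
    parallelism.saturating_mul(multiplier.max(1))
}
```

The reported value reflects what the process may use (for example under CPU affinity or container limits on some platforms), which can be lower than the machine's physical core count.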

This comprehensive guide covers all aspects of using FACT's Rust implementation, from basic usage to advanced optimization techniques. The library provides a robust foundation for high-performance context processing in AI applications.
