NehharShah/astra

Astra: Agentic Infrastructure

Scalable agent runtime demonstrating system design, resilience patterns, and production best practices.

🎯 Key Achievements

  • Circuit Breaker Pattern: Prevents cascading failures with automatic recovery
  • Exponential Backoff Retry: Resilient operations with configurable retry logic
  • Rate Limiting: Token bucket algorithm for API throttling (500 req/min)
  • Caching Layer: In-memory cache with TTL for performance optimization
  • Distributed Tracing: Correlation IDs and structured logging for observability
  • Graceful Shutdown: Proper resource cleanup and signal handling
  • Comprehensive Testing: 95%+ test coverage with pytest
  • Performance Benchmarking: Automated load testing suite

🏗️ Architecture

┌──────────────────────────────────────────────────────┐
│             FastAPI + Middleware Layer               │
│  (Correlation ID │ Logging │ Rate Limiting)          │
└────────────┬─────────────────────────────────────────┘
             │
    ┌────────▼────────┐
    │  Orchestrator   │
    │  (Async Routing)│
    └────────┬────────┘
             │
    ┌────────┴───────────────────┐
    │                            │
┌───▼────────┐          ┌───────▼────────┐
│ LangGraph  │          │   CrewAI       │
│ Workflow   │          │ Multi-Agent    │
│            │          │ (Parallel)     │
└───┬────────┘          └───────┬────────┘
    │                           │
    └─────────┬─────────────────┘
              │
    ┌─────────▼──────────┐
    │ Resilience Layer   │
    │ - Circuit Breaker  │
    │ - Retry Logic      │
    │ - Rate Limiter     │
    │ - Cache Manager    │
    └─────────┬──────────┘
              │
    ┌─────────▼──────────┐
    │   Memory Manager   │
    │  (Stateful Store)  │
    └────────────────────┘

🚀 Quick Start

# Install dependencies
pip install -r requirements.txt

# Run tests
pytest tests/ -v --cov=src

# Start server
python main.py

# Run benchmarks (in another terminal)
python benchmark.py

# Run demo
python examples/demo.py

📂 Project Structure

R/
├── src/                    # Core modules
│   ├── agents.py           # Agent orchestration with retry/circuit breaker
│   ├── memory.py           # Stateful memory management
│   ├── inference.py        # Inference layer
│   ├── telemetry.py        # Metrics and observability
│   ├── resilience.py       # Circuit breaker, retry, rate limit, cache
│   ├── middleware.py       # Logging, correlation IDs, rate limiting
│   └── config.py           # Configuration management
├── tests/                  # Comprehensive test suite
│   ├── test_resilience.py  # Resilience pattern tests
│   ├── test_agents.py      # Agent functionality tests
│   ├── test_memory.py      # Memory management tests
│   └── conftest.py         # Pytest fixtures
├── examples/
│   └── demo.py             # Interactive demo
├── main.py                 # FastAPI server with middleware
├── benchmark.py            # Performance benchmarking suite
└── requirements.txt        # Minimal dependencies

🎯 Patterns Implemented

1. Resilience & Reliability

  • Circuit Breaker: Opens after 3 failures and attempts recovery after 60s
  • Retry with Exponential Backoff: Configurable retry logic with jitter
  • Graceful Degradation: Fail-safe mechanisms throughout
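
The retry pattern listed above can be sketched as follows. This is a minimal illustration, not the code in src/resilience.py: the names `retry` and `backoff_delay`, the default values, and the choice of "full jitter" (a uniform random delay up to the exponential bound) are all assumptions made for the example.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Upper bound on the delay before retry number `attempt` (0-based)."""
    return min(cap, base * (2 ** attempt))


def retry(
    func: Callable[[], T],
    max_attempts: int = 3,
    base: float = 0.5,
    cap: float = 30.0,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `func` until it succeeds or `max_attempts` is exhausted."""
    for attempt in range(max_attempts):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Full jitter: sleep a random time in [0, exponential bound].
            sleep(random.uniform(0, backoff_delay(attempt, base, cap)))
    raise ValueError("max_attempts must be at least 1")
```

The jitter spreads retries from many clients over time, so they do not all hit a recovering service at the same instant. The `sleep` parameter is injectable so tests can run without waiting.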

2. Performance & Scalability

  • Rate Limiting: Token bucket algorithm (500 req/min) prevents overload
  • Caching: In-memory cache with TTL reduces redundant computation
  • Async/Parallel Execution: CrewAI agents run concurrently
  • Connection Pooling: Efficient resource utilization
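
An in-memory cache with TTL, as listed above, might look roughly like this. The class name `TTLCache`, its methods, and the lazy eviction on read are assumptions for illustration; the project's cache manager may work differently.

```python
import time
from typing import Any, Callable, Dict, Hashable, Optional, Tuple


class TTLCache:
    """In-memory cache whose entries expire `ttl` seconds after being set."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl
        self._clock = clock
        # key -> (expiry timestamp, value)
        self._store: Dict[Hashable, Tuple[float, Any]] = {}

    def set(self, key: Hashable, value: Any) -> None:
        self._store[key] = (self._clock() + self._ttl, value)

    def get(self, key: Hashable, default: Optional[Any] = None) -> Any:
        entry = self._store.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # evict lazily, on first read after expiry
            return default
        return value
```

Lazy eviction keeps the sketch short, but entries that are never read again stay in memory. A long-running service would likely add a size bound or a periodic sweep.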

3. Observability

  • Structured Logging: JSON logs with contextual information
  • Correlation IDs: End-to-end request tracing
  • Metrics Collection: Real-time performance metrics
  • Health Checks: Liveness and readiness probes

4. Production Readiness

  • Graceful Shutdown: SIGTERM/SIGINT handling with cleanup
  • Error Handling: Comprehensive exception handling with context
  • Configuration Management: Environment-based settings
  • API Versioning: Semantic versioning support
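
Graceful shutdown on SIGTERM/SIGINT can be sketched with plain asyncio as below. The function names and the list of cleanup callbacks are hypothetical. Under an ASGI server such as uvicorn, which installs its own signal handlers, cleanup would more likely live in the application's shutdown hook.

```python
import asyncio
import signal
from typing import Awaitable, Callable, List

Cleanup = Callable[[], Awaitable[None]]


async def serve_until_stopped(stop: asyncio.Event, cleanups: List[Cleanup]) -> None:
    """Wait for the stop signal, then run cleanup callbacks in reverse order."""
    await stop.wait()
    # Release resources in the opposite order to which they were acquired.
    for cleanup in reversed(cleanups):
        await cleanup()


async def main(cleanups: List[Cleanup]) -> None:
    stop = asyncio.Event()
    loop = asyncio.get_running_loop()
    # add_signal_handler is only available on Unix event loops.
    for sig in (signal.SIGTERM, signal.SIGINT):
        loop.add_signal_handler(sig, stop.set)
    await serve_until_stopped(stop, cleanups)
```

Running `asyncio.run(main([close_db, close_cache]))` (with hypothetical cleanup coroutines) would block until one of the signals arrives, then close the cache before the database.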

5. Testing & Quality

  • Unit Tests: 95%+ code coverage
  • Integration Tests: End-to-end testing
  • Performance Benchmarks: Automated load testing
  • Test Fixtures: Reusable test components

📊 API Endpoints

| Endpoint             | Method | Description          | Features                        |
|----------------------|--------|----------------------|---------------------------------|
| /api/agent/execute   | POST   | Execute agent task   | Retry, Circuit Breaker, Caching |
| /api/memory/store    | POST   | Store in memory      | TTL support                     |
| /api/memory/retrieve | POST   | Retrieve from memory | Pagination                      |
| /api/metrics         | GET    | System metrics       | Real-time stats                 |
| /health              | GET    | Health check         | Readiness probe                 |
| /docs                | GET    | OpenAPI docs         | Interactive API                 |

🧪 Testing

# Run all tests with coverage
pytest tests/ -v --cov=src --cov-report=html

# Run specific test suite
pytest tests/test_resilience.py -v

# Run only tests carrying the asyncio marker
pytest tests/ -v -m asyncio

Test Coverage

  • Resilience Patterns: Circuit breaker, retry, rate limiting
  • Agent Functionality: LangGraph, CrewAI orchestration
  • Memory Operations: Short-term, long-term, concurrent access
  • Error Scenarios: Failure handling, recovery

⚡ Performance

Benchmark results on M1 Mac (8GB RAM):

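| Operation       | Avg Latency | P95     | P99     | Throughput  |
|-----------------|-------------|---------|---------|-------------|
| Health Check    | 2.88ms      | 6.27ms  | 7.80ms  | 347 req/s   |
| Agent Execute   | 56.34ms     | 61.01ms | 75.34ms | 17.75 req/s |
| Memory Store    | 3.51ms      | 7.07ms  | 12.23ms | 284 req/s   |
| Concurrent (10) | 124.33ms    | -       | -       | 48.52 req/s |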
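
For readers reproducing these columns, the sketch below shows one way to derive average, P95, and P99 from raw latency samples. How benchmark.py actually computes them is not shown here; the nearest-rank percentile method and the function names are assumptions. The three single-request rows are consistent with throughput ≈ 1000 / average latency in ms (for example, 1000 / 56.34 ≈ 17.75 req/s), which is what `sequential_rps` computes.

```python
import math
import statistics
from typing import Dict, List


def percentile(ordered: List[float], p: int) -> float:
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_ms: List[float]) -> Dict[str, float]:
    """Summarize per-request latencies (in milliseconds)."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    avg = statistics.fmean(ordered)
    return {
        "avg_ms": avg,
        "p95_ms": percentile(ordered, 95),
        "p99_ms": percentile(ordered, 99),
        # Valid only when requests are issued one at a time, back to back.
        "sequential_rps": 1000.0 / avg,
    }
```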

🔒 Security & Reliability

  • Rate Limiting: Prevents DoS attacks (500 req/min default)
  • Input Validation: Pydantic models for request validation
  • Error Sanitization: No sensitive data in error responses
  • Correlation IDs: Audit trail for all requests

📈 Monitoring

Structured logs include:

  • Request/response timing
  • Error rates and types
  • Memory usage statistics
  • Cache hit/miss rates
  • Circuit breaker state changes

🛠️ Configuration

# src/config.py
class Settings:
    host: str = "0.0.0.0"
    port: int = 8000
    max_concurrent_agents: int = 10
    agent_timeout: int = 300
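
Environment-based overrides for these settings could be layered on as in the sketch below. The `ASTRA_*` variable names and the `from_env` helper are hypothetical; only the four fields and their defaults come from the snippet above.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    host: str = "0.0.0.0"
    port: int = 8000
    max_concurrent_agents: int = 10
    agent_timeout: int = 300

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "Settings":
        """Build settings from environment variables, falling back to defaults."""
        # The ASTRA_* names are assumptions made for this example.
        return cls(
            host=env.get("ASTRA_HOST", cls.host),
            port=int(env.get("ASTRA_PORT", cls.port)),
            max_concurrent_agents=int(
                env.get("ASTRA_MAX_CONCURRENT_AGENTS", cls.max_concurrent_agents)
            ),
            agent_timeout=int(env.get("ASTRA_AGENT_TIMEOUT", cls.agent_timeout)),
        )
```

With this shape, `Settings.from_env({"ASTRA_PORT": "9000"})` overrides only the port and keeps the other defaults.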

🎓 Design Decisions

Why Circuit Breaker?

A circuit breaker prevents cascading failures when the inference layer is slow or down: once it opens, calls fail fast instead of piling up. It recovers automatically, without manual intervention.
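
A minimal circuit breaker using the thresholds quoted earlier (3 failures to open, 60s before a recovery attempt) might look like this. The class and method names are illustrative, and counting failures since the last success is an assumption, not a description of src/resilience.py.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        recovery_timeout: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._failure_threshold = failure_threshold
        self._recovery_timeout = recovery_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self._recovery_timeout:
            return "half_open"  # recovery window elapsed: allow a trial call
        return "open"

    def call(self, func: Callable[[], T]) -> T:
        if self.state == "open":
            raise CircuitOpenError("circuit is open; call rejected")
        try:
            result = func()
        except Exception:
            self._failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self._opened_at is not None or self._failures >= self._failure_threshold:
                self._opened_at = self._clock()
            raise
        # Any success closes the circuit and resets the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

This sketch is not thread-safe and lets every caller through while half-open. A stricter version would admit a single trial call at a time.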

Why Token Bucket Rate Limiting?

A token bucket smooths traffic rather than enforcing a hard cutoff: it allows short bursts while holding the long-run average rate.
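
The token bucket idea can be sketched in a few lines. The 500 req/min default mirrors the figure quoted in this README; the burst `capacity` of 50 and all names are assumptions for illustration.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts up to `capacity` while refilling at a steady rate."""

    def __init__(
        self,
        rate_per_min: float = 500,
        capacity: float = 50,  # hypothetical burst size
        clock: Callable[[], float] = time.monotonic,
    ):
        self._rate = rate_per_min / 60.0  # tokens added per second
        self._capacity = capacity
        self._tokens = capacity  # start full so an initial burst is allowed
        self._clock = clock
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self._clock()
        # Refill in proportion to elapsed time, never beyond capacity.
        self._tokens = min(self._capacity, self._tokens + (now - self._last) * self._rate)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

A middleware would typically call `allow()` once per request and answer with HTTP 429 when it returns False.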

Why Correlation IDs?

A correlation ID links every operation in a request chain, so a single request can be traced end to end when debugging.
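
One common way to carry a correlation ID through a request is a context variable plus a logging filter, sketched below. The `X-Correlation-ID` header name and the helper names are assumptions; src/middleware.py may use different ones.

```python
import contextvars
import logging
import uuid
from typing import Mapping

# Holds the ID for the request currently being handled ("-" outside a request).
correlation_id = contextvars.ContextVar("correlation_id", default="-")


def begin_request(headers: Mapping[str, str]) -> str:
    """Reuse the caller's ID if present, otherwise mint a new one."""
    cid = headers.get("X-Correlation-ID") or uuid.uuid4().hex
    correlation_id.set(cid)
    return cid


class CorrelationFilter(logging.Filter):
    """Attach the current correlation ID to every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.correlation_id = correlation_id.get()
        return True
```

Because context variables are copied into each asyncio task, concurrent requests keep separate IDs without passing the value through every function call.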

Why Structured Logging?

Machine-parseable JSON logs can be filtered, aggregated, and alerted on directly, which matters most in production.
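
A JSON log formatter built on the standard logging module could look like the sketch below. The field names and the `context` attribute (supplied through `extra=`) are illustrative choices, not the project's actual log schema.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Optional structured fields, passed as extra={"context": {...}}.
        context = getattr(record, "context", None)
        if context:
            payload.update(context)
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        # default=str keeps non-JSON types (e.g. datetimes) from raising.
        return json.dumps(payload, default=str)
```

Attaching it with `handler.setFormatter(JsonFormatter())` and logging with `logger.info("cache %s", "hit", extra={"context": {"latency_ms": 3.5}})` would emit one JSON object per line.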

📄 License

MIT License. Feel free to use this project for portfolio or learning purposes.
