openshift-online/rosa-log-router


Multi-Tenant Logging Pipeline

A proof-of-concept implementation of a scalable, cost-effective multi-tenant logging pipeline on AWS, built around a "Centralized Ingestion, Decentralized Delivery" architecture.

🚀 What This Does

  • Collects logs from Kubernetes/OpenShift clusters using Vector agents
  • Stores logs centrally in S3 with intelligent compression and partitioning
  • Delivers logs to multiple customer AWS accounts simultaneously
  • Supports multiple delivery types per tenant (CloudWatch Logs + S3)
  • Reduces costs by ~90% compared to direct CloudWatch Logs ingestion

πŸ—οΈ Architecture Overview

```mermaid
graph LR
    K8s[Kubernetes Clusters] --> Vector[Vector Agents]
    Vector --> S3[Central S3 Storage]
    S3 --> SNS[Event Processing]
    SNS --> Lambda[Log Processor]
    Lambda --> CW1[Customer 1<br/>CloudWatch Logs]
    Lambda --> CW2[Customer 2<br/>CloudWatch Logs]
    Lambda --> S3_1[Customer 1<br/>S3 Bucket]
    Lambda --> S3_2[Customer 2<br/>S3 Bucket]
```

Key Benefits:

  • Multi-Delivery: Each tenant can receive logs via CloudWatch Logs AND S3 simultaneously
  • Direct S3 Writes: Eliminates Kinesis Firehose costs (~$50/TB saved)
  • Cross-Account Security: Secure delivery using IAM role assumption
  • Container-Based Processing: Modern Lambda functions using ECR containers
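Direct S3 writes depend on a partitioned key layout so downstream consumers can prune by tenant and time range. The repository's actual layout is defined by the Vector sink configuration; the sketch below is illustrative only (key pattern and field names are hypothetical, not taken from this project):

```python
from datetime import datetime, timezone

def build_s3_key(tenant_id: str, cluster_id: str, namespace: str,
                 ts: datetime, seq: int) -> str:
    """Build a partitioned object key (illustrative layout only)."""
    return (
        f"{tenant_id}/cluster={cluster_id}/namespace={namespace}/"
        f"year={ts.year}/month={ts.month:02d}/day={ts.day:02d}/"
        f"hour={ts.hour:02d}/{seq:06d}.json.gz"
    )

key = build_s3_key("acme", "prod-1", "payments",
                   datetime(2024, 5, 1, 13, 7, tzinfo=timezone.utc), 42)
print(key)
# → acme/cluster=prod-1/namespace=payments/year=2024/month=05/day=01/hour=13/000042.json.gz
```

A layout like this lets the processor derive the tenant and time window from the object key alone, without opening the object.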

📚 Documentation

  • Quick Start
  • Component Guides

🎯 Quick Start

Prerequisites for Local Development

  • Podman for container builds and LocalStack
  • Go 1.21+ for log processor development
  • Terraform for infrastructure as code
  • Make for development workflow automation
  • kubectl (optional, for cluster deployments)

1. Local Development with LocalStack

```bash
# Start LocalStack
make start

# Build the log processor container
make build

# Deploy infrastructure to LocalStack
make deploy

# Run integration tests
make test-e2e

# View all available commands
make help
```

2. Deploy Vector to Kubernetes

```bash
# Create logging namespace
kubectl create namespace logging

# Deploy Vector collector (OpenShift with specific overlay)
kubectl apply -k k8s/collector/overlays/cuppett

# Verify deployment
kubectl get pods -n logging
```
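The collector only picks up namespaces that opt in (see the namespace-filtering note under Security Model). The real configuration lives under `k8s/collector`; a minimal sketch of the idea, with a hypothetical selector label and placeholder bucket name, might look like:

```yaml
# Illustrative sketch only; see k8s/collector for the actual configuration.
sources:
  k8s_logs:
    type: kubernetes_logs
    # Only collect from namespaces carrying an opt-in label
    # (label name here is hypothetical).
    extra_namespace_label_selector: "logging.example.com/enabled=true"

sinks:
  central_s3:
    type: aws_s3
    inputs: ["k8s_logs"]
    bucket: "central-log-bucket"   # placeholder name
    compression: gzip
    encoding:
      codec: json
```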

3. Configure Tenants (LocalStack)

```bash
# Tenant configurations are automatically created by Terraform
# View tenant configs in LocalStack
TABLE_NAME=$(cd terraform/local && terraform output -raw central_dynamodb_table)
aws --endpoint-url=http://localhost:4566 dynamodb scan --table-name $TABLE_NAME
```
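Each tenant record describes where its logs should go, and a single tenant can carry several delivery targets (the multi-delivery feature). The shape below is a sketch; attribute names, ARNs, and IDs are placeholders, not the project's actual schema:

```json
{
  "tenant_id": "acme",
  "deliveries": [
    {
      "type": "cloudwatch",
      "target_region": "us-east-1",
      "log_group": "/customer/acme/app-logs",
      "role_arn": "arn:aws:iam::111111111111:role/LogDeliveryRole",
      "external_id": "acme-external-id"
    },
    {
      "type": "s3",
      "bucket": "acme-log-archive",
      "role_arn": "arn:aws:iam::111111111111:role/LogDeliveryRole"
    }
  ]
}
```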

📖 Complete Deployment Guide | 🔧 Development Guide

🔧 Development

Local Testing with Make

```bash
# View all available commands
make help

# Full workflow: start LocalStack, build, deploy, test
make start
make build
make deploy
make test-e2e

# Run processor in scan mode
make run-scan

# Validate Vector log flow
make validate-vector-flow
```

Container Architecture

  • Collector Container: Vector binary for log collection
  • Processor Container: Go-based processor with multi-stage build
  • Multi-Mode Support: Lambda, scan mode, and manual testing
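A multi-stage build keeps the Go toolchain out of the runtime image. The sketch below shows the general pattern only; paths, base images, and the binary name are illustrative, not the project's actual Containerfile:

```dockerfile
# Stage 1: build the processor with the full Go toolchain
# (module path is a placeholder).
FROM golang:1.21 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /processor ./cmd/processor

# Stage 2: minimal runtime image; Lambda can run container images
# from ECR that implement the runtime interface.
FROM gcr.io/distroless/static
COPY --from=build /processor /processor
ENTRYPOINT ["/processor"]
```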

💻 Full Development Guide | 📦 Makefile Reference

πŸŽ›οΈ Current Capabilities

✅ Implemented Features

  • Vector log collection with namespace filtering and intelligent parsing
  • Direct S3 storage with GZIP compression and dynamic partitioning
  • Multi-delivery support - CloudWatch Logs + S3 per tenant
  • Application filtering with individual apps and pre-defined groups (API, Authentication, etc.)
  • Container-based Lambda processing with ECR images
  • Cross-account security via double-hop role assumption
  • Cost optimization with S3 lifecycle policies and compression
  • Development tools with fake log generator and local testing
  • API management for tenant configuration via REST API

🚧 Proof-of-Concept Limitations

  • Basic monitoring - AWS native services only (no custom metrics/dashboards)
  • Simple error handling - DLQ and retry logic without advanced workflow
  • Regional deployment - Manual multi-region setup required
  • Minimal UI - Configuration via API/CLI only

📊 Performance & Costs

Estimated Monthly Costs (1TB logs)

  • This Pipeline: ~$50/month (S3 + Lambda + supporting services)
  • Direct CloudWatch: ~$500/month (ingestion costs)
  • Kinesis Firehose: ~$100/month (additional processing costs)
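The roughly 10x gap falls out of per-GB pricing. The figures below are assumptions based on typical us-east-1 list prices, not quotes, and the arithmetic omits Lambda invocations and S3 request costs, which make up most of the remaining ~$50 estimate:

```python
# All prices are assumptions (typical us-east-1 list prices), per GB.
CLOUDWATCH_INGEST_PER_GB = 0.50   # CloudWatch Logs ingestion
S3_STORAGE_PER_GB_MONTH = 0.023   # S3 Standard storage

monthly_gb = 1024          # 1 TB of raw logs per month
compression_ratio = 30     # ~30:1 with GZIP (from the numbers below)

cloudwatch_cost = monthly_gb * CLOUDWATCH_INGEST_PER_GB
stored_gb = monthly_gb / compression_ratio
s3_storage_cost = stored_gb * S3_STORAGE_PER_GB_MONTH

print(f"CloudWatch ingestion:    ${cloudwatch_cost:.0f}/month")
print(f"S3 storage (compressed): ${s3_storage_cost:.2f}/month")
```

Ingestion pricing dominates: storing the compressed data in S3 costs well under a dollar, which is why direct S3 writes plus Lambda processing land an order of magnitude below direct CloudWatch ingestion.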

Performance Characteristics

  • Throughput: ~20,000 events/second per cluster node
  • Latency: ~2-5 minutes from log generation to delivery
  • Compression: ~30:1 ratio with GZIP
  • Scalability: Horizontal scaling via multiple processor instances

🔒 Security Model

  • Namespace Isolation: Vector only collects from labeled namespaces
  • Cross-Account Access: Customer roles with ExternalId validation
  • Encryption: SSE-S3/KMS encryption for all data at rest
  • Least Privilege: Minimal IAM permissions with resource restrictions
  • Audit Trail: All role assumptions logged in CloudTrail
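ExternalId validation is enforced on the customer side: the delivery role in the customer account carries a trust policy that only admits the processor's role when it presents the agreed ExternalId. The shape below is the standard IAM pattern; account IDs, role names, and the ExternalId value are placeholders:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::222222222222:role/log-processor" },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": { "sts:ExternalId": "acme-external-id" }
      }
    }
  ]
}
```

This is the standard confused-deputy mitigation: even if a third party learns the role ARN, the assumption fails without the matching ExternalId, and every attempt is recorded in CloudTrail.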

🤝 Contributing

  1. Check Development Guide for local setup
  2. Review Architecture Design for system understanding
  3. Test changes in development environment first
  4. Submit pull requests with detailed descriptions

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


πŸ—οΈ POC Status: This project demonstrates core functionality with minimal complexity. Advanced monitoring, alerting, and management features should be added incrementally after pipeline validation.
