Krkn-AI 🧬⚑

Caution

This tool is currently under active development; use it at your own risk.

An intelligent chaos engineering framework that uses genetic algorithms to optimize chaos scenarios for Kubernetes/OpenShift applications. Krkn-AI automatically evolves and discovers the most effective chaos experiments to test your system's resilience.

🌟 Features

  • Genetic Algorithm Optimization: Automatically evolves chaos scenarios to find optimal testing strategies
  • Multi-Scenario Support: Pod failures, container scenarios, node resource exhaustion, and application outages
  • Kubernetes/OpenShift Integration: Native support for both platforms
  • Health Monitoring: Continuous monitoring of application health during chaos experiments
  • Prometheus Integration: Metrics-driven fitness evaluation
  • Configurable Fitness Functions: Point-based and range-based fitness evaluation
  • Population Evolution: Maintains and evolves populations of chaos scenarios across generations

πŸš€ Getting Started

Prerequisites

  • krknctl
  • Python 3.9+
  • uv package manager (recommended) or pip
  • podman
  • helm
  • Kubernetes cluster access (kubeconfig file)

Setup Virtual Environment

# Install uv if you haven't already
pip install uv

# Create and activate virtual environment
uv venv --python 3.9
source .venv/bin/activate

# Install Krkn-AI in development mode
uv pip install -e .

# Check Installation
uv run krkn_ai --help

Deploy Sample Microservice

For demonstration purposes, deploy the robot-shop microservice:

export DEMO_NAMESPACE=robot-shop
export IS_OPENSHIFT=true
# Set IS_OPENSHIFT=false for a Kubernetes cluster
./scripts/setup-demo-microservice.sh

# Set context to the demo namespace
oc config set-context --current --namespace=$DEMO_NAMESPACE
# or for kubectl:
# kubectl config set-context --current --namespace=$DEMO_NAMESPACE

Setup Monitoring and Testing

# Setup NGINX reverse proxy for external access
./scripts/setup-nginx.sh

# Test application endpoints
./scripts/test-nginx-routes.sh

export HOST="http://$(kubectl get service rs -o json | jq -r '.status.loadBalancer.ingress[0].hostname')"

πŸ“ Generate Configuration

Krkn-AI uses YAML configuration files to define experiments. You can generate a sample config file dynamically by running the Krkn-AI discover command.

uv run krkn_ai discover -k ./tmp/kubeconfig.yaml \
  -n "robot-shop" \
  -pl "service" \
  -nl "kubernetes.io/hostname" \
  -o ./tmp/krkn-ai.yaml \
  --skip-pod-name "nginx-proxy.*"

Pattern Syntax for Filtering

The -n (namespace), -pl (pod-label), -nl (node-label), and --skip-pod-name options support flexible pattern matching:

Pattern                               Description
robot-shop                            Match exactly "robot-shop"
robot-shop,default                    Match "robot-shop" OR "default"
openshift-.*                          Regex: match namespaces starting with "openshift-"
*                                     Match all
!kube-system                          Match all EXCEPT "kube-system"
*,!kube-.*                            Match all except kube-* namespaces
openshift-.*,!openshift-operators     Match openshift-* but exclude operators

Examples:

# Discover in all namespaces except kube-system and openshift-*
uv run krkn_ai discover -k ./tmp/kubeconfig.yaml \
  -n "!kube-system,!openshift-.*" \
  -o ./tmp/krkn-ai.yaml

# Discover in openshift namespaces but exclude operators
uv run krkn_ai discover -k ./tmp/kubeconfig.yaml \
  -n "openshift-.*,!openshift-operators" \
  -o ./tmp/krkn-ai.yaml
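The include/exclude semantics above can be thought of as ordinary regular expressions combined with a "!" negation prefix. The snippet below is a minimal sketch of that matching logic for illustration only; it is not the discover command's actual implementation, and the matches_pattern helper is a hypothetical name.

import re

def matches_pattern(value: str, pattern: str) -> bool:
    """Illustrative matcher for comma-separated include/exclude patterns.

    Each comma-separated term is a regex ('*' means match all); terms
    prefixed with '!' exclude matches. A value is kept if it matches at
    least one include term (or only excludes are given) and no exclude term.
    """
    includes, excludes = [], []
    for term in pattern.split(","):
        term = term.strip()
        if term.startswith("!"):
            excludes.append(term[1:])
        else:
            includes.append(term)

    def full_match(regex: str) -> bool:
        return regex == "*" or re.fullmatch(regex, value) is not None

    included = not includes or any(full_match(r) for r in includes)
    excluded = any(full_match(r) for r in excludes)
    return included and not excluded

# Example: keep everything except kube-* namespaces
print(matches_pattern("robot-shop", "*,!kube-.*"))   # True
print(matches_pattern("kube-system", "*,!kube-.*"))  # False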
A generated krkn-ai.yaml looks like the following:

# Path to your kubeconfig file
kubeconfig_file_path: "./tmp/kubeconfig.yaml"

# Optional: Random seed for reproducible runs
# seed: 42

# Genetic algorithm parameters
generations: 5
population_size: 10
composition_rate: 0.3
population_injection_rate: 0.1

# Uncomment the line below to enable runs by duration instead of generation count
# duration: 600

# Duration to wait before running next scenario (seconds)
wait_duration: 30

# Elasticsearch configuration for storing run results (Optional)
elastic:
  enable: false  # Set to true to enable Elasticsearch integration
  verify_certs: true  # Verify SSL certificates
  server: "http://localhost"  # Elasticsearch URL
  port: 9200  # Elasticsearch port
  username: "$ES_USER"  # Elasticsearch username
  password: "$__ES_PASSWORD"  # Elasticsearch password (start param with __ to treat as private)
  index: "krkn-ai"  # Index prefix for storing Krkn-AI config and results

# Specify how result filenames are formatted
output:
  result_name_fmt: "scenario_%s.yaml"
  graph_name_fmt: "scenario_%s.png"
  log_name_fmt: "scenario_%s.log"

# Fitness function configuration
fitness_function:
  query: 'sum(kube_pod_container_status_restarts_total{namespace="robot-shop"})'
  type: point  # or 'range'
  include_krkn_failure: true

# Health endpoints to monitor
health_checks:
  stop_watcher_on_failure: false
  applications:
  - name: cart
    url: "$HOST/cart/add/1/Watson/1"
  - name: catalogue
    url: "$HOST/catalogue/categories"

# Chaos scenarios to evolve
scenario:
  pod-scenarios:
    enable: true
  application-outages:
    enable: false
  container-scenarios:
    enable: false
  node-cpu-hog:
    enable: false
  node-memory-hog:
    enable: false
  kubevirt-outage:
    enable: false

# Cluster components to consider for Krkn-AI testing
cluster_components:
  namespaces:
  - name: robot-shop
    pods:
    - containers:
      - name: cart
      labels:
        service: cart
      name: cart-7cd6c77dbf-j4gsv
    - containers:
      - name: catalogue
      labels:
        service: catalogue
      name: catalogue-94df6b9b-pjgsr
  nodes:
  - labels:
      kubernetes.io/hostname: node-1
    name: node-1
  - labels:
      kubernetes.io/hostname: node-2
    name: node-2

You can modify krkn-ai.yaml to include or exclude cluster components, scenarios, fitness function SLOs, or health check endpoints as needed for your Krkn-AI testing.
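For intuition, a point-type fitness function like the one above boils down to evaluating a single PromQL instant query and reading back a scalar. The sketch below shows roughly how such a value could be fetched over the standard Prometheus HTTP API, using the PROMETHEUS_URL and PROMETHEUS_TOKEN environment variables described in the Usage section below; the query_point_fitness helper is an illustrative assumption, not Krkn-AI's internal scoring code.

import os
import requests

def query_point_fitness(promql: str) -> float:
    """Illustrative helper: run a PromQL instant query and return the
    first scalar value via the standard Prometheus HTTP API."""
    url = os.environ["PROMETHEUS_URL"].rstrip("/") + "/api/v1/query"
    headers = {}
    token = os.environ.get("PROMETHEUS_TOKEN")
    if token:
        headers["Authorization"] = f"Bearer {token}"

    resp = requests.get(url, params={"query": promql}, headers=headers, timeout=30)
    resp.raise_for_status()
    result = resp.json()["data"]["result"]
    # An instant query returns a list of samples; take the first value.
    return float(result[0]["value"][1]) if result else 0.0

fitness = query_point_fitness(
    'sum(kube_pod_container_status_restarts_total{namespace="robot-shop"})'
)
print(f"observed restarts (fitness signal): {fitness}")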

🎯 Usage

Basic Usage

# Configure custom Prometheus Querier endpoint and token
export PROMETHEUS_URL='https://your-prometheus-url'
export PROMETHEUS_TOKEN='your-prometheus-token'

# Configure Elasticsearch properties (optional)
export ES_USER="elasticsearch-username"
export __ES_PASSWORD="elasticsearch-password"

# Run Krkn-AI
uv run krkn_ai run \
  -c ./tmp/krkn-ai.yaml \
  -o ./tmp/results/ \
  -p HOST=$HOST \
  -p ES_USER=$ES_USER -p __ES_PASSWORD=$__ES_PASSWORD
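The -p flags supply values for the $HOST, $ES_USER, and $__ES_PASSWORD placeholders used in the config file. A rough mental model is plain string substitution over the config text, as in the sketch below; this is only an illustration of the idea (substitute_params is a hypothetical helper), and per the config comment above, parameter names starting with __ are treated as private.

def substitute_params(config_text: str, params: dict[str, str]) -> str:
    """Illustrative substitution of -p key=value pairs into $KEY placeholders."""
    for key, value in params.items():
        config_text = config_text.replace(f"${key}", value)
    return config_text

params = {"HOST": "http://my-lb.example.com", "ES_USER": "elastic"}
rendered = substitute_params('url: "$HOST/cart/add/1/Watson/1"', params)
print(rendered)  # url: "http://my-lb.example.com/cart/add/1/Watson/1"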

CLI Options

$ uv run krkn_ai discover --help
Usage: krkn_ai discover [OPTIONS]

  Discover components for Krkn-AI tests

Options:
  -k, --kubeconfig TEXT   Path to cluster kubeconfig file.
  -o, --output TEXT       Path to save config file.
  -n, --namespace TEXT    Namespace(s) to discover components in. Supports
                          Regex and comma separated values.
  -pl, --pod-label TEXT   Pod Label Key(s) to filter. Supports Regex and
                          comma separated values.
  -nl, --node-label TEXT  Node Label Key(s) to filter. Supports Regex and
                          comma separated values.
  -v, --verbose           Increase verbosity of output.
  --skip-pod-name TEXT    Pod name to skip. Supports comma separated values
                          with regex.
  --help                  Show this message and exit.



$ uv run krkn_ai run --help
Usage: krkn_ai run [OPTIONS]

  Run Krkn-AI tests

Options:
  -c, --config TEXT               Path to Krkn-AI config file.
  -o, --output TEXT               Directory to save results.
  -f, --format [json|yaml]        Format of the output file.
  -r, --runner-type [krknctl|krknhub]
                                  Type of krkn engine to use.
  -p, --param TEXT                Additional parameters for config file in
                                  key=value format.
  -v, --verbose                   Increase verbosity of output.
  --help                          Show this message and exit.

Note: You can also run Krkn-AI as a container with Podman or on Kubernetes. See container instructions.

Understanding Results

Krkn-AI saves results in the specified output directory:

.
└── results/
    β”œβ”€β”€ reports/
    β”‚   β”œβ”€β”€ health_check_report.csv
    β”‚   └── graphs/
    β”‚       β”œβ”€β”€ best_generation.png
    β”‚       β”œβ”€β”€ scenario_1.png
    β”‚       β”œβ”€β”€ scenario_2.png
    β”‚       └── ...
    β”œβ”€β”€ yaml/
    β”‚   β”œβ”€β”€ generation_0/
    β”‚   β”‚   β”œβ”€β”€ scenario_1.yaml
    β”‚   β”‚   β”œβ”€β”€ scenario_2.yaml
    β”‚   β”‚   └── ...
    β”‚   └── generation_1/
    β”‚       └── ...
    β”œβ”€β”€ log/
    β”‚   β”œβ”€β”€ scenario_1.log
    β”‚   β”œβ”€β”€ scenario_2.log
    β”‚   └── ...
    β”œβ”€β”€ best_scenarios.json
    └── config.yaml
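best_scenarios.json summarizes the top-performing scenarios found across generations. Its exact schema is not documented here, so the snippet below simply loads and pretty-prints the file as a starting point for your own post-processing; it assumes nothing beyond the file being valid JSON.

import json
from pathlib import Path

results_dir = Path("./tmp/results")
best = json.loads((results_dir / "best_scenarios.json").read_text())

# Pretty-print whatever structure the file contains for inspection.
print(json.dumps(best, indent=2))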

🧬 How It Works

The current version of Krkn-AI leverages an evolutionary algorithm, an optimization technique that uses heuristics to identify the chaos scenarios and components that most impact the stability of your cluster and applications. The loop works as follows (a simplified code sketch appears after the steps):

  1. Initial Population: Creates random chaos scenarios based on your configuration
  2. Fitness Evaluation: Runs each scenario and measures system response using Prometheus metrics
  3. Selection: Identifies the most effective scenarios based on fitness scores
  4. Evolution: Creates new scenarios through crossover and mutation
  5. Health Monitoring: Continuously monitors application health during experiments
  6. Iteration: Repeats the process across multiple generations to find optimal scenarios
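For intuition only, the generational loop can be pictured as the sketch below. It is a deliberately generic genetic-algorithm skeleton, not Krkn-AI's actual code: run_and_score, crossover, and mutate are hypothetical stand-ins for the scenario operators and the Prometheus-driven fitness evaluation described above.

import random

def evolve(initial_population, generations, run_and_score, crossover, mutate,
           elite_fraction=0.2, mutation_rate=0.1):
    """Generic genetic-algorithm skeleton: score, select, recombine, mutate."""
    population = list(initial_population)
    for _ in range(generations):
        # Steps 1-2: evaluate fitness of every scenario in this generation.
        scored = sorted(((run_and_score(s), s) for s in population),
                        key=lambda pair: pair[0], reverse=True)
        # Step 3: selection - keep the best scenarios as parents.
        elite_count = max(1, int(len(scored) * elite_fraction))
        parents = [s for _, s in scored[:elite_count]]
        # Step 4: evolution - refill the population via crossover and mutation.
        children = list(parents)
        while len(children) < len(population):
            child = crossover(*random.sample(parents, 2)) if len(parents) > 1 else parents[0]
            if random.random() < mutation_rate:
                child = mutate(child)
            children.append(child)
        # Step 6: iterate with the new generation.
        population = children
    return population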

🀝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Run the static checks and commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Static Checks

Developers should run the project's static checks locally before committing. Below are recommended commands and notes for common environments (PowerShell / Bash).

  • Install the tooling used for the checks:

# Activate the virtual environment
source .venv/bin/activate

# Install dev requirements
uv pip install -r requirements-dev.txt

  • Install the Git hooks (runs once per developer):

pre-commit install
pre-commit autoupdate

  • Run all pre-commit hooks against the repository (fast, recommended):

pre-commit run --all-files

  • Run individual tools directly:

# Ruff (linter/formatter)
ruff check .
ruff format .

# Mypy (type checking)
mypy --config-file mypy.ini krkn_ai

# Hadolint (Dockerfile/Containerfile linting) - Docker must be available
hadolint containers/Containerfile

Notes:

  • The pre-commit configuration runs ruff, various file checks, and hadolint for container files. If hadolint fails with a Docker error, ensure Docker Desktop/daemon is running on your machine (the hook needs to query Docker to validate containerfile context).
  • Use pre-commit run --all-files to validate changes before pushing. CI will also run these checks.
