
πŸ“Š Kalm Benchmark

KALM provides a comprehensive, standardized benchmark for evaluating and comparing Kubernetes security scanners. The benchmark consists of two components:

  • 235+ intentionally vulnerable Kubernetes manifests covering 12 major security categories that scanners should detect
  • Interactive web UI for analyzing scanner performance, accuracy, and coverage with CCSS alignment scoring
⚠️ This product is not officially supported by Dynatrace.

Description

Benchmark Manifests

KALM provides a comprehensive test suite of 235+ Kubernetes manifests specifically designed to evaluate security scanner effectiveness. Each manifest represents a specific security misconfiguration or vulnerability pattern that scanners should detect.

Key characteristics of the benchmark manifests:

  • Intentionally vulnerable: Each manifest contains a specific security issue (privileged containers, exposed secrets, RBAC misconfigurations, etc.)
  • Single-issue focus: One manifest tests one security check to enable precise scanner comparison
  • Comprehensive coverage: Tests span 12 major security categories:
    • Pod Security: Privilege escalation, host access, security contexts
    • RBAC: Excessive permissions, cluster-admin usage, service account issues
    • Network Policies: Traffic isolation, metadata API access
    • Resource Management: CPU/memory limits, resource quotas
    • Container Security: Image policies, capabilities, read-only filesystems
    • Secrets & ConfigMaps: Sensitive data exposure
    • Namespaces: Default namespace usage, system namespace access
    • Pod Security Standards: PSA configuration issues
    • Supply Chain: Image tags, registry security
    • Workload Types: Naked pods, reliability configurations
    • Network Security: Ingress configurations, TLS settings
    • Infrastructure: Storage, reliability, node selection

Structured for evaluation: Each manifest includes metadata annotations specifying:

  • Expected scanner result (alert or pass)
  • Check description and security impact
  • Specific configuration paths that should be flagged
  • Unique check IDs for result correlation

This design enables precise measurement of scanner accuracy, false positive rates, and coverage across different security domains.
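As a rough illustration of how such metadata enables scoring, the sketch below computes accuracy, false positive rate, and coverage from expected vs. observed results. All names and check IDs here are made up for the example and do not reflect KALM's actual data model:

```python
# Hypothetical sketch: scoring a scanner against benchmark expectations.
# The check IDs and data structures are illustrative only.

# Expected outcome per benchmark check ID: True = the scanner should alert.
expected = {"POD-001": True, "POD-002": True, "RBAC-001": True, "NP-001": False}

# What a hypothetical scanner actually reported (True = it alerted).
observed = {"POD-001": True, "POD-002": False, "RBAC-001": True, "NP-001": True}

tp = sum(1 for c, e in expected.items() if e and observed.get(c))
fp = sum(1 for c, e in expected.items() if not e and observed.get(c))
fn = sum(1 for c, e in expected.items() if e and not observed.get(c))
tn = sum(1 for c, e in expected.items() if not e and not observed.get(c))

accuracy = (tp + tn) / len(expected)                         # 2/4 = 0.5
false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0   # 1/1 = 1.0
coverage = tp / (tp + fn) if (tp + fn) else 0.0              # detected 2 of 3 real issues
```

Because every manifest carries its expected result as an annotation, these metrics can be computed per check, per category, or per scanner.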

πŸ“‹ Complete catalog: Benchmark Checks (235+ individual security tests)

Web UI

The web application provides multiple analysis views:

  • an overview of the various scanners checked with this benchmark
  • an analysis page to inspect the results of a specific scanner in more detail
  • a CCSS alignment page to compare scanner performance against standardized scoring

Key Features:

  • Modular architecture: Dedicated analysis modules for helm scanning, benchmark comparison, and CCSS alignment
  • Helm analytics: Per-chart deployment analysis, security profiles, and interactive filtering
  • Interactive visualizations: Charts, pivot tables, and grouped security comparisons
  • Unified database: SQLite backend with automatic result persistence
  • Real-time monitoring: Live scan progress updates and centralized logging
  • Settings management: Configurable data directories and display options

CCSS Integration

The benchmark now includes CCSS (Common Configuration Scoring System) integration for scanner analysis:

  • Scanner Alignment Analysis: Compare how different scanners align with standardized CCSS scores
  • Multi-Source Support: Evaluate scanners against Kubernetes manifests, live API servers, and Helm charts
  • Research Capabilities: Designed to support large-scale evaluation (e.g., top 100+ Helm charts from Artifactory)
  • Flexible Configuration: Supports any number of charts, mixed source types, and custom evaluation criteria
  • Data Models: Extended data structures for comprehensive misconfiguration analysis

Key features:

  • Interactive alignment visualizations and scanner rankings
  • Category-specific performance analysis
  • Statistical correlation between native scanner scores and CCSS scores
  • Database persistence for evaluation runs and findings
  • Backward compatibility with existing KALM functionality

Use Cases

For Security and DevOps Teams:

  • Scanner Evaluation: Compare 12+ security scanners across 235+ real-world vulnerability patterns
  • Tool Selection: Identify scanners with best coverage for your specific security requirements
  • Custom Rule Development: Use benchmark results to develop and validate custom security policies
  • Compliance Assessment: Evaluate scanner alignment with industry standards (CCSS scoring)
  • Performance Benchmarking: Measure scanner accuracy, false positive rates, and detection coverage

For Scanner Developers & Vendors:

  • Competitive Analysis: Compare your tool against market alternatives using standardized tests
  • Quality Assurance: Identify detection gaps, false positives, and rule conflicts across security categories
  • Product Development: Use benchmark feedback to improve check accuracy and coverage
  • Standards Alignment: Optimize scanner output to align with CCSS and industry scoring standards
  • Regression Testing: Validate that updates don't break existing detection capabilities

Prerequisites

  • Python >= 3.10
  • The manifests are generated using cdk8s, which depends on Node.js
    • Please ensure Node.js is installed on your system
  • Any scanner for which a scan should be triggered must be installed manually
  • Poetry is used to manage the project itself

Getting Started

1) πŸ”¨ Installation

To install the benchmark along with its dependencies listed in pyproject.toml, execute:

poetry install

2) πŸ„β€β™€οΈ Usage

To use this package run the CLI with the appropriate command:

poetry run cli <command>

For detailed information about the available commands, or for general help, run:

poetry run cli --help

2.1) Generating manifests

To generate manifests use the generate command:

poetry run cli generate [--out <target directory>]

These manifests form the basis for the benchmark and will be placed in the directory specified with the --out argument. The location defaults to the manifests folder in the working directory.

2.2) Start the Web UI

Besides the CLI commands, the tool also provides a web user interface to manage scans and analyse evaluation results. It can be started with the command:

poetry run cli serve

The web UI includes:

  • Settings Panel: Configure data directory and display options
  • Automatic Result Saving: Scan results are automatically saved to unified database
  • Centralized Logging: View scan logs and UI activity in organized log files
  • Real-time Updates: Monitor scan progress with live status updates
  • Database Backend: SQLite-based data storage

2.3) Perform a scan with a Scanner

To scan either a cluster or manifest files with the specified tool use the scan command. Use either the -c flag to specify the target cluster or the -f flag to specify the target file/folder.

poetry run cli scan <tool> [-c | -f <target file or folder>]

❗️ Important: executing a scan requires the respective tool to be installed on the system!

πŸ“‹ Supported Scanners: Kubescape, KubeLinter, KICS, Trivy, Checkov, Polaris, Terrascan, Kube-score, Snyk, Kubesec, Kube-bench, KubiScan

πŸ”§ Quick Setup: For detailed installation instructions for all scanners, see the Scanner Installation Guide

For example, to scan the manifests in the manifests folder with the tool dummy, execute:

poetry run cli scan dummy -f manifests

To save the results, add the -o flag with the name of the destination folder:

poetry run cli scan <tool> [-c | -f <target file or folder>] -o <output-folder>

2.4) Evaluate a Scanner

To evaluate a scanner, first run a scan to generate results in the database, then use the evaluate command:

poetry run cli evaluate <tool>

You can also evaluate a specific scan run:

poetry run cli evaluate <tool> --run-id <scan_run_id>

2.5) Database Management

The benchmark uses a unified SQLite database for CCSS integration:

View database statistics:

poetry run cli db-stats
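The database can also be inspected directly with Python's built-in sqlite3 module. The table and column names below are purely illustrative assumptions (the actual schema is defined by KALM); the sketch builds a throwaway in-memory table just to show the query pattern:

```python
import sqlite3

# Illustrative only: KALM defines its own schema; these table/column
# names are assumptions made for the sake of the example.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE scan_results (scanner TEXT, check_id TEXT, got_alert INTEGER)")
con.executemany(
    "INSERT INTO scan_results VALUES (?, ?, ?)",
    [("dummy", "POD-001", 1), ("dummy", "POD-002", 0), ("dummy", "RBAC-001", 1)],
)

# Count checks and alerts per scanner, similar in spirit to what db-stats reports.
rows = con.execute(
    "SELECT scanner, COUNT(*), SUM(got_alert) FROM scan_results GROUP BY scanner"
).fetchall()
print(rows)  # [('dummy', 3, 2)]
```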

πŸš€ Deployment

Some scanners only scan resources deployed in a Kubernetes cluster. You can find instructions on how to deploy the benchmark in a cluster here.

Scanner Requirements Summary

| Scanner Type | Requirements | Examples |
| --- | --- | --- |
| Manifest-based | Scanner binary + YAML files | Kubescape, KICS, Trivy, Polaris |
| Cluster-based | Scanner binary + running K8s cluster | Kube-bench, KubiScan |
| API key required | Scanner binary + external service token | Snyk, Checkov (for full features) |

πŸ“– Detailed setup instructions: Scanner Installation Guide

Scanner Features & Severity Support

KALM supports 12 security scanners and provides comprehensive severity information extraction from all of them. Each scanner has different severity formats and coverage levels.

πŸ“Š Complete severity support matrix: Scanner Installation Guide - Scanner Coverage

Tool-specific considerations

Some scanners have special requirements or focus areas:

  • kube-bench: Focuses on infrastructure security (CIS Kubernetes Benchmark) rather than workload security
  • KubiScan: Requires special setup as it's distributed as a Python script
  • Snyk/Checkov: Require API keys for full functionality

πŸ“– Complete scanner details and setup instructions: Scanner Installation Guide

Troubleshooting

Scanner Issues

For comprehensive troubleshooting of scanner installation, configuration, and execution issues, see the Scanner Installation Guide - Troubleshooting Section.

Docker-based Issues

  • ensure the -t flag is not used in the docker command. If it is, stdout and stderr are merged into stdout. This means errors can't be handled properly and the results in stdout are corrupted.

πŸ’ͺ Contributing

Want to contribute? Awesome! We welcome contributions of all kinds: new scanners, fixes to the existing implementations, bug reports, and feedback of any kind.


License

Apache License, Version 2.0
