
πŸ“Š Kalm Benchmark

KALM provides a comprehensive, standardized benchmark for evaluating and comparing Kubernetes security scanners. The benchmark consists of two components:

  • 235+ intentionally vulnerable Kubernetes manifests covering 12 major security categories that scanners should detect
  • Interactive web UI for analyzing scanner performance, accuracy, and coverage with CCSS alignment scoring
⚠️ This product is not officially supported by Dynatrace.

Description

Benchmark Manifests

KALM provides a comprehensive test suite of 235+ Kubernetes manifests specifically designed to evaluate security scanner effectiveness. Each manifest represents a specific security misconfiguration or vulnerability pattern that scanners should detect.

Key characteristics of the benchmark manifests:

  • Intentionally vulnerable: Each manifest contains a specific security issue (privileged containers, exposed secrets, RBAC misconfigurations, etc.)
  • Single-issue focus: One manifest tests one security check to enable precise scanner comparison
  • Comprehensive coverage: Tests span 12 major security categories:
    • Pod Security: Privilege escalation, host access, security contexts
    • RBAC: Excessive permissions, cluster-admin usage, service account issues
    • Network Policies: Traffic isolation, metadata API access
    • Resource Management: CPU/memory limits, resource quotas
    • Container Security: Image policies, capabilities, read-only filesystems
    • Secrets & ConfigMaps: Sensitive data exposure
    • Namespaces: Default namespace usage, system namespace access
    • Pod Security Standards: PSA configuration issues
    • Supply Chain: Image tags, registry security
    • Workload Types: Naked pods, reliability configurations
    • Network Security: Ingress configurations, TLS settings
    • Infrastructure: Storage, reliability, node selection

Structured for evaluation: Each manifest includes metadata annotations specifying:

  • Expected scanner result (alert or pass)
  • Check description and security impact
  • Specific configuration paths that should be flagged
  • Unique check IDs for result correlation

This design enables precise measurement of scanner accuracy, false positive rates, and coverage across different security domains.
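As a rough illustration of how such metadata enables scoring, the sketch below computes accuracy, false positive rate, and coverage from expected vs. observed results. All names and check IDs here are made up for the example and do not reflect KALM's actual data model:

```python
# Hypothetical sketch: scoring a scanner against benchmark expectations.
# The check IDs and data structures are illustrative only.

# Expected outcome per benchmark check ID: True = the scanner should alert.
expected = {"POD-001": True, "POD-002": True, "RBAC-001": True, "NP-001": False}

# What a hypothetical scanner actually reported (True = it alerted).
observed = {"POD-001": True, "POD-002": False, "RBAC-001": True, "NP-001": True}

tp = sum(1 for c, e in expected.items() if e and observed.get(c))
fp = sum(1 for c, e in expected.items() if not e and observed.get(c))
fn = sum(1 for c, e in expected.items() if e and not observed.get(c))
tn = sum(1 for c, e in expected.items() if not e and not observed.get(c))

accuracy = (tp + tn) / len(expected)                         # 2/4 = 0.5
false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0   # 1/1 = 1.0
coverage = tp / (tp + fn) if (tp + fn) else 0.0              # detected 2 of 3 real issues
```

Because every manifest carries its expected result as an annotation, these metrics can be computed per check, per category, or per scanner.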

πŸ“‹ Complete catalog: Benchmark Checks (235+ individual security tests)

Web UI

The web application provides multiple analysis views:

  • an overview of the various scanners checked with this benchmark
  • an analysis page to inspect the results of a specific scanner in more detail
  • a CCSS alignment page to compare scanner performance against standardized scoring

Key Features:

  • Modular architecture: Dedicated analysis modules for helm scanning, benchmark comparison, and CCSS alignment
  • Helm analytics: Per-chart deployment analysis, security profiles, and interactive filtering
  • Interactive visualizations: Charts, pivot tables, and grouped security comparisons
  • Unified database: SQLite backend with automatic result persistence
  • Real-time monitoring: Live scan progress updates and centralized logging
  • Settings management: Configurable data directories and display options

CCSS Integration

The benchmark now includes CCSS (Common Configuration Scoring System) integration for scanner analysis:

  • Scanner Alignment Analysis: Compare how different scanners align with standardized CCSS scores
  • Multi-Source Support: Evaluate scanners against Kubernetes manifests, live API servers, and Helm charts
  • Research Capabilities: Designed to support large-scale evaluation (e.g., top 100+ Helm charts from Artifactory)
  • Flexible Configuration: Supports any number of charts, mixed source types, and custom evaluation criteria
  • Data Models: Extended data structures for comprehensive misconfiguration analysis

Key features:

  • Interactive alignment visualizations and scanner rankings
  • Category-specific performance analysis
  • Statistical correlation between native scanner scores and CCSS scores
  • Database persistence for evaluation runs and findings
  • Backward compatibility with existing KALM functionality

Use Cases

For Security and DevOps Teams:

  • Scanner Evaluation: Compare 12+ security scanners across 235+ real-world vulnerability patterns
  • Tool Selection: Identify scanners with best coverage for your specific security requirements
  • Custom Rule Development: Use benchmark results to develop and validate custom security policies
  • Compliance Assessment: Evaluate scanner alignment with industry standards (CCSS scoring)
  • Performance Benchmarking: Measure scanner accuracy, false positive rates, and detection coverage

For Scanner Developers & Vendors:

  • Competitive Analysis: Compare your tool against market alternatives using standardized tests
  • Quality Assurance: Identify detection gaps, false positives, and rule conflicts across security categories
  • Product Development: Use benchmark feedback to improve check accuracy and coverage
  • Standards Alignment: Optimize scanner output to align with CCSS and industry scoring standards
  • Regression Testing: Validate that updates don't break existing detection capabilities

Prerequisites

  • Python >= 3.10
  • The manifests are generated using cdk8s, which depends on Node.js
    • Please ensure Node.js is installed on your system
  • Any scanner for which a scan should be triggered must be installed manually
  • Poetry is used to manage the project itself

Getting Started

1) πŸ”¨ Installation

To install the benchmark along with its dependencies listed in pyproject.toml, execute:

poetry install

2) πŸ„β€β™€οΈ Usage

To use this package run the CLI with the appropriate command:

poetry run cli <command>

For detailed information about the available commands, or for general help, run:

poetry run cli --help

2.1) Generating manifests

To generate manifests use the generate command:

poetry run cli generate [--out <target directory>]

These manifests form the basis for the benchmark and will be placed in the directory specified with the --out argument. The location defaults to the manifests folder in the working directory.

2.2) Start the Web UI

Besides the CLI commands, the tool also provides a web user interface to manage scans and analyse evaluation results. It can be started with the command:

poetry run cli serve

The web UI includes:

  • Settings Panel: Configure data directory and display options
  • Automatic Result Saving: Scan results are automatically saved to unified database
  • Centralized Logging: View scan logs and UI activity in organized log files
  • Real-time Updates: Monitor scan progress with live status updates
  • Database Backend: SQLite-based data storage

2.3) Perform a scan with a Scanner

To scan either a cluster or manifest files with the specified tool use the scan command. Use either the -c flag to specify the target cluster or the -f flag to specify the target file/folder.

poetry run cli scan <tool> [-c | -f <target file or folder>]

❗️ Important: executing a scan requires the respective tool to be installed on the system!

πŸ“‹ Supported Scanners: Kubescape, KubeLinter, KICS, Trivy, Checkov, Polaris, Terrascan, Kube-score, Snyk, Kubesec, Kube-bench, KubiScan

πŸ”§ Quick Setup: For detailed installation instructions for all scanners, see the Scanner Installation Guide

For example, to scan the manifests in the manifests folder with the tool dummy, execute:

poetry run cli scan dummy -f manifests

To save the results, add the -o flag with the name of the destination folder:

poetry run cli scan <tool> [-c | -f <target file or folder>] -o <output-folder>

2.4) Evaluate a Scanner

To evaluate a scanner, first run a scan to generate results in the database, then use the evaluate command:

poetry run cli evaluate <tool>

You can also evaluate a specific scan run:

poetry run cli evaluate <tool> --run-id <scan_run_id>

2.5) Database Management

The benchmark uses a unified SQLite database for CCSS integration:

View database statistics:

poetry run cli db-stats
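The database can also be inspected directly with Python's built-in sqlite3 module. The table and column names below are purely illustrative assumptions (the actual schema is defined by KALM); the sketch builds a throwaway in-memory table just to show the query pattern:

```python
import sqlite3

# Illustrative only: KALM defines its own schema; these table/column
# names are assumptions made for the sake of the example.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE scan_results (scanner TEXT, check_id TEXT, got_alert INTEGER)")
con.executemany(
    "INSERT INTO scan_results VALUES (?, ?, ?)",
    [("dummy", "POD-001", 1), ("dummy", "POD-002", 0), ("dummy", "RBAC-001", 1)],
)

# Count checks and alerts per scanner, similar in spirit to what db-stats reports.
rows = con.execute(
    "SELECT scanner, COUNT(*), SUM(got_alert) FROM scan_results GROUP BY scanner"
).fetchall()
print(rows)  # [('dummy', 3, 2)]
```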

πŸš€ Deployment

Some scanners only scan resources deployed in a Kubernetes cluster. You can find instructions on how to deploy the benchmark in a cluster here.

Scanner Requirements Summary

| Scanner Type | Requirements | Examples |
| --- | --- | --- |
| Manifest-based | Scanner binary + YAML files | Kubescape, KICS, Trivy, Polaris |
| Cluster-based | Scanner binary + running K8s cluster | Kube-bench, KubiScan |
| API key required | Scanner binary + external service token | Snyk, Checkov (for full features) |

πŸ“– Detailed setup instructions: Scanner Installation Guide

Scanner Features & Severity Support

KALM supports 12 security scanners and provides comprehensive severity information extraction from all of them. Each scanner has different severity formats and coverage levels.

πŸ“Š Complete severity support matrix: Scanner Installation Guide - Scanner Coverage

Tool-specific considerations

Some scanners have special requirements or focus areas:

  • kube-bench: Focuses on infrastructure security (CIS Kubernetes Benchmark) rather than workload security
  • KubiScan: Requires special setup as it's distributed as a Python script
  • Snyk/Checkov: Require API keys for full functionality

πŸ“– Complete scanner details and setup instructions: Scanner Installation Guide

Troubleshooting

Scanner Issues

For comprehensive troubleshooting of scanner installation, configuration, and execution issues, see the Scanner Installation Guide - Troubleshooting Section.

Docker-based Issues

  • ensure the -t flag is not used in the docker command. If it is, stdout and stderr are merged into stdout. This means errors can't be handled properly and the results in stdout are corrupted.

πŸ’ͺ Contributing

Want to contribute? Awesome! We welcome contributions of all kinds: new scanners, fixes to the existing implementations, bug reports, and feedback of any kind.


License

Apache License, Version 2.0
