What is krkn formatting clarity improvement #76

@tejugang

Currently:
"Leveraging Cerberus to monitor the cluster under test and consuming the aggregated go/no-go signal to determine pass/fail post chaos. It is highly recommended to turn on the Cerberus health check feature available in Kraken. Instructions on installing and setting up Cerberus can be found here or can be installed from Kraken using the instructions. Once Cerberus is up and running, set cerberus_enabled to True and cerberus_url to the url where Cerberus publishes go/no-go signal in the Kraken config file. Cerberus can monitor application routes during the chaos and fails the run if it encounters downtime as it is a potential downtime in a customers, or users environment as well. It is especially important during the control plane chaos scenarios including the API server, Etcd, Ingress etc. It can be enabled by setting check_applicaton_routes: True in the Kraken config provided application routes are being monitored in the cerberus config."

Instead:
Leveraging Cerberus to monitor the cluster under test and consuming the aggregated go/no-go signal to determine pass/fail post chaos.

  • It is highly recommended to turn on the Cerberus health check feature available in Kraken. Instructions for installing and setting up Cerberus can be found here, or Cerberus can be installed from Kraken using the instructions.
  • Once Cerberus is up and running, set cerberus_enabled to True and cerberus_url to the URL where Cerberus publishes the go/no-go signal in the Kraken config file.
  • Cerberus can monitor application routes during the chaos and fail the run if it encounters downtime, since that downtime could also occur in a customer’s or user’s environment.
    - This is especially important during control plane chaos scenarios, including the API server, etcd, Ingress, etc.
    - It can be enabled by setting check_applicaton_routes: True in the Kraken config, provided application routes are being monitored in the Cerberus config.
  • Leveraging the built-in alert collection feature to fail runs in case of critical alerts.
    - Fail the test if certain metrics aren’t met at the end of the run.
    - See also: SLOs validation for more details on metrics and alerts.
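
To illustrate what the bullets above describe, here is a rough Python sketch of consuming a go/no-go signal to decide pass/fail. It is not Kraken's actual implementation. The `config` dict, the `read_signal` and `chaos_run_passed` helpers, and the assumption that the endpoint returns the plain text `True` or `False` are all hypothetical and used only for illustration; only the `cerberus_enabled` and `cerberus_url` key names come from the text.

```python
from urllib.request import urlopen


def read_signal(url, timeout=5.0):
    """Fetch the raw signal text from the URL (assumed to be plain text)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def chaos_run_passed(config, fetch=read_signal):
    """Return True if the run should pass according to the go/no-go signal.

    `config` is a hypothetical dict holding the two keys named in the text.
    `fetch` is injectable so the logic can be exercised without a live endpoint.
    """
    if not config.get("cerberus_enabled", False):
        # Health check is turned off, so there is no signal to consult.
        return True
    # Assumption: the endpoint answers with the text "True" (go) or "False" (no-go).
    signal = fetch(config["cerberus_url"]).strip().lower()
    return signal == "true"
```

Passing a stub such as `fetch=lambda url: "False"` makes it possible to try the decision logic without a running Cerberus instance.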
