Chaos mesh is unable to start #98

Open
@QYuQianchen

Description

When running a network bandwidth experiment on an existing network, chaos-mesh fails to start. The returned error is "chaos-mesh is still in a 'starting' state after 10 seconds."
Note that the same setup (running against an existing Kurtosis testnet) works for the memory-stress test.

Logs

All pods in the relevant namespaces (chaos-mesh, kt-relaytestnet, and kurtosis-engine) are running as expected in Kubernetes. The chaos-mesh pods:

NAMESPACE                                          NAME                                                    READY   STATUS    RESTARTS         AGE
chaos-mesh                                         chaos-controller-manager-c47ffc85d-2fcvh                1/1     Running   1 (14m ago)      51m
chaos-mesh                                         chaos-controller-manager-c47ffc85d-s27pd                1/1     Running   0                51m
chaos-mesh                                         chaos-controller-manager-c47ffc85d-vdkfr                1/1     Running   1 (8m47s ago)    51m
chaos-mesh                                         chaos-daemon-lr9nh                                      2/2     Running   0                50m
chaos-mesh                                         chaos-dashboard-766759d4f4-s92sn                        1/1     Running   0                51m
chaos-mesh                                         chaos-dns-server-7c4f7b67c8-n6lfs                       1/1     Running   0                51m

Log printed to the console:

Launch attack scenario metaclear-network-bandwidth...
INFO[0000] Loading test suite from /Users/qyu/attacknet/test-suites/metaclear-network-bandwidth.yaml 
INFO[0000] Loading kurtosis network configuration from /Users/qyu/attacknet/network-configs/metaclear-relayer-devnet.yaml 
INFO[0001] Looking for existing enclave identified by namespace kt-relaytestnet 
INFO[0030] An active enclave matching kt-relaytestnet was found 
INFO[0030] Creating a chaos-mesh client                 
INFO[0030] Waiting 50 seconds before starting fault injection 
INFO[0080] Running 1 tests                              
INFO[0080] Running test (1/1): 'metaclear-network-bandwidth' 
INFO[0080] Running test step (1/2): 'inject fault to beacon client' 
ERRO[0090] Error while running test #1                  
FATA[0090] chaos-mesh is still in a 'starting' state after 10 seconds. Check kubernetes events to see what's wrong.
 --- at /Users/qyu/attacknet/pkg/test_executor/executor.go:152 (waitForInjectionCompleted) --- 

Config

YAML config file for attacknet

attacknetConfig:
  grafanaPodName: grafana
  grafanaPodPort: 3000
  waitBeforeInjectionSeconds: 50
  reuseDevnetBetweenRuns: true
  allowPostFaultInspection: true
  existingDevnetNamespace: kt-relaytestnet

harnessConfig:
  networkPackage: github.com/kurtosis-tech/ethereum-package
  networkConfig: metaclear-relayer-devnet.yaml
  networkType: ethereum

testConfig:
  tests:
  - testName: metaclear-network-bandwidth
    health:
      enableChecks: true
      gracePeriod: 2m0s
    planSteps:
    - stepType: injectFault
      description: "inject fault to beacon client"
      chaosFaultSpec:
        kind: NetworkChaos
        apiVersion: chaos-mesh.org/v1alpha1
        spec:
          selector:
            namespaces:
              - kt-relaytestnet
            labelSelectors:
              kurtosistech.com/id: cl-3-lighthouse-geth
          target:
            mode: all
            namespaces:
              - kt-relaytestnet
            labelSelectors:
              kurtosistech.com/id: cl-2-lighthouse-geth
          mode: all
          action: bandwidth
          duration: 1m
          direction: to
          bandwidth:
            rate: '10kbps'
            limit:  20000
            buffer: 500
    - stepType: waitForFaultCompletion
      description: wait for faults to terminate
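
For comparison, here is a sketch of the `target` block as I understand the Chaos Mesh NetworkChaos documentation to lay it out: the pod filters sit under a nested `selector` key instead of directly under `target`. This assumes attacknet passes `chaosFaultSpec` through to Chaos Mesh unchanged, which I have not verified, and the difference may or may not be related to the 'starting' timeout above.

```yaml
# Sketch only: same values as the config above, with the target filters
# moved under `selector` (the layout I believe the Chaos Mesh docs use).
# Not confirmed as the cause of the timeout.
target:
  mode: all
  selector:
    namespaces:
      - kt-relaytestnet
    labelSelectors:
      kurtosistech.com/id: cl-2-lighthouse-geth
```

If this layout applies, the top-level `selector`, `mode`, `action`, `direction`, and `bandwidth` fields would stay as they are in the config above.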
