Refactor CI-Lib Implementation to Support Litmus 3.0 #198

Open
wants to merge 41 commits into
base: master
a1b3a17
chore(dependencies): Update Go version and various dependencies in go…
SkySingh04 Apr 16, 2025
34fbaaf
chore(dependencies): Add indirect dependencies for Litmus SDK and upd…
SkySingh04 Apr 16, 2025
684d341
chore(experiments): Enhance pod-delete experiment with SDK infrastruc…
SkySingh04 Apr 16, 2025
de5dcf3
chore(experiments): Update pod-delete experiment to utilize Litmus SD…
SkySingh04 Apr 16, 2025
ef11f9d
feat(experiments): Refactor pod-delete experiment request constructio…
SkySingh04 Apr 18, 2025
8dc090a
feat(experiments): Enhance chaos experiments with Litmus SDK integrat…
SkySingh04 Apr 18, 2025
edcd62a
feat(infrastructure): Refactor chaos experiment infrastructure manage…
SkySingh04 Apr 27, 2025
dde1453
feat(experiments): Refactor chaos experiment tests to utilize new inf…
SkySingh04 Apr 27, 2025
f424090
refactor(experiments): Update pod-cpu-hog and pod-delete tests to use…
SkySingh04 Apr 27, 2025
5f7c13f
feat(experiments): Update experiment timeout and polling interval con…
SkySingh04 Apr 27, 2025
1e8e637
feat(experiments): Refactor pod-delete experiment request to use Argo…
SkySingh04 Apr 28, 2025
2626386
feat(experiments): Add NODES_AFFECTED_PERC environment variable to no…
SkySingh04 Apr 28, 2025
d97c3b2
feat(experiments): Refactor experiment phase checks to use utility fu…
SkySingh04 Apr 28, 2025
8ba45c3
chore(dependencies): Update go.mod and go.sum to include new dependen…
SkySingh04 May 4, 2025
a3284a6
refactor(experiments): Update pod-delete experiment test and clientse…
SkySingh04 May 4, 2025
3e669c3
refactor(experiments): Enhance pod-delete experiment request construc…
SkySingh04 May 4, 2025
2879407
chore(dependencies): Update go.mod and go.sum for litmus-go-sdk
SkySingh04 May 6, 2025
97b8356
refactor(experiments): Improve pod-delete experiment request handling
SkySingh04 May 6, 2025
e2e8edb
refactor(experiments): Update pod-delete experiment test for improved…
SkySingh04 May 6, 2025
23d8d8a
refactor(experiments): Enhance pod-delete experiment request construc…
SkySingh04 May 6, 2025
000f942
refactor(experiments): Modularize pod-delete experiment request const…
SkySingh04 May 6, 2025
69f8b94
refactor(experiments): Expand experiment configuration and manifest g…
SkySingh04 May 7, 2025
37615f3
refactor(experiments): Update pod CPU and memory hog tests for improv…
SkySingh04 May 7, 2025
b3896a8
refactor(experiments): Enhance experiment configuration with probe su…
SkySingh04 May 7, 2025
29b7ac8
refactor(experiments): Enhance network chaos experiments with modular…
SkySingh04 May 7, 2025
0fd5fd0
refactor(experiments): Modularize chaos experiment request constructi…
SkySingh04 May 7, 2025
d973922
refactor(experiments): Implement polling for experiment run availabil…
SkySingh04 May 7, 2025
f7ffc59
refactor(experiments): Replace ioutil with io for reading data in cha…
SkySingh04 May 7, 2025
9b4832d
refactor(experiments): Add probe configuration support to pod-delete …
SkySingh04 May 10, 2025
b5ca79d
refactor(experiments): Integrate probe setup and cleanup in chaos exp…
SkySingh04 May 10, 2025
98a3239
refactor(experiments): Remove NODE_LABEL from chaos experiment enviro…
SkySingh04 May 10, 2025
699090c
refactor(experiments): Update chaos experiment tests and request cons…
SkySingh04 May 10, 2025
c966ab1
refactor(workflow): Enhance NODE_LABEL handling in experiment manifests
SkySingh04 May 10, 2025
3201a4b
refactor(experiments): Integrate Litmus SDK client in chaos experimen…
SkySingh04 May 11, 2025
03e7186
refactor(experiments): Integrate Litmus SDK client across chaos exper…
SkySingh04 May 11, 2025
2223642
refactor(workflow): Improve NODE_LABEL handling in experiment manifests
SkySingh04 May 11, 2025
9207e7a
feat(environment): Add environment variable support for chaos CI setup
SkySingh04 May 12, 2025
95a0dcb
refactor(workflow): Update default memory consumption values and enha…
SkySingh04 May 13, 2025
da1ce29
chore(dependencies): Update litmus-go-sdk version in go.mod and go.sum
SkySingh04 May 13, 2025
2c4108a
refactor(experiments): Remove HTTP fetching and YAML parsing from cha…
SkySingh04 May 18, 2025
1eee72d
refactor(experiments): Remove manual setting of ConnectedInfraID in c…
SkySingh04 May 18, 2025
95 changes: 95 additions & 0 deletions README.md
@@ -35,6 +35,101 @@ Litmus supports CI plugin for the following CI platforms:
</tr>
</table>

## Environment Variables

Chaos CI Lib uses standardized environment variables to configure environments, infrastructure, and probes. Below is the comprehensive list of supported environment variables.

### Litmus SDK & Authentication

| Variable | Description | Default | Example |
|----------|-------------|---------|---------|
| `LITMUS_ENDPOINT` | Litmus server endpoint URL | `""` | `https://chaos.example.com` |
| `LITMUS_USERNAME` | Username for Litmus authentication | `""` | `admin` |
| `LITMUS_PASSWORD` | Password for Litmus authentication | `""` | `litmus` |
| `LITMUS_PROJECT_ID` | ID of the Litmus project to use | `""` | `project-123` |

### Environment Management Variables

| Variable | Description | Default | Example |
|----------|-------------|---------|---------|
| `CREATE_ENV` | Whether to create a new environment | `true` | `false` |
| `USE_EXISTING_ENV` | Whether to use an existing environment | `false` | `true` |
| `EXISTING_ENV_ID` | ID of existing environment (required if `USE_EXISTING_ENV=true`) | `""` | `env-123456` |
| `ENV_NAME` | Name for the new environment | `chaos-ci-env` | `my-k8s-env` |
| `ENV_TYPE` | Type of environment to create | `NON_PROD` | `PROD` |
| `ENV_DESCRIPTION` | Description of the environment | `CI Test Environment` | `Production Test Environment` |

### Infrastructure Management Variables

| Variable | Description | Default | Example |
|----------|-------------|---------|---------|
| `INSTALL_INFRA` | Whether to install infrastructure | `true` | `false` |
| `USE_EXISTING_INFRA` | Whether to use existing infrastructure | `false` | `true` |
| `EXISTING_INFRA_ID` | ID of existing infrastructure (required if `USE_EXISTING_INFRA=true`) | `""` | `infra-123456` |
| `INFRA_NAME` | Name for the infrastructure | `ci-infra-{expName}` | `my-k8s-infra` |
| `INFRA_NAMESPACE` | Kubernetes namespace for infrastructure | `litmus` | `chaos-testing` |
| `INFRA_SCOPE` | Scope of infrastructure | `namespace` | `cluster` |
| `INFRA_SERVICE_ACCOUNT` | Service account for infrastructure | `litmus` | `chaos-runner` |
| `INFRA_DESCRIPTION` | Description of infrastructure | `CI Test Infrastructure` | `Production Test Infra` |
| `INFRA_PLATFORM_NAME` | Platform name | `others` | `gcp` |
| `INFRA_NS_EXISTS` | Whether namespace already exists | `false` | `true` |
| `INFRA_SA_EXISTS` | Whether service account already exists | `false` | `true` |
| `INFRA_SKIP_SSL` | Whether to skip SSL verification | `false` | `true` |
| `INFRA_NODE_SELECTOR` | Node selector for infrastructure | `""` | `disk=ssd` |
| `INFRA_TOLERATIONS` | Tolerations for infrastructure | `""` | `key=value:NoSchedule` |
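
The example values for `INFRA_NODE_SELECTOR` (`disk=ssd`) and `INFRA_TOLERATIONS` (`key=value:NoSchedule`) suggest simple string formats. The Go sketch below shows one way such strings might be parsed. It is not taken from the PR: `Toleration`, `parseNodeSelector`, and `parseToleration` are hypothetical names, the comma-separated form for several selectors is an assumption, and `strings.Cut` requires Go 1.18 or newer.

```go
package main

import (
	"fmt"
	"strings"
)

// Toleration is a simplified stand-in for a Kubernetes toleration.
// (Hypothetical type, not the library's.)
type Toleration struct {
	Key, Value, Effect string
}

// parseNodeSelector turns "disk=ssd" (or, by assumption, "a=b,c=d")
// into a label map. An empty string yields an empty map.
func parseNodeSelector(s string) (map[string]string, error) {
	out := map[string]string{}
	if strings.TrimSpace(s) == "" {
		return out, nil
	}
	for _, pair := range strings.Split(s, ",") {
		k, v, ok := strings.Cut(strings.TrimSpace(pair), "=")
		if !ok || k == "" {
			return nil, fmt.Errorf("invalid node selector entry %q", pair)
		}
		out[k] = v
	}
	return out, nil
}

// parseToleration parses the "key=value:Effect" form shown in the table.
func parseToleration(s string) (Toleration, error) {
	kv, effect, ok := strings.Cut(strings.TrimSpace(s), ":")
	if !ok || effect == "" {
		return Toleration{}, fmt.Errorf("toleration %q is missing an effect", s)
	}
	k, v, ok := strings.Cut(kv, "=")
	if !ok || k == "" {
		return Toleration{}, fmt.Errorf("toleration %q is missing key=value", s)
	}
	return Toleration{Key: k, Value: v, Effect: effect}, nil
}

func main() {
	sel, err := parseNodeSelector("disk=ssd")
	fmt.Println(sel, err)
	tol, err := parseToleration("key=value:NoSchedule")
	fmt.Println(tol, err)
}
```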

### Probe Management Variables

| Variable | Description | Default | Example |
|----------|-------------|---------|---------|
| `LITMUS_CREATE_PROBE` | Whether to create a probe | `false` | `true` |
| `LITMUS_PROBE_NAME` | Name of the probe | `http-probe` | `http-status-check` |
| `LITMUS_PROBE_TYPE` | Type of probe | `httpProbe` | `httpProbe` |
| `LITMUS_PROBE_MODE` | Mode of the probe | `SOT` | `Continuous` |
| `LITMUS_PROBE_URL` | URL for HTTP probe | `http://localhost:8080/health` | `http://app:8080/health` |
| `LITMUS_PROBE_TIMEOUT` | Timeout for probe | `30s` | `5s` |
| `LITMUS_PROBE_INTERVAL` | Interval for probe | `10s` | `5s` |
| `LITMUS_PROBE_ATTEMPTS` | Number of attempts for probe | `1` | `3` |
| `LITMUS_PROBE_RESPONSE_CODE` | Expected HTTP response code | `200` | `200` |

### Example Usage

To create a new environment and infrastructure:
```bash
# Authentication
export LITMUS_ENDPOINT="https://chaos.example.com"
export LITMUS_USERNAME="admin"
export LITMUS_PASSWORD="litmus"
export LITMUS_PROJECT_ID="project-123"

# Environment setup
export CREATE_ENV="true"
export ENV_NAME="test-environment"
export ENV_TYPE="NON_PROD"

# Infrastructure setup
export INSTALL_INFRA="true"
export INFRA_NAME="test-infra"
export INFRA_NAMESPACE="chaos-testing"
export INFRA_SCOPE="namespace"

# Optional probe setup
export LITMUS_CREATE_PROBE="true"
export LITMUS_PROBE_NAME="http-status-check"
export LITMUS_PROBE_TYPE="httpProbe"
export LITMUS_PROBE_URL="http://app:8080/health"
export LITMUS_PROBE_RESPONSE_CODE="200"
```

To use an existing environment and infrastructure:
```bash
# Set environment variables for existing resources
export USE_EXISTING_ENV="true"
export EXISTING_ENV_ID="env-123456"
export USE_EXISTING_INFRA="true"
export EXISTING_INFRA_ID="infra-789012"
```

## How to get started?

Refer to the [LitmusChaos Docs](https://docs.litmuschaos.io) and [Experiment Docs](https://litmuschaos.github.io/litmus/experiments/categories/contents/)
190 changes: 138 additions & 52 deletions experiments/container-kill_test.go
@@ -1,17 +1,21 @@
package experiments

import (
"fmt"
"testing"
"time"

"github.com/litmuschaos/chaos-ci-lib/pkg"
"github.com/litmuschaos/chaos-ci-lib/pkg/environment"
engine "github.com/litmuschaos/chaos-ci-lib/pkg/generic/container-kill/lib"
"github.com/litmuschaos/chaos-ci-lib/pkg/log"
"github.com/litmuschaos/chaos-ci-lib/pkg/infrastructure"
"github.com/litmuschaos/chaos-ci-lib/pkg/types"
"github.com/litmuschaos/chaos-operator/pkg/apis/litmuschaos/v1alpha1"
"github.com/litmuschaos/chaos-ci-lib/pkg/workflow"
"github.com/litmuschaos/litmus-go-sdk/pkg/sdk"
models "github.com/litmuschaos/litmus/chaoscenter/graphql/server/graph/model"
. "github.com/onsi/ginkgo"
. "github.com/onsi/gomega"
_ "k8s.io/client-go/plugin/pkg/client/auth/gcp"
"k8s.io/klog"
)

func TestContainerKill(t *testing.T) {
@@ -22,62 +26,144 @@ func TestContainerKill(t *testing.T) {
//BDD for running container-kill experiment
var _ = Describe("BDD of running container-kill experiment", func() {

Context("Check for container-kill experiment", func() {
Context("Check for container-kill experiment via SDK", func() {
// Define variables accessible to It and AfterEach
var (
experimentsDetails types.ExperimentDetails
sdkClient sdk.Client
err error
)

It("Should check for the pod delete experiment", func() {

experimentsDetails := types.ExperimentDetails{}
clients := environment.ClientSets{}
chaosEngine := v1alpha1.ChaosEngine{}

//Getting kubeConfig and Generate ClientSets
By("[PreChaos]: Getting kubeconfig and generate clientset")
err := clients.GenerateClientSetFromKubeConfig()
Expect(err).To(BeNil(), "Unable to Get the kubeconfig, due to {%v}", err)
BeforeEach(func() {
experimentsDetails = types.ExperimentDetails{}
err = nil

//Fetching all the default ENV
By("[PreChaos]: Fetching all default ENVs")
log.Infof("[PreReq]: Getting the ENVs for the %v experiment", experimentsDetails.ExperimentName)
klog.Infof("[PreReq]: Getting the ENVs for the %v experiment", experimentsDetails.ExperimentName)
environment.GetENV(&experimentsDetails, "container-kill", "container-kill-engine")

// Install RBAC for experiment Execution
By("[Prepare]: Prepare and install RBAC")
err = pkg.InstallRbac(&experimentsDetails, experimentsDetails.ChaosNamespace)
Expect(err).To(BeNil(), "fail to install rbac for the experiment, due to {%v}", err)

// Install ChaosEngine for experiment Execution
By("[Prepare]: Prepare and install ChaosEngine")
err = engine.InstallContainerKillEngine(&experimentsDetails, &chaosEngine, clients)
Expect(err).To(BeNil(), "fail to install chaosengine, due to {%v}", err)

//Checking runner pod running state
By("[Status]: Runner pod running status check")
err = pkg.RunnerPodStatus(&experimentsDetails, chaosEngine.Namespace, clients)
if err != nil && chaosEngine.Namespace != experimentsDetails.AppNS {
err = pkg.RunnerPodStatus(&experimentsDetails, experimentsDetails.AppNS, clients)
// Initialize SDK client
By("[PreChaos]: Initializing SDK client")
sdkClient, err = environment.GenerateClientSetFromSDK()
Expect(err).To(BeNil(), "Unable to generate Litmus SDK client, due to {%v}", err)

// Setup infrastructure
By("[PreChaos]: Setting up infrastructure")
err = infrastructure.SetupInfrastructure(&experimentsDetails, sdkClient)
Expect(err).To(BeNil(), "Failed to setup infrastructure, due to {%v}", err)

// Validate that infrastructure ID is properly set
Expect(experimentsDetails.ConnectedInfraID).NotTo(BeEmpty(), "Setup failed: ConnectedInfraID is empty after connection attempt.")

// Setup probe if configured to do so
if experimentsDetails.CreateProbe {
By("[PreChaos]: Setting up probe")
err = workflow.CreateProbe(&experimentsDetails, sdkClient, experimentsDetails.LitmusProjectID)
Expect(err).To(BeNil(), "Failed to create probe, due to {%v}", err)
// Validate that probe was created successfully
Expect(experimentsDetails.CreatedProbeID).NotTo(BeEmpty(), "Probe creation failed: CreatedProbeID is empty")
}
Expect(err).To(BeNil(), "Runner pod status check failed, due to {%v}", err)

//Chaos pod running status check
err = pkg.ChaosPodStatus(&experimentsDetails, clients)
Expect(err).To(BeNil(), "Chaos pod status check failed, due to {%v}", err)

//Waiting for chaos pod to get completed
//And Print the logs of the chaos pod
By("[Status]: Wait for chaos pod completion and then print logs")
err = pkg.ChaosPodLogs(&experimentsDetails, clients)
Expect(err).To(BeNil(), "Fail to get the experiment chaos pod logs, due to {%v}", err)

//Checking the chaosresult verdict
By("[Verdict]: Checking the chaosresult verdict")
err = pkg.ChaosResultVerdict(&experimentsDetails, clients)
Expect(err).To(BeNil(), "ChasoResult Verdict check failed, due to {%v}", err)

//Checking chaosengine verdict
By("Checking the Verdict of Chaos Engine")
err = pkg.ChaosEngineVerdict(&experimentsDetails, clients)
Expect(err).To(BeNil(), "ChaosEngine Verdict check failed, due to {%v}", err)
})

It("Should run the container kill experiment via SDK", func() {

// Ensure pre-checks passed from BeforeEach
Expect(err).To(BeNil(), "Error during BeforeEach setup: %v", err)
klog.Info("Executing V3 SDK Path for Experiment")


// 1. Construct Experiment Request
By("[SDK Prepare]: Constructing Chaos Experiment Request")
experimentName := pkg.GenerateUniqueExperimentName("container-kill")
experimentsDetails.ExperimentName = experimentName
experimentID := pkg.GenerateExperimentID()
experimentRequest, errConstruct := workflow.ConstructContainerKillExperimentRequest(&experimentsDetails, experimentID, experimentName)
Expect(errConstruct).To(BeNil(), "Failed to construct experiment request: %v", errConstruct)

// 2. Create and Run Experiment via SDK
By("[SDK Prepare]: Creating and Running Chaos Experiment")
createResponse, err := sdkClient.Experiments().Create(experimentsDetails.LitmusProjectID, *experimentRequest)
Expect(err).To(BeNil(), "Failed to create experiment via SDK: %v", err)
klog.Infof("Created experiment: %s", createResponse)

// 3. Get the experiment run ID
By("[SDK Query]: Polling for experiment run to become available")
var experimentRunID string
maxRetries := 10
found := false

for i := 0; i < maxRetries; i++ {
time.Sleep(3 * time.Second)

listExperimentRunsReq := models.ListExperimentRunRequest{
ExperimentIDs: []*string{&experimentID},
}

runsList, err := sdkClient.Experiments().ListRuns(listExperimentRunsReq)
if err != nil {
klog.Warningf("Error fetching experiment runs: %v", err)
continue
}

klog.Infof("Attempt %d: Found %d experiment runs", i+1,
len(runsList.ExperimentRuns))

if len(runsList.ExperimentRuns) > 0 {
experimentRunID = runsList.ExperimentRuns[0].ExperimentRunID
klog.Infof("Found experiment run ID: %s", experimentRunID)
found = true
break
}

klog.Infof("Retrying after delay...")
}

Expect(found).To(BeTrue(), "No experiment runs found for experiment after %d retries", maxRetries)

// 4. Poll for Experiment Run Status
By("[SDK Status]: Polling for Experiment Run Status")
var finalPhase string
var pollError error
timeout := time.After(time.Duration(experimentsDetails.ExperimentTimeout) * time.Minute)
ticker := time.NewTicker(time.Duration(experimentsDetails.ExperimentPollingInterval) * time.Second)
defer ticker.Stop()

pollLoop:
for {
select {
case <-timeout:
pollError = fmt.Errorf("timed out waiting for experiment run %s to complete after %d minutes", experimentRunID, experimentsDetails.ExperimentTimeout)
klog.Error(pollError)
break pollLoop
case <-ticker.C:
phase, errStatus := sdkClient.Experiments().GetRunPhase(experimentRunID)
if errStatus != nil {
klog.Errorf("Error fetching experiment run status for %s: %v", experimentRunID, errStatus)
continue
}
klog.Infof("Experiment Run %s current phase: %s", experimentRunID, phase)
finalPhases := []string{"Completed", "Completed_With_Error", "Failed", "Error", "Stopped", "Skipped", "Aborted", "Timeout", "Terminated"}
if pkg.ContainsString(finalPhases, phase) {
finalPhase = phase
klog.Infof("Experiment Run %s reached final phase: %s", experimentRunID, phase)
break pollLoop
}
}
}

// 5. Post Validation / Verdict Check
By("[SDK Verdict]: Checking Experiment Run Verdict")
Expect(pollError).To(BeNil())
Expect(finalPhase).NotTo(BeEmpty(), "Final phase should not be empty after polling")
Expect(finalPhase).To(Equal("Completed"), fmt.Sprintf("Experiment Run phase should be Completed, but got %s", finalPhase))
})
// Cleanup using AfterEach
AfterEach(func() {
// Disconnect infrastructure using the new module
By("[CleanUp]: Cleaning up infrastructure")
errDisconnect := infrastructure.DisconnectInfrastructure(&experimentsDetails, sdkClient)
Expect(errDisconnect).To(BeNil(), "Failed to clean up infrastructure, due to {%v}", errDisconnect)
})
})
})
})