Refactor CI-Lib Implementation to Support Litmus 3.0 #198

Merged
merged 90 commits into from
Jun 3, 2025

SkySingh04
Contributor

@SkySingh04 SkySingh04 commented Apr 13, 2025

This pull request refactors the Chaos CI Library to work with the Litmus 3.0 approach, shifting from direct Kubernetes API calls to utilizing the Litmus Go SDK.

Changes

Code Modernization

  • Updated Go version from 1.14 to 1.24.0
  • Upgraded and cleaned up dependencies including k8s.io/api, k8s.io/apimachinery, and ginkgo
  • Added github.com/litmuschaos/litmus-go-sdk as a core dependency

Architecture Changes

  • Replaced direct Kubernetes and Litmus client initialization with the Litmus SDK client
  • Updated ClientSet generation to work with the SDK using environment variables
  • Added proper error handling for missing configuration and initialization failures
  • Added NetworkPacketCorruptionPercentage field to the ExperimentDetails struct

Test Suite Refactoring

  • Completely refactored all experiment test files to use the SDK-based approach:
    • node-io-stress_test.go
    • node-memory-hog_test.go
    • pod-autoscaler_test.go
    • pod-network-corruption_test.go
    • pod-network-duplication_test.go
    • pod-network-loss_test.go
    • pod-memory-hog_test.go
    • pod-network-latency_test.go
  • Updated experiment template retrieval to use external hub sources
  • Implemented consistent structure across all test files with connection, execution, and cleanup phases

Functionality Improvements

  • Increased default timeout from 180 to 300 seconds for better reliability
  • Refactored status checking functions to use the SDK's GetRunPhase method
  • Improved experiment state monitoring with enhanced logging
  • Introduced helper functions for retrieving experiment and application statuses
  • Updated all status-checking functions to align with the new status retrieval methods

The refactoring maintains the same functionality while leveraging the more robust and maintainable SDK approach, reducing direct dependency on the Kubernetes API.

@SkySingh04 SkySingh04 changed the title feat : Refactor CI-Lib implementation to 3.0 approach WIP : Refactor CI-Lib Implementation to Support Litmus 3.0 Apr 13, 2025
@SkySingh04 SkySingh04 force-pushed the refactor_to_3.0 branch 3 times, most recently from dcff924 to 0a398d4 Compare April 16, 2025 21:57
@SkySingh04 SkySingh04 changed the title WIP : Refactor CI-Lib Implementation to Support Litmus 3.0 Refactor CI-Lib Implementation to Support Litmus 3.0 Apr 18, 2025
@ispeakc0de ispeakc0de requested a review from uditgaurav May 19, 2025 10:24
….mod and go.sum

- Bump Go version to 1.24.0 and toolchain to 1.24.1
- Update dependencies for litmuschaos components and testing libraries
- Add new indirect dependencies for improved functionality and compatibility
- Remove outdated versions and clean up go.sum

Signed-off-by: [Your Name] [Your Email]
Signed-off-by: Sky Singh <[email protected]>
…ate go.mod/go.sum

- Introduce github.com/imdario/mergo v0.3.9 and github.com/spf13/pflag v1.0.5 as indirect dependencies
- Enhance ClientSets structure by adding SDKClient for Litmus SDK integration
- Implement GenerateClientSetFromSDK method to initialize Litmus SDK client with environment variables

Signed-off-by: [Your Name] [Your Email]
Signed-off-by: Sky Singh <[email protected]>
…ture connection

- Add support for connecting to ChaosCenter infrastructure via SDK in the pod-delete experiment.
- Introduce new environment variables for infrastructure configuration.
- Refactor test setup to include infrastructure connection checks and cleanup.
- Update go.mod to include the necessary Litmus SDK dependency.

Signed-off-by: [Your Name] [Your Email]
Signed-off-by: Sky Singh <[email protected]>
…K for experiment execution

- Integrate the use of the Litmus SDK for creating and running the pod-delete experiment.
- Refactor test setup to include infrastructure connection and experiment request construction.
- Add new helper function to construct experiment requests with proper manifest YAML.
- Update go.mod and go.sum to include the latest version of github.com/google/uuid.

Signed-off-by: Sky Singh <[email protected]>
…n to fetch template from external source

- Introduce HTTP fetching of the engine template for the pod-delete experiment.
- Update the manifest construction logic to modify the fetched YAML template with experiment details.
- Replace the placeholder manifest with a dynamic template that adapts based on input parameters.
- Enhance error handling for template fetching and parsing.

Signed-off-by: Sky Singh <[email protected]>
…ion and dependency updates

- Add support for the `github.com/google/uuid` package to generate unique identifiers in experiments.
- Update experiment test files to utilize the Litmus SDK for constructing and executing chaos experiments.
- Refactor YAML manifest construction logic to dynamically set experiment parameters based on input details.
- Introduce new environment variables and enhance error handling for manifest generation.
- Update `go.mod` to include the latest version of `github.com/google/uuid` and `sigs.k8s.io/yaml`.

Signed-off-by: Sky Singh <[email protected]>
…ment with new module

- Introduce a new `infrastructure` package to handle setup and disconnection of infrastructure for chaos experiments.
- Update experiment test files to utilize the new infrastructure management functions, improving code clarity and maintainability.
- Add new environment variables for controlling infrastructure installation and usage of existing infrastructure.
- Update `go.mod` and `go.sum` to reflect changes in dependencies.

Signed-off-by: Sky Singh <[email protected]>
…rastructure management module

- Update all experiment test files to replace direct SDK infrastructure connection logic with the new `infrastructure` module for setup and cleanup.
- Enhance code clarity and maintainability by centralizing infrastructure management.
- Validate infrastructure ID after setup to ensure proper connection.
- Ensure consistent error handling across all experiments during infrastructure operations.

Signed-off-by: Sky Singh <[email protected]>
… final phase string

- Replace usage of final status object with final phase string for polling experiment run status.
- Enhance validation checks to ensure final phase is not empty and equals "Completed" after polling.
- Improve code clarity and maintainability by simplifying the status handling in both experiment tests.

Signed-off-by: Sky Singh <[email protected]>
SkySingh04 added 20 commits May 30, 2025 20:02
- Renamed multiple `LITMUS_PROBE_NAME` variables to include a version suffix "2" for consistency across chaos experiment jobs.
- This change ensures unique identification of probes and aligns with recent updates in the deployment configurations.

Signed-off-by: Sky Singh <[email protected]>
…Actions workflow

- Removed an incomplete line for `APP_LABEL` in the workflow configuration.
- Ensured proper formatting and consistency in the environment variable definitions for the Litmus deployment.

Signed-off-by: Sky Singh <[email protected]>
…ents

- Introduced new `APP_LABEL` environment variable in the GitHub Actions workflow for various chaos experiments to enhance identification and management.
- Updated the `GetDefaultExperimentConfig` function to retrieve `APP_LABEL` from environment variables, ensuring flexibility and consistency across deployments.

Signed-off-by: Sky Singh <[email protected]>
…ions workflow

- Renamed multiple `LITMUS_PROBE_NAME` variables from "ci-http-probe-2" to "ci-http-probe-3" across various jobs to reflect the new versioning scheme.
- This change ensures unique identification of probes and maintains consistency in chaos experiment configurations.

Signed-off-by: Sky Singh <[email protected]>
… jobs

- Eliminated the redundant uninstall step for Litmus across various chaos experiment jobs in the GitHub Actions workflow to streamline the process.
- This change enhances the efficiency of the workflow by reducing unnecessary commands during execution.

Signed-off-by: Sky Singh <[email protected]>
…ions workflow

- Renamed multiple `LITMUS_PROBE_NAME` variables from "ci-http-probe-3" to "ci-http-probe-4" across various jobs to reflect the new versioning scheme.
- This change ensures unique identification of probes and maintains consistency in chaos experiment configurations.

Signed-off-by: Sky Singh <[email protected]>
…Hub Actions workflow

- Replaced GitHub secrets with hardcoded values for `LITMUS_ENDPOINT`, `LITMUS_USERNAME`, `LITMUS_PASSWORD`, and `LITMUS_PROJECT_ID` across multiple jobs to facilitate local development and testing.
- Adjusted `LITMUS_PROBE_NAME` variables to remove version suffixes for clarity and consistency in chaos experiment configurations.

Signed-off-by: Sky Singh <[email protected]>
…ub Actions workflow

- Updated the `INFRA_SCOPE` environment variable from "namespace" to "cluster" across multiple jobs to reflect the new infrastructure configuration for chaos experiments.
- This change ensures that the chaos experiments are executed at the cluster level, enhancing the scope of testing.

Signed-off-by: Sky Singh <[email protected]>
…ments

- Introduced new environment variables `SOCKET_PATH` and `CONTAINER_RUNTIME` in the GitHub Actions workflow to support containerd as the runtime for chaos experiments.
- Updated the `GetDefaultExperimentConfig` function to retrieve these values from the environment, ensuring flexibility in configuration.
- Adjusted experiment manifests to utilize the new variables for socket path and container runtime, enhancing the adaptability of chaos experiments.

Signed-off-by: Sky Singh <[email protected]>
- Replaced hardcoded values for `LITMUS_ENDPOINT`, `LITMUS_USERNAME`, `LITMUS_PASSWORD`, and `LITMUS_PROJECT_ID` with GitHub secrets to enhance security and protect sensitive information.
- This change improves the security posture of the workflow by ensuring that sensitive credentials are not exposed in the codebase.

Signed-off-by: Sky Singh <[email protected]>
- Changed the `LITMUS_PROBE_NAME` variable from "ci-http-probe-container-kill" to "ci-http-probe-99-container-kill" in the GitHub Actions workflow to reflect the new naming convention.
- This update ensures consistency in probe identification for chaos experiments.

Signed-off-by: Sky Singh <[email protected]>
…esting

- Replaced GitHub secrets for `LITMUS_ENDPOINT`, `LITMUS_USERNAME`, `LITMUS_PASSWORD`, and `LITMUS_PROJECT_ID` with hardcoded values to facilitate local development and testing.
- Updated the `LITMUS_PROBE_NAME` from "ci-http-probe-99-container-kill" to "ci-http-probe-container-kill" for consistency in chaos experiment configurations.

Signed-off-by: Sky Singh <[email protected]>
- Changed the `LITMUS_PROBE_NAME` from "ci-http-probe-container-kill" to "ci-http-probe-9-container-kill" in the GitHub Actions workflow to align with the new naming convention.
- This update ensures consistency in probe identification for chaos experiments.

Signed-off-by: Sky Singh <[email protected]>
…itHub Actions

- Replaced hardcoded values for `LITMUS_ENDPOINT`, `LITMUS_USERNAME`, `LITMUS_PASSWORD`, and `LITMUS_PROJECT_ID` with GitHub secrets to enhance security.
- Updated the `LITMUS_PROBE_NAME` from "ci-http-probe-9-container-kill" to "ci-http-probe-99-container-kill" for consistency in chaos experiment configurations.

Signed-off-by: Sky Singh <[email protected]>
- Changed the `LITMUS_PROBE_NAME` from "ci-http-probe-99-container-kill" to "ci-http-probe-88-container-kill" in the GitHub Actions workflow to align with the new naming convention.
- This update ensures consistency in probe identification for chaos experiments.

Signed-off-by: Sky Singh <[email protected]>
- Added configuration for `configMaps`, `secrets`, and `hostFileVolumes` to the workflow, allowing for better management of resources during chaos experiments.
- Updated `securityContext` to enable privileged access and additional capabilities, enhancing the flexibility and control over the execution environment.

Signed-off-by: Sky Singh <[email protected]>
- Changed the `LITMUS_PROBE_NAME` from "ci-http-probe-88-container-kill" to "ci-http-probe-78-container-kill" in the GitHub Actions workflow to maintain alignment with the updated naming convention.
- This update ensures consistency in probe identification for chaos experiments.

Signed-off-by: Sky Singh <[email protected]>
…fy probe name

- Replaced GitHub secrets for `LITMUS_ENDPOINT`, `LITMUS_USERNAME`, `LITMUS_PASSWORD`, and `LITMUS_PROJECT_ID` with hardcoded values to facilitate local development and testing.
- Updated the `LITMUS_PROBE_NAME` from "ci-http-probe-78-container-kill" to "ci-http-probe-container-kill" for consistency in chaos experiment configurations.

Signed-off-by: Sky Singh <[email protected]>
…files

- Added spaces after comment slashes for better readability in various experiment test files, including container-kill, disk-fill, node-cpu-hog, and others.
- Ensured consistent formatting across all test files to enhance code clarity and maintainability.

Signed-off-by: Sky Singh <[email protected]>
…ent test code

- Changed the release workflow trigger from `create` to `push` for tags matching the pattern `v*`, ensuring proper versioning during releases.
- Removed unused code and debug log functions from the container-kill experiment test file to enhance clarity and maintainability.
- Updated default application namespace and label in the experiment configuration for better alignment with standard practices.

Signed-off-by: Sky Singh <[email protected]>
@Jonsy13
Contributor

Jonsy13 commented Jun 2, 2025

@SkySingh04
Can you please update go version for PR checks in build.yaml?
Looks like github workflow is not giving error even though logs suggest that they actually failed -

PreCheck -
Screenshot 2025-06-02 at 10 42 02 AM

Go/Docker Build -
Screenshot 2025-06-02 at 10 42 19 AM

- Updated the Go version from 1.14 to 1.24 in the build, push, and release workflows to ensure compatibility with the latest features and improvements.
- This change enhances the development environment and aligns with current best practices.

Signed-off-by: Sky Singh <[email protected]>
- Removed direct requirement for sigs.k8s.io/yaml v1.4.0 and added it as an indirect dependency to streamline module management.
- This change helps maintain cleaner dependency tracking in the go.mod file.

Signed-off-by: Sky Singh <[email protected]>
- Updated log formatting in the Kubectl function for better readability.
- Replaced the previous string replacement calls with `strings.ReplaceAll` in the EditFile and EditKeyValue functions for readability.
- Enhanced error handling by wrapping deferred close calls in anonymous functions to ensure proper error logging in DownloadFile, createInfrastructureViaRegisterInfra, applyInfrastructureManifest, and checkInfrastructureStatusViaGraphQL functions.
- Simplified condition check in PodStatusCheck for better clarity.

Signed-off-by: Sky Singh <[email protected]>
- Introduced a step to download and verify Go module dependencies in the CI workflow, ensuring all required packages are available for the build process.
- This enhancement improves the reliability of the build environment by ensuring dependencies are properly managed.

Signed-off-by: Sky Singh <[email protected]>
…ncies

- Updated Ginkgo from v1.16.5 to v2.19.0 for improved features and compatibility.
- Added new indirect dependencies including `github.com/go-logr/logr`, `github.com/go-task/slim-sprig/v3`, and `github.com/google/pprof`.
- Ensured consistency in dependency management within go.mod and go.sum files.

Signed-off-by: Sky Singh <[email protected]>
- Introduced a new `.golangci.yml` file to configure GolangCI-Lint for the project.
- Set the version to 2 and disabled test file analysis to streamline linting processes.

Signed-off-by: Sky Singh <[email protected]>
@SkySingh04
Contributor Author

@Jonsy13 Your requested changes have been made!

@Jonsy13 Jonsy13 merged commit cdbcbfc into litmuschaos:master Jun 3, 2025
18 checks passed