code-kern-ai/cicd-deployment-scripts

cicd-deployment-scripts

Scripts used for Kern AI CI/CD efforts.

Table of Contents

  • ACR: Delete Docker Image
  • ACR: Docker Push
  • ACR: Docker Push Release
  • ACR: Docker Push Test
  • Azure: Function App Deployment
  • GitHub: Delete Branch
  • GitHub: Release
  • GitHub: Validate Release
  • GitHub: Workflow Wait
  • K8: Apply
  • K8: Database Downgrade
  • K8: Database Upgrade
  • K8: Cluster Deploy
  • K8: Destroy
  • K8: Edit
  • K8: Execution Environments
  • K8: Inference Reload Config
  • K8: Release
  • K8: Restart
  • K8: Test
  • OpenTofu: Release
  • OpenTofu: Generate Docs
  • OpenTofu: Plan/Apply

GitHub Actions

ACR: Delete Docker Image

Workflow file: az_acr_delete.yml

Triggers:

  • workflow_call

Inputs:

  • image_tag

Description:

  • deletes Container Images specified by the workflow input

Jobs:

  • ACR: Delete Test Image
    • Configure branch name
    • Delete Test Container Image

ACR: Docker Push

Workflow file: az_acr_push.yml

Triggers:

  • workflow_dispatch
  • push

Description:

  • before the push, the branch name is resolved by replacing / with -, and the image is built and tagged with the resolved branch name
  • builds and deploys Docker images in multiple steps

Jobs:

  • Docker: Build & Push
    • Configure branch name
    • Build & Push <application-repo>:${{ matrix.platform }}-<feature-hotfix>
    • Build & Push <application-repo>:${{ matrix.platform }}-gpu
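The branch name resolution described above can be sketched in a few lines. This is a minimal illustration, not the workflow's actual step: the function names `resolve_branch_name` and `image_tag` are hypothetical, and the real workflow presumably does the equivalent inside its "Configure branch name" step. The replacement is needed because Docker image tags may not contain `/`.

```python
def resolve_branch_name(ref_name: str) -> str:
    """Replace every '/' so the branch name can be used inside an image tag."""
    return ref_name.replace("/", "-")


def image_tag(platform: str, ref_name: str) -> str:
    """Compose a '<platform>-<resolved-branch>' tag (hypothetical helper)."""
    return f"{platform}-{resolve_branch_name(ref_name)}"
```

Under these assumptions, a branch called `feature/login-form` built for `arm64` would be tagged `arm64-feature-login-form`.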

ACR: Docker Push Release

Workflow file: az_acr_release.yml

Triggers:

  • workflow_dispatch
  • pull_request_closed
  • release

Description:

  • builds and deploys Docker images in multiple steps

Jobs:

  • Docker: Build & Push
    • Build & Push <application-repo>:amd64
    • Build & Push <application-repo>:arm64
    • Build & Push <application-repo>:latest

ACR: Docker Push Test

Workflow file: az_acr_test.yml

Triggers:

  • pull_request_opened_synchronized

Outputs:

  • GH_REF_NAME

Description:

  • before the push, the branch name is resolved by replacing / with -, and the image is built and tagged with the resolved branch name
  • builds and deploys the test Docker Image used by the K8: Test workflow

Jobs:

  • Docker: Build & Push (Test)
    • Configure branch name
    • Build & Push <application-repo>:test-<feature-hotfix>

Azure: Function App Deployment

Workflow file: az_fnapp_deploy.yml

Triggers:

  • workflow_dispatch
  • push

Description:

  • builds and deploys the Azure Function App
  • currently used to deploy the self-hosted GitHub Actions Runner Monitor

Jobs:

  • Azure: Build & Deploy Function App
    • Resolve Project Dependencies Using Pip
    • Run Azure Functions Action

GitHub: Delete Branch

Workflow file: gh_delete_branch.yml

Triggers:

  • pull_request_closed

Description:

  • calls ACR: Delete Docker Image job, targeting the tag :test-<feature/hotfix>
  • deletes the feature/hotfix branch Container Images (:<platform>-<feature/hotfix>)
  • deletes the feature/hotfix branch

Troubleshooting:

  • this job will fail when the feature/hotfix branch is deleted manually

Jobs:

  • call-az-acr-delete
  • ACR: Delete Branch Images
    • Delete Branch Container Image
  • GitHub: Delete Branch
    • Delete Branch

GitHub: Release

Workflow file: gh_release.yml

Triggers:

  • release

Inputs:

  • deployment_status

Description:

  • publishes a release on GitHub with the tag generated by the pre-release that triggered this workflow
  • deletes a pre-release on GitHub with the tag generated by the pre-release that triggered this workflow
  • runs in case of a release deployment failure

Troubleshooting:

  • after fixing the error that caused the release deployment failure, recreate the pre-release to trigger the release deployment again

Jobs:

  • GitHub: Publish Release
    • Publish Release
  • GitHub: Delete Prerelease
    • Delete Prerelease

GitHub: Validate Release

Workflow file: gh_validate_release.yml

Triggers:

  • release

Description:

  • validates the release tag generated by the pre-release that triggered this workflow, using a RegEx check for semantic versioning

Troubleshooting:

  • inspect the pre-release tag name and ensure it follows the RegEx check for semantic versioning
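To check a tag locally before creating a pre-release, something like the following may help. The exact expression used by the workflow is not shown here, so the pattern below is an assumption: a simplified semantic-versioning check that accepts an optional leading `v` and `MAJOR.MINOR.PATCH` without leading zeros, and rejects pre-release or build suffixes.

```python
import re

# Simplified pattern (assumption): optional "v", then MAJOR.MINOR.PATCH.
# The workflow's real expression may be stricter or more permissive.
SEMVER_TAG = re.compile(r"v?(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)")


def is_valid_release_tag(tag: str) -> bool:
    """Return True when the whole tag matches the simplified pattern."""
    return SEMVER_TAG.fullmatch(tag) is not None
```

`fullmatch` is used so that a trailing newline or extra suffix makes the tag invalid instead of being silently ignored.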

Jobs:

  • GitHub: Validate Release
    • Validate Release Tag

GitHub: Workflow Wait

Workflow file: gh_wait_workflow.yml

Triggers:

  • workflow_call

Description:

  • waits for a workflow to complete
  • workflow is specified by the calling repository GitHub Actions Environment Variable
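Waiting for another workflow usually comes down to a polling loop. The sketch below shows the idea under stated assumptions: `get_status` is a hypothetical callable that returns the run status (GitHub reports `queued`, `in_progress`, or `completed`), and the interval and timeout defaults are illustrative, not values taken from this repository.

```python
import time
from typing import Callable


def wait_for_workflow(
    get_status: Callable[[], str],
    poll_seconds: float = 10.0,
    timeout_seconds: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the status is 'completed'; return False on timeout."""
    waited = 0.0
    while waited <= timeout_seconds:
        if get_status() == "completed":
            return True
        sleep(poll_seconds)  # injected so the loop can be tested without delays
        waited += poll_seconds
    return False
```

Passing `sleep` and `get_status` as parameters keeps the loop independent of any particular API client.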

Jobs:

  • GitHub: Workflow Wait
    • Workflow Wait - refinery-gateway

K8: Apply

Workflow file: k8s_apply.yml

Triggers:

  • pull_request_closed (dev)
  • workflow_dispatch

Description:

  • generates a Kubernetes kustomization diff and applies it to the cluster
  • differs from the k8s-deploy job in that it applies the entire namespace, as opposed to application-specific configurations

Jobs:

  • K8: Apply
    • Generate Kustomization
    • Apply Kustomization

K8: Database Downgrade

Workflow file: k8s_db_downgrade.yml

Triggers:

  • workflow_call

Inputs:

  • alembic_downgrade_rev
  • docker_image_tag

Description:

  • runs alembic downgrade on the application that triggered this workflow
  • uses a revision number input generated by the K8: Database Upgrade workflow to downgrade the database

Jobs:

  • K8: Database Downgrade
    • Apply Alembic Rollback

K8: Database Upgrade

Workflow file: k8s_db_upgrade.yml

Triggers:

  • workflow_call

Inputs:

  • alembic_upgrade_rev
  • docker_image_tag

Outputs:

  • alembic_current_rev
  • alembic_upgraded_rev

Description:

  • runs alembic upgrade on the application that triggered this workflow
  • if an application that depends on refinery-gateway database changes (e.g. refinery-tokenizer) triggers this workflow, the alembic upgrade will be run on the refinery-gateway database

Jobs:

  • K8: Database Upgrade
    • Apply Alembic Migrate

K8: Cluster Deploy

Workflow file: k8s_deploy.yml

Triggers:

  • workflow_call

Inputs:

  • environment

Outputs:

  • deployment_status

Description:

  • deploys the application to the Kubernetes cluster
  • differs from the k8s-apply job in that it applies application-specific configurations, as opposed to the entire namespace
  • uses Canary Deployment strategy

Jobs:

  • K8: Deploy
    • Generate Kustomization
    • Generate Deployment
    • Assert Deployment Success
    • Promote Deployment
    • Reject Deployment
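The canary flow implied by the steps above (deploy, assert, then promote or reject) can be outlined as follows. This is only a control-flow sketch: the four callables are hypothetical stand-ins for the real steps, and the `"success"` / `"failure"` strings are assumed values for the `deployment_status` output.

```python
from typing import Callable


def run_canary(
    deploy_canary: Callable[[], None],
    canary_is_healthy: Callable[[], bool],
    promote: Callable[[], None],
    reject: Callable[[], None],
) -> str:
    """Deploy a canary, then promote or reject it; return the deployment status."""
    deploy_canary()
    if canary_is_healthy():
        promote()  # roll the new version out to all replicas
        return "success"
    reject()  # remove the canary and keep the previous version running
    return "failure"
```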

K8: Destroy

Workflow file: k8s_destroy.yml

Triggers:

  • workflow_dispatch

Description:

  • deletes all deployment and service Kubernetes resources in the namespace configured by GitHub Actions Environment Variables

Jobs:

  • K8: Destroy Cluster Namespace
    • Destroy Cluster Namespace

K8: Edit

Workflow file: k8s_edit.yml

Triggers:

  • pull_request_closed

Description:

  • updates the Kubernetes deployment image tags to the latest release
  • creates a new branch automated-release-dev and a corresponding Pull Request in k8-cluster-cognition repository
  • when a Pull Request already exists, deployment image tag updates are accumulated on the existing Pull Request
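How tag updates accumulate can be pictured with a small data-structure sketch. It assumes the tags live in a kustomize-style `images` list with `name` and `newTag` fields, which may not match how k8-cluster-cognition actually stores them; `set_image_tag` and the image names in the usage note are hypothetical.

```python
def set_image_tag(images: list[dict], name: str, new_tag: str) -> list[dict]:
    """Return a copy of a kustomize-style `images` list with one tag updated."""
    updated = [dict(entry) for entry in images]  # do not mutate the input
    for entry in updated:
        if entry.get("name") == name:
            entry["newTag"] = new_tag
            break
    else:
        # The image was not listed yet, so add a new entry for it.
        updated.append({"name": name, "newTag": new_tag})
    return updated
```

Calling the function once per released application on the same list, for example first for `refinery-gateway` and then for `refinery-tokenizer`, leaves both updates in place, which mirrors how several releases collect on one open Pull Request.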

Jobs:

  • K8: Edit
    • Perform Edit/Git Operations

K8: Execution Environments

Workflow file: k8s_exec_env_pull.yml

Triggers:

  • workflow_dispatch

Description:

  • pulls execution environment images inside the Kubernetes cluster

Jobs:

  • K8: Docker Pulls
    • Execute docker pull

K8: Inference Reload Config

Workflow file: k8s_inference_reload_config.yml

Triggers:

  • workflow_dispatch

Description:

  • invokes /reload_last_config endpoint on all available (gates) inference services

Jobs:

  • K8: Inference - Reload Last Config
    • Execute Inference Reload Config

K8: Release

Workflow file: k8s_release.yml

Triggers:

  • pull_request_closed
  • release

Description:

  • calls GitHub: Validate Release job
  • calls ACR: Docker Push Release job
  • calls K8: Edit job
  • calls GitHub: Release job
  • forwards deployment status to the GitHub: Release job
  • calls GitHub: Delete Branch job

Jobs:

  • call-gh-validate-release
  • call-az-acr-release
  • call-k8-edit
  • call-gh-release
  • call-gh-delete-branch

K8: Restart

Workflow file: k8s_restart.yml

Triggers:

  • workflow_dispatch

Inputs:

  • deployment_name

Description:

  • restarts a deployment in the Kubernetes cluster, specified by the workflow input

Jobs:

  • K8: Restart
    • Run Restart

K8: Test

Workflow file: k8s_test.yml

Triggers:

  • pull_request_opened_synchronized

Inputs:

  • test_cmd

Description:

  • calls ACR: Docker Push Test job
  • runs alembic upgrade on the application that triggered this workflow
  • if an application that depends on refinery-gateway database changes (e.g. refinery-tokenizer) triggers this workflow, the alembic upgrade is run on the refinery-gateway database if the same test Docker Image tag exists
  • uses the test Docker Image generated by the ACR: Docker Push Test job to run tests in the Kubernetes cluster
  • uses the revision number generated in the first step to downgrade the database
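The upgrade, test, downgrade sequence described above is essentially a try/finally pattern. The sketch below illustrates it with hypothetical callables standing in for the alembic and test commands; `"head"` is assumed as the upgrade target, and the actual workflow may pass a specific revision instead.

```python
from typing import Callable


def run_tests_with_migration(
    current_revision: Callable[[], str],
    upgrade: Callable[[str], None],
    run_tests: Callable[[], bool],
    downgrade: Callable[[str], None],
) -> bool:
    """Upgrade the schema, run tests, and always restore the recorded revision."""
    previous = current_revision()  # revision to return to afterwards
    try:
        upgrade("head")
        return run_tests()
    finally:
        # Runs whether the tests pass, fail, or raise.
        downgrade(previous)
```

Recording the revision before the upgrade is what allows the database to be restored even when the test run fails.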

Troubleshooting:

  • in case of a failed test, inspect the logs of this job to identify the issue and resolve it by updating the application code
  • in case this workflow corrupted app.dev.kern.ai, manually run K8: Apply in k8-cluster-cognition to apply the latest container images available on dev
  • in case of a workflow failure (TBD), ignore the failure and proceed with Pull Request merge

Jobs:

  • call-az-acr-push-test
  • K8: Test
    • Run Test

OpenTofu: Release

Workflow file: release_please.yml

Triggers:

  • workflow_call

Description:

  • generates a release Pull Request with CHANGELOG updates for the calling repository
  • requires Conventional Commits
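Because the release Pull Request is derived from commit messages, it can be useful to check a commit header locally. The pattern below is a simplified reading of the Conventional Commits header (`type(scope)!: description`) and is an assumption for illustration; it only accepts lowercase types and is not the parser release-please itself uses.

```python
import re
from typing import Optional

# Simplified header pattern (assumption): lowercase type, optional scope,
# optional "!" for a breaking change, then ": " and a description.
HEADER = re.compile(
    r"(?P<type>[a-z]+)(\((?P<scope>[^()\s]+)\))?(?P<breaking>!)?: (?P<description>\S.*)"
)


def parse_commit_header(header: str) -> Optional[dict]:
    """Return the header parts, or None when the header does not conform."""
    match = HEADER.fullmatch(header)
    return match.groupdict() if match else None
```

A header such as `feat(api)!: drop v1 routes` parses into its type, scope, breaking marker, and description, while `updated stuff` returns `None`.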

Jobs:

  • tf-module-release
    • googleapis/release-please-action@v4

OpenTofu: Generate Docs

Workflow file: tf_docs.yml

Triggers:

  • push

Description:

  • generates documentation for the OpenTofu module

Jobs:

  • tf-module-docs
    • actions/checkout@v4
    • Render OpenTofu docs and push changes back to PR

OpenTofu: Plan/Apply

Workflow file: tf_plan_apply.yml

Triggers:

  • workflow_dispatch
  • push

Outputs:

  • tf_plan_exit_code
  • tf_destroy

Description:

  • executes tofu plan on the repository that triggered this workflow
  • creates a destruction plan when the calling repository's GitHub Actions Environment Variable TF_DESTROY is set to -destroy
  • executes tofu apply on the repository that triggered this workflow, assuming that the tofu plan job has succeeded
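The relationship between TF_DESTROY, the plan exit code, and the apply step can be sketched as follows. This assumes the plan is run with `-detailed-exitcode`, where `tofu plan` exits with 0 for no changes, 1 for an error, and 2 when changes are present; whether the workflow uses that flag, and the `tfplan` file name, are assumptions.

```python
def plan_command(tf_destroy: str = "") -> list[str]:
    """Build the `tofu plan` argument list; TF_DESTROY is appended when set."""
    command = ["tofu", "plan", "-detailed-exitcode", "-out=tfplan"]
    if tf_destroy:  # e.g. "-destroy" to create a destruction plan
        command.append(tf_destroy)
    return command


def should_apply(tf_plan_exit_code: int) -> bool:
    """Interpret the exit code: 0 = no changes, 1 = error, 2 = changes present."""
    if tf_plan_exit_code == 1:
        raise RuntimeError("tofu plan failed; inspect the plan job logs")
    return tf_plan_exit_code == 2
```

With this reading, the apply job only has work to do when the plan reported changes.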

Troubleshooting:

  • inspect the logs of the tofu plan job to identify the issue and resolve it by updating Infrastructure as Code (IaC) files

Jobs:

  • OpenTofu Plan
    • OpenTofu Plan
  • OpenTofu Apply
    • OpenTofu Apply
