GitHub Actions Workflows Guide

This guide explains how to use the GitHub Actions CI/CD workflows for Enclave Lab.

Overview

Enclave Lab uses GitHub Actions for automated testing and validation with four main workflows:

PR Validation - Fast code quality checks on every PR
Infrastructure Verification - Test infrastructure setup
E2E Connected Mode - Full cluster deployment testing
Cleanup - Infrastructure maintenance

Workflow 1: PR Validation

Purpose: Fast code quality checks that run on every pull request

Trigger: Automatic on PR open, update, or reopen

Duration: ~5-10 minutes

Runs on: GitHub-hosted runners (no CI machine needed)

What It Checks

✅ Shell script syntax (shellcheck)
✅ YAML formatting (yamllint)
✅ Ansible playbook syntax (ansible-lint)
✅ Makefile syntax

How to Use

This workflow runs automatically on every PR. No manual action needed.

To test locally before pushing:

make validate

Viewing Results

Go to your PR on GitHub
Click "Checks" tab
View "Code Quality Checks" results
If failed, click to see detailed error messages

Fixing Failures

Common fixes:

# Fix shell script issues
shellcheck scripts/*.sh

# Fix YAML formatting
yamllint .

# Fix Ansible issues
ansible-lint playbooks/

# Fix all at once
make validate

Workflow 2: Infrastructure Verification

Purpose: Test infrastructure creation without full cluster deployment

Trigger:

Manual dispatch (Actions tab)
PR with test-infra label

Duration: ~60-75 minutes

Runs on: Self-hosted runner (CI machine)

What It Does

✅ Pre-flight checks (resources, permissions)
✅ Create infrastructure (make environment)
✅ Provision Landing Zone (make provision-landing-zone)
✅ Install Enclave Lab in connected mode (make install-enclave)
✅ Verify installation
✅ Collect artifacts (logs, environment metadata)
✅ Optional cleanup

How to Use

Option 1: Manual Dispatch

Go to Actions tab on GitHub
Select "Infrastructure Verification"
Click "Run workflow"
Choose cleanup strategy:
- on_failure (recommended): Clean up only if test fails
- always: Clean up after every run
- never: Keep infrastructure for debugging

Option 2: PR Label

Add label test-infra to your PR
Workflow runs automatically
Remove label to prevent re-runs

Cleanup Strategies

on_failure (recommended): Keeps successful infrastructure for reuse, cleans failures
always: Always cleans up, ensures fresh state
never: Manual cleanup required, good for debugging

Viewing Results

Go to Actions tab
Click on the workflow run
View step-by-step progress
Download artifacts (logs, environment.json)

Artifacts Collected

environment.json - Infrastructure metadata
vm-status.txt - Virtual machine status
network-status.txt - Network configuration
deployment.log - Enclave Lab installation log

Workflow 3: E2E Connected Mode

Purpose: Full end-to-end cluster deployment testing

Trigger:

Manual dispatch (Actions tab)
PR with test-e2e label
Weekly schedule (Sunday 2 AM UTC)

Duration: ~90-120 minutes

Runs on: Self-hosted runner (CI machine)

What It Does

✅ Pre-flight checks
✅ Create/reuse infrastructure
✅ Provision Landing Zone
✅ Install Enclave Lab (connected mode)
✅ Deploy OpenShift cluster (make deploy-cluster)
✅ Verify cluster health
✅ Collect artifacts (kubeconfig, logs)
✅ Optional cleanup

How to Use

Option 1: Manual Dispatch

Go to Actions tab
Select "E2E Connected Mode"
Click "Run workflow"
Configure options:
- cleanup_strategy: on_failure / always / never
- reuse_infrastructure: true / false

Option 2: PR Label

Add label test-e2e to your PR
Workflow runs automatically

Option 3: Scheduled

Runs automatically every Sunday at 2 AM UTC for regression testing.

Reuse Infrastructure

Enabled (recommended):

Reuses existing VMs if available
Faster runs
Good for iterative testing

Disabled:

Creates fresh infrastructure every time
Slower but ensures clean state
Good for validating from scratch

Cluster Verification

The workflow checks:

✅ Nodes are ready
✅ Cluster operators are available
✅ No degraded operators
✅ Kubeconfig is accessible

Artifacts Collected

environment.json - Infrastructure metadata
vm-status.txt - Virtual machine status
network-status.txt - Network configuration
deployment.log - Deployment logs
kubeconfig - Cluster access configuration
config/global.yaml - Enclave Lab configuration

Accessing the Cluster

After successful E2E run:

Download kubeconfig artifact

Use it to access the cluster:

export KUBECONFIG=./kubeconfig
oc get nodes
oc get co

Workflow 4: Cleanup

Purpose: Infrastructure cleanup and maintenance

Trigger:

Manual dispatch (Actions tab)
Weekly schedule (Sunday 4 AM UTC)

Duration: ~5-15 minutes

Runs on: Self-hosted runner (CI machine)

Cleanup Levels

Standard (default):

Runs make clean
Removes Enclave test infrastructure
Quick and safe

Deep:

Standard cleanup
Force destroy remaining VMs
Remove networks
Clean working directory

Full:

Deep cleanup
Stop sushy-tools containers
Remove libvirt pools
Clean dangling interfaces
Nuclear option for stuck infrastructure

How to Use

Manual Cleanup:

Go to Actions tab
Select "Cleanup Infrastructure"
Click "Run workflow"
Choose cleanup level

Scheduled Cleanup:

Runs automatically every Sunday at 4 AM UTC with standard cleanup.

When to Use Each Level

Standard: Regular cleanup after tests
Deep: Infrastructure is stuck or behaving oddly
Full: Complete reset needed, things are really broken

Common Workflows

Testing a PR

Create PR
PR Validation runs automatically
If changes are significant, add test-infra label
If testing cluster deployment, add test-e2e label

Debugging Failed Tests

Check workflow logs in Actions tab
Download artifacts
Look for error messages in logs
If infrastructure is stuck, run cleanup workflow

Weekly Regression Testing

E2E workflow runs automatically every Sunday at 2 AM UTC. Reviews results Monday morning.

Infrastructure Maintenance

Cleanup workflow runs automatically every Sunday at 4 AM UTC. Prevents resource accumulation.

Best Practices

For Developers

✅ DO:

Run make validate locally before pushing
Use test-infra label for infrastructure changes
Use test-e2e label sparingly (long running)
Check workflow results before asking for review

❌ DON'T:

Push without running validation
Add test-e2e to every PR (expensive)
Leave failed workflows without investigation
Ignore cleanup failures

For Reviewers

✅ DO:

Check that PR Validation passed
Request test-infra for infrastructure changes
Request test-e2e for significant changes
Review workflow artifacts if tests fail

Troubleshooting

See CI_TROUBLESHOOTING.md for detailed troubleshooting guide.

Quick Fixes

Workflow stuck in queue:

Check if another workflow is running
Workflows queue on shared CI machine

Workflow failed at pre-flight:

Check GitHub secrets are configured
Verify runner is online (Settings → Actions → Runners)

Infrastructure creation failed:

Check CI machine has enough resources
Run cleanup workflow (deep level)

Can't access cluster:

Download kubeconfig artifact
Check deployment.log for errors
SSH to Landing Zone for debugging

Monitoring

Check Runner Status

Settings → Actions → Runners

Runner should show "Idle" when not running workflows
If "Offline", check runner service on CI machine

Check Workflow History

Actions tab → Select workflow → View runs

See success/failure trends
Download artifacts from past runs
Review timing data

Resource Usage

Monitor CI machine:

# Disk space
df -h /opt/dev-scripts

# RAM usage
free -g

# Running VMs
sudo virsh list --all

# Active networks
sudo virsh net-list --all

Next Steps

CI Runner Setup - Set up the self-hosted runner
CI Troubleshooting - Fix common issues
Deployment Guide - Full deployment documentation

FilesExpand file tree

CI_WORKFLOWS.md

Latest commit

History

CI_WORKFLOWS.md

File metadata and controls

GitHub Actions Workflows Guide

Overview

Workflow 1: PR Validation

What It Checks

How to Use

Viewing Results

Fixing Failures

Workflow 2: Infrastructure Verification

What It Does

How to Use

Cleanup Strategies

Viewing Results

Artifacts Collected

Workflow 3: E2E Connected Mode

What It Does

How to Use

Reuse Infrastructure

Cluster Verification

Artifacts Collected

Accessing the Cluster

Workflow 4: Cleanup

Cleanup Levels

How to Use

When to Use Each Level

Common Workflows

Testing a PR

Debugging Failed Tests

Weekly Regression Testing

Infrastructure Maintenance

Best Practices

For Developers

For Reviewers

Troubleshooting

Quick Fixes

Monitoring

Check Runner Status

Check Workflow History

Resource Usage

Next Steps