Skip to content

Commit 00c5981

Browse files
authored
feat: add github-actions-debugger skill (#750)
# Pull Request Description This PR adds the `github-actions-debugger` skill. This skill is designed to act as an expert CI/CD diagnostician, specifically focused on reading raw logs from failed GitHub Actions, identifying root causes (like dependency mismatches, missing secrets, caching issues, or deprecated actions), and proposing precise YAML or code changes to fix the pipeline.
1 parent eca74e2 commit 00c5981

1 file changed

Lines changed: 99 additions & 0 deletions

File tree

Lines changed: 99 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,99 @@
1+
---
2+
name: github-actions-debugger
3+
description: "Specialized skill for diagnosing, analyzing, and fixing failing GitHub Actions workflows by parsing run logs and pipeline definitions."
4+
category: devops
5+
risk: safe
6+
source: community
7+
source_type: community
8+
date_added: "2026-06-25"
9+
author: Owais
10+
tags: [github-actions, ci-cd, devops, debugging, workflows]
11+
tools: [claude, cursor, gemini, antigravity]
12+
---
13+
14+
# GitHub Actions Pipeline Debugger
15+
16+
## Overview
17+
18+
This skill is designed to act as an expert CI/CD diagnostician. It focuses specifically on reading raw logs from failed GitHub Actions, identifying the root cause of the crash or failure, and outputting the precise YAML or code changes required to fix the pipeline.
19+
20+
## When to Use
21+
22+
- Use when a GitHub Actions workflow fails unexpectedly and the error log is long, obscure, or misleading.
23+
- Use when debugging dependency mismatch errors, missing secrets, caching issues, or runner environment problems in CI.
24+
- Use to optimize slow pipelines by identifying bottlenecks in workflow steps.
25+
- Use to update and modernize deprecated actions or workflow syntax.
26+
27+
## How It Works
28+
29+
1. **Log Ingestion & Redaction:** Analyze the provided GitHub Actions workflow log (often exported as a raw text file or pasted directly). **CRITICAL SAFETY REQUIREMENT:** The user/agent must redact all sensitive credentials, secrets, tokens, private keys, and internal system paths from the logs before pasting or uploading them.
30+
2. **Context Mapping:** Cross-reference the failure point with the specific step and job in the `.github/workflows/*.yml` definition.
31+
3. **Root Cause Analysis:** Identify if the failure is due to:
32+
- Missing or misconfigured secrets (`${{ secrets.API_KEY }}`).
33+
- Node/Python/OS environment version mismatches.
34+
- Flaky tests or timeout limits.
35+
- Syntax errors in bash scripts run within the `run:` block.
36+
- Invalid action versions or deprecated actions.
37+
4. **Resolution Proposal:** Provide a direct `diff` of the `.yml` file or the underlying script that needs to be modified.
38+
39+
## Best Practices
40+
41+
- **Provide Full Context:** Always review both the workflow definition (`.yml` file) and the failure log simultaneously to ensure accurate diagnosis.
42+
- **Check Action Versions:** Many failures are caused by deprecated runtime versions (e.g., Node.js 16) in older third-party actions (e.g., `actions/checkout@v2`). Always recommend upgrading to the latest major versions (e.g., `v4`).
43+
- **Permissions Audit:** Ensure the workflow has the correct `permissions:` block if it's attempting to write to the repository, packages, or deploy environments.
44+
- **Reproducibility:** If a test fails in CI but passes locally, investigate environment differences such as timezone, headless browser state, memory limits, or parallel execution race conditions.
45+
46+
## Examples
47+
48+
### Example 1: Fixing a Deprecated Node.js Action Version Error
49+
**Failing Log:**
50+
```text
51+
Warning: The Go/Node.js/Python version used by this action is deprecated.
52+
Error: Node.js 16 actions are deprecated. Please update to use Node.js 20.
53+
```
54+
55+
**Proposed Fix Diff:**
56+
```diff
57+
- name: Checkout Code
58+
- uses: actions/checkout@v2
59+
+ uses: actions/checkout@v4
60+
```
61+
62+
### Example 2: Diagnosing and Fixing a Missing Repository Secret
63+
**Failing Log:**
64+
```text
65+
Run npm run deploy
66+
npm run deploy
67+
shell: /usr/bin/bash -e {0}
68+
Error: API Key is required for deployment. Process exited with code 1.
69+
```
70+
71+
**Proposed Fix Diff:**
72+
```diff
73+
- name: Deploy App
74+
run: npm run deploy
75+
+ env:
76+
+ DEPLOY_API_KEY: ${{ secrets.DEPLOY_API_KEY }}
77+
```
78+
79+
## Security & Safety Notes
80+
81+
- **Credential Exposure & Raw Log Redaction**: Under no circumstances should raw logs containing unmasked secrets, private URLs, deployment targets, or tokens be processed without prior redaction. Always ensure the user or agent redacts all sensitive info before ingestion.
82+
- **Dry-Run Mode**: When recommending modifications to bash script steps inside workflows, ensure you suggest adding flags like `--dry-run` or staging execution where possible to prevent unintended side effects in downstream environments during debugging.
83+
84+
## Limitations
85+
86+
- The skill cannot securely read repository secrets. It can only infer missing or malformed secrets if the log complains about undefined environment variables or authentication failures.
87+
- It cannot execute the GitHub action itself to test the fix; validation requires pushing the proposed fix to the repository and triggering a workflow run.
88+
- Network-related transient failures (e.g., a package registry being down temporarily) might be incorrectly diagnosed as structural workflow issues if not carefully analyzed.
89+
90+
## Common Pitfalls
91+
92+
- **Ignoring Transient Failures**: Mistaking temporary network dropouts or registry downtime (e.g., npm or pip install errors) for actual code or configuration bugs. Always check if a rerun succeeds before attempting heavy changes.
93+
- **Hardcoding Tokens**: Fixing authentication errors by hardcoding secrets or API tokens directly into the YAML files instead of utilizing GitHub Secrets (`${{ secrets.SECRET_NAME }}`).
94+
- **Overlooking Caching Side Effects**: Forgetting that outdated cache keys can keep corrupt dependencies loaded. If dependency installation is failing, try running a job with actions caching bypassed.
95+
96+
## Related Skills
97+
98+
- `@devops-troubleshooter` - General DevOps and infrastructure issue resolution.
99+
- `@cicd-automation-workflow-automate` - For creating new CI/CD pipelines from scratch.

0 commit comments

Comments
 (0)