24 changes: 21 additions & 3 deletions .github/actions/create-issue/entrypoint.py
@@ -95,10 +95,16 @@ def create_github_issue(


def update_github_issue(
issue_number: int, vulnerability: dict, base_url: str, headers: dict, timestamp: str
issue_number: int,
vulnerability: dict,
base_url: str,
headers: dict,
timestamp: str,
assignees: str = "",
):
"""
Updates an existing GitHub issue by posting a comment with new scan results.
Updates an existing GitHub issue by posting a comment with new scan results
and replacing the assignee.
"""
artifact_id = vulnerability["ArtifactID"]
vulnerability_text = vulnerability.get("vulnerabilities", [])
@@ -122,6 +128,18 @@ def update_github_issue(
)
}

# Update assignee on the issue (replaces existing assignees)
issue_url = f"{base_url}/{issue_number}"
patch_body = {"assignees": [assignees] if assignees else []}
r_patch = requests.patch(issue_url, headers=headers, json=patch_body)
if r_patch.ok:
print(f"Updated assignee on issue #{issue_number} for {artifact_id}")
else:
print(
f"Failed to update assignee on issue #{issue_number}. "
f"Status: {r_patch.status_code}. Response: {r_patch.text}"
)

comment_url = f"{base_url}/{issue_number}/comments"
r = requests.post(comment_url, headers=headers, json=comment_body)

@@ -201,7 +219,7 @@ def main():
create_github_issue(vulnerability, url, headers, timestamp, labels, assignees)
elif result["status"] == "exists":
issue_number = result["issue_number"]
update_github_issue(issue_number, vulnerability, url, headers, timestamp)
update_github_issue(issue_number, vulnerability, url, headers, timestamp, assignees)


if __name__ == "__main__":
6 changes: 4 additions & 2 deletions .github/workflows/generate-image-list.yml
@@ -19,7 +19,8 @@ jobs:
generate-image-list:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- name: Checkout repo
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
submodules: recursive
token: ${{ secrets.GH_PAT || secrets.GITHUB_TOKEN }}
@@ -31,7 +32,8 @@ jobs:
ref: ${{ inputs.vuln-scanning-ref }}
path: .vuln-scanning

- uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6
- name: Set up Python
uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6
with:
python-version: '3.x'

63 changes: 54 additions & 9 deletions .github/workflows/vuln-scanning.yml
@@ -64,21 +64,47 @@
continue-on-error: true
run: |
ENV_FILE="${{ inputs.env-file-path }}"
IMAGES=$(grep "IMAGE" "$ENV_FILE" | cut -d '=' -f2)
# *** FIX IS HERE ***
SCAN_RESULTS_JSON="${{ inputs.scan-report }}" # Use the input variable
SCAN_RESULTS_JSON="${{ inputs.scan-report }}"

[Code scanning / CodeQL warning] Code injection (Medium): potential code injection in `${{ inputs.scan-report }}`, which may be controlled by an external user.
echo "[]" > "$SCAN_RESULTS_JSON"
for IMAGE in $IMAGES; do
# Helper: run trivy on IMAGE, append result to SCAN_RESULTS_JSON,
# optionally override the ArtifactID in the JSON (for built images).
scan_image() {
local IMAGE="$1"
local ARTIFACT_ID="${2:-$IMAGE}"
echo "Scanning $IMAGE..."
local FILENAME
FILENAME=$(echo "$IMAGE" | sed 's/[\/:]/_/g')
SCAN_FILE="${FILENAME}-scan.json"
local SCAN_FILE="${FILENAME}-scan.json"
trivy image --format json --scanners vuln --severity CRITICAL,HIGH "$IMAGE" > "$SCAN_FILE" || true
RESULT=$(jq '{
ArtifactID: .ArtifactName,
vulnerabilities: [(.Results[] | select(.Vulnerabilities != null) | .Vulnerabilities[] |{Title:.Title,VulnerabilityID: .VulnerabilityID, Severity: .Severity})]
local RESULT
RESULT=$(jq --arg id "$ARTIFACT_ID" '{
ArtifactID: $id,
vulnerabilities: [(.Results[] | select(.Vulnerabilities != null) | .Vulnerabilities[] | {Title:.Title, VulnerabilityID:.VulnerabilityID, Severity:.Severity})]
}' "$SCAN_FILE")
jq --argjson new "$RESULT" '. + [$new]' "$SCAN_RESULTS_JSON" > tmp.json && mv tmp.json "$SCAN_RESULTS_JSON"
rm "$SCAN_FILE"
}
# --- Scan pre-built images (*_IMAGE variables) ---
grep "_IMAGE=" "$ENV_FILE" | cut -d '=' -f2 | while read -r IMAGE; do
scan_image "$IMAGE"
done
# --- Build and scan GitHub build-context images (*_BUILD variables) ---
grep "_BUILD=" "$ENV_FILE" | while IFS= read -r LINE; do
VAR_NAME=$(echo "$LINE" | cut -d '=' -f1)
BUILD_URL=$(echo "$LINE" | cut -d '=' -f2-)
# Skip comment lines
[[ "$VAR_NAME" == \#* ]] && continue
# Derive a local image tag from the variable name:
# e.g. PLATFORM_OUTPUT_SUPPORT_DATA_LOADER_DOCKERIZED_BUILD
# -> platform-output-support-data-loader-dockerized:local-scan
LOCAL_TAG=$(echo "$VAR_NAME" | tr '[:upper:]' '[:lower:]' | sed 's/_build$//' | sed 's/_/-/g'):local-scan
echo "Building $BUILD_URL as $LOCAL_TAG ..."
docker build "$BUILD_URL" -t "$LOCAL_TAG"
scan_image "$LOCAL_TAG" "$BUILD_URL"
docker rmi "$LOCAL_TAG" || true
done
- name: Run Snyk to check for vulnerabilities
if: env.SCANNER == 'snyk'
@@ -95,11 +121,30 @@
name: vuln-scan-results
path: ${{ inputs.scan-report }}

- name: Resolve assignee from rotation
id: resolve-assignee
run: |
EXPLICIT_ASSIGNEE='${{ inputs.assignee-user }}'

[Code scanning / CodeQL warning] Code injection (Medium): potential code injection in `${{ inputs.assignee-user }}`, which may be controlled by an external user.
if [ -n "$EXPLICIT_ASSIGNEE" ]; then
ASSIGNEE="$EXPLICIT_ASSIGNEE"
else
ROTATION='${{ vars.ROTATION_USERS }}'

[Code scanning / CodeQL warning] Code injection (Medium): potential code injection in `${{ vars.ROTATION_USERS }}`, which may be controlled by an external user.
if [ -n "$ROTATION" ] && [ "$ROTATION" != "null" ]; then
MONTH=$(date +%-m)
LENGTH=$(echo "$ROTATION" | jq 'length')
INDEX=$(( (MONTH - 1) % LENGTH ))
ASSIGNEE=$(echo "$ROTATION" | jq -r ".[$INDEX]")
else
ASSIGNEE=""
fi
fi
echo "assignee=$ASSIGNEE" >> $GITHUB_OUTPUT
- name: Run Create GitHub Projects Ticket Action
id: report-vulnerabilities
uses: thehyve/vulnerability-scanning/.github/actions/create-issue@main
with:
token: ${{ secrets.token }}
assignee: ${{ inputs.assignee-user }}
assignee: ${{ steps.resolve-assignee.outputs.assignee }}
tag: ${{ inputs.tag }}
file-name: ${{ inputs.scan-report }}
102 changes: 90 additions & 12 deletions README.md
@@ -11,6 +11,7 @@ Reusable GitHub Actions workflows to scan Docker images for vulnerabilities and
- [Reusable workflows](#reusable-workflows)
- [generate-image-list](#1-generate-image-list)
- [vuln-scanning](#2-vuln-scanning)
- [Monthly assignee rotation](#monthly-assignee-rotation)
- [scan_docker_images.py (local use)](#scan_docker_imagespy-local-use)
- [Variable naming convention](#variable-naming-convention)
- [Issue lifecycle](#issue-lifecycle)
@@ -26,18 +27,20 @@ This repository provides two independent reusable workflows:
┌─────────────────────────────────────────────┐
│ generate-image-list │
│ │
│ Compose files ─┐
│ Compose files ─┐ image: <ref>
│ Dockerfiles ├─ scan_docker_images.py ──►│──► docker_images.env
│ *.yml / *.yaml─┘ │ (committed to repo)
│ *.yml / *.yaml─┘ build: <github-url> │ (committed to repo)
└─────────────────────────────────────────────┘
│ (or bring your own docker_images.env)
┌─────────────────────────────────────────────┐
│ vuln-scanning │
│ │
│ docker_images.env ──► Trivy / Snyk scan ──►│──► GitHub Issues
│ │ (HIGH / CRITICAL)
│ (docker_images.env) │
│ *_IMAGE vars ──► trivy image <ref> │
│ ─────│──► GitHub Issues
│ *_BUILD vars ──► docker build <url> │ (HIGH / CRITICAL)
│ trivy image <local-tag> │
└─────────────────────────────────────────────┘
```

@@ -54,12 +57,23 @@ The workflows are designed to be used together or independently, depending on yo
1. **Image discovery** (`generate-image-list` workflow)
`scan_docker_images.py` walks the caller's repository and finds every Docker image reference in:
- `docker-compose.yml`, `*.yml`, `*.yaml` – lines matching `image: <ref>`
- `docker-compose.yml`, `*.yml`, `*.yaml` – `build: <github-url>` directives (string form and `context:` map form)
- `Dockerfile`, `Dockerfile.*` – lines matching `FROM <ref>`

References that use environment-variable substitution (`${VAR}`) are skipped because they are already sourced from the `.env` file. The discovered images are written to `docker_images.env` (or a custom path) and committed back to the caller's repository.
Variable substitutions (`${VAR}`) are handled differently depending on the source:
- **`image: ${VAR}` in Compose files** — skipped, because the actual image name is already stored in the `.env` file (that's the convention this workflow is built around).
- **`FROM ${VAR}` in local and remote Dockerfiles** — the script attempts to resolve the variable using `ARG NAME=default` declarations in the same file. References that remain unresolvable after substitution are skipped.
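
The `ARG` default resolution described above can be sketched roughly as follows. This is a minimal illustration, not the code in `scan_docker_images.py`: the function name `resolve_from_images` and the regular expressions are assumptions made for this example, and it deliberately ignores cases such as `${VAR:-default}` or build-stage aliases.

```python
import re

# Hypothetical patterns for this sketch; the real script may parse differently.
ARG_RE = re.compile(r"^\s*ARG\s+(\w+)=(\S+)", re.IGNORECASE)
FROM_RE = re.compile(r"^\s*FROM\s+(?:--platform=\S+\s+)?(\S+)", re.IGNORECASE)
VAR_RE = re.compile(r"\$\{(\w+)\}|\$(\w+)")


def resolve_from_images(dockerfile_text: str) -> list[str]:
    """Return FROM references with ARG defaults substituted.

    References that still contain a variable after substitution, and
    `scratch`, are skipped.
    """
    defaults: dict[str, str] = {}
    images: list[str] = []
    for line in dockerfile_text.splitlines():
        arg = ARG_RE.match(line)
        if arg:
            # Remember `ARG NAME=default`, dropping optional quotes.
            defaults[arg.group(1)] = arg.group(2).strip("\"'")
            continue
        frm = FROM_RE.match(line)
        if not frm:
            continue
        # Substitute known defaults; leave unknown variables untouched.
        ref = VAR_RE.sub(
            lambda m: defaults.get(m.group(1) or m.group(2), m.group(0)),
            frm.group(1),
        )
        if "$" in ref or ref.lower() == "scratch":
            continue
        images.append(ref)
    return images


sample = "ARG PY=3.13-slim\nFROM python:${PY} AS base\nFROM ${UNSET}\nFROM scratch\n"
resolved = resolve_from_images(sample)
```

Here `python:${PY}` resolves through the `ARG` default, while `${UNSET}` and `scratch` are dropped.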

The discovered images are written to `docker_images.env` (or a custom path) and committed back to the caller's repository.

For `build:` entries pointing at a public GitHub repository, the script also fetches the remote `Dockerfile` and extracts the `FROM` base images from it, so those base images are scanned too. The build URL itself is recorded as a `*_BUILD` variable so the `vuln-scanning` workflow can build and scan the resulting image.

2. **Vulnerability scanning** (`vuln-scanning` workflow)
Reads a `docker_images.env` file containing `*_IMAGE` variables (produced by step 1, or maintained manually), iterates over every image, and scans each one with either **Trivy** or **Snyk**. Results are saved as a JSON artifact.
Reads a `docker_images.env` file and scans every entry:
- **`*_IMAGE` variables** — the image is pulled and scanned directly with Trivy.
- **`*_BUILD` variables** — the GitHub build-context URL is passed to `docker build`, and the resulting local image is scanned with Trivy. The issue title uses the original build URL for traceability.

Results are saved as a JSON artifact.
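
For readers less familiar with `jq`, the per-image summary the workflow builds from a Trivy report can be approximated in Python. This is a sketch under assumptions: the function name `summarize_trivy_report` is hypothetical, the field names follow the `jq` filter used in the workflow, and unlike that filter the sketch tolerates a report without a `Results` key.

```python
def summarize_trivy_report(report: dict, artifact_id: str) -> dict:
    """Reduce a Trivy JSON report to the fields the issue action consumes."""
    vulnerabilities = [
        {
            "Title": vuln.get("Title"),
            "VulnerabilityID": vuln.get("VulnerabilityID"),
            "Severity": vuln.get("Severity"),
        }
        for result in report.get("Results") or []
        # Results without findings carry no "Vulnerabilities" entry.
        for vuln in result.get("Vulnerabilities") or []
    ]
    return {"ArtifactID": artifact_id, "vulnerabilities": vulnerabilities}


# Illustrative input only; the CVE identifier is a placeholder.
example_report = {
    "Results": [
        {"Target": "os-packages"},
        {
            "Target": "python-packages",
            "Vulnerabilities": [
                {
                    "Title": "example flaw",
                    "VulnerabilityID": "CVE-0000-0000",
                    "Severity": "HIGH",
                    "PkgName": "ignored",
                }
            ],
        },
    ]
}
summary = summarize_trivy_report(example_report, "redis:7")
```

Passing the build URL as `artifact_id` mirrors how `*_BUILD` entries keep the original URL in the issue title.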

3. **Issue management** (`create-issue` action)
Compares the scan results against existing open GitHub Issues (filtered by a caller-supplied label). For each vulnerable image:
@@ -159,20 +173,52 @@ jobs:
| Input | Required | Default | Description |
|---|---|---|---|
| `scan-option` | Yes | `trivy` | Scanner to use: `trivy` or `snyk` |
| `env-file-path` | Yes | | Path to the `.env` file containing `*_IMAGE` variables |
| `env-file-path` | Yes | | Path to the `.env` file containing `*_IMAGE` and/or `*_BUILD` variables |
| `tag` | Yes | | Label applied to all created GitHub Issues (used to correlate issues between runs) |
| `assignee-user` | No | *(none)* | GitHub username to assign created issues to |
| `assignee-user` | No | *(none)* | GitHub username to assign to created and updated issues. Takes precedence over `ROTATION_USERS` when set (see [Monthly assignee rotation](#monthly-assignee-rotation)) |
| `scan-report` | No | `scan_result.json` | Path to write the raw scan output (JSON) |

**Secrets:**

| Secret | Required | Description |
|---|---|---|
| `token` | Yes | GitHub token used to create/update issues (typically `GITHUB_TOKEN`) |
| `token` | Yes | GitHub token used to create/update issues (typically the built-in `GITHUB_TOKEN`) |
| `snyk-token` | No | Snyk API token (required when `scan-option` is `snyk`) |

---

### 🔄 Monthly assignee rotation

When the workflow runs on a schedule (e.g. monthly), you can automatically rotate issue assignees without changing any workflow code. Define a repository variable named `ROTATION_USERS` containing a JSON array of GitHub usernames:

**Setup:**

1. Go to your repository → **Settings → Secrets and variables → Actions → Variables tab**
2. Click **New repository variable**
3. Name: `ROTATION_USERS`
4. Value: a JSON array of GitHub usernames, in the order you want them assigned:

```json
["alice", "bob", "carol", "dave"]
```

**How it works:**

At runtime the workflow evaluates `(current_month - 1) % array_length` to select the assignee:

| Month | Index (4-person example) | Assignee |
|---|---|---|
| January | 0 | `alice` |
| February | 1 | `bob` |
| March | 2 | `carol` |
| April | 3 | `dave` |
| May | 0 | `alice` (wraps around) |
| … | … | … |

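If `assignee-user` is specified on the workflow call, it takes precedence over `ROTATION_USERS`. If neither is set, new issues are created without an assignee, and updates to existing issues send an empty assignee list, which replaces any assignees already on the issue.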
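
The selection rule can be expressed in a few lines of Python. This is a sketch only: the workflow itself does this in shell with `date` and `jq`, the function name `resolve_assignee` is hypothetical, and treating an empty array as "no rotation" is an assumption of this example rather than documented behavior.

```python
import json
from datetime import date


def resolve_assignee(explicit: str, rotation_json: str, today: date) -> str:
    """Pick the assignee: explicit user first, then the monthly rotation."""
    if explicit:
        return explicit
    if not rotation_json or rotation_json == "null":
        return ""
    users = json.loads(rotation_json)
    if not users:
        # Assumption: an empty list means nobody is assigned.
        return ""
    # January maps to index 0; the modulo wraps around after the last user.
    return users[(today.month - 1) % len(users)]


rotation = '["alice", "bob", "carol", "dave"]'
may_assignee = resolve_assignee("", rotation, date(2025, 5, 1))
```

With four users, May evaluates to `(5 - 1) % 4 = 0`, so the rotation wraps back to the first entry.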

---

## 🐍 `scan_docker_images.py` (local use)

The script can also be run locally from the root of any repository:
@@ -187,14 +233,17 @@ OUTPUT_FILE=infra/images.env python3 scan_docker_images.py

**What is scanned:**
- `*.yml` / `*.yaml` – `image: <ref>` directives (YAML anchors are supported)
- `Dockerfile` / `Dockerfile.*` – `FROM <ref>` directives (`scratch` and variable references are skipped)
- `*.yml` / `*.yaml` – `build: <github-url>` directives (string form and `context:` map form); the remote `Dockerfile` is fetched and its `FROM` base images are extracted (with `ARG` default resolution). The build URL is also recorded as a `*_BUILD` entry.
- `Dockerfile` / `Dockerfile.*` – `FROM <ref>` directives (`scratch` and unresolvable variable references are skipped; `ARG NAME=default` substitutions are resolved)

**Skipped directories:** `.git`, `.mypy_cache`, `node_modules`, `__pycache__`, `.venv`

---

## 🏷️ Variable naming convention

### `*_IMAGE` variables — pre-built images

Given an image reference the script derives an environment-variable name as follows:

| Step | Example |
@@ -218,14 +267,43 @@ CLICKHOUSE_SERVER_25_8_2_29_IMAGE=clickhouse/clickhouse-server:25.8.2.29

Version-prefix tags (e.g. `mysql:8` alongside `mysql:8.0`) are automatically merged — the shorter prefix is absorbed into the more specific tag.

### `*_BUILD` variables — GitHub build-context images

When a `build: <github-url>` directive is found, the build URL is recorded as a separate variable using the GitHub repo name and branch:

| Step | Example |
|---|---|
| Original build URL | `https://github.com/thehyve/platform-output-support.git#rm/data-loader-dockerized` |
| Repo name | `platform-output-support` |
| Last branch segment | `data-loader-dockerized` |
| Join, replace non-word chars with `_`, upper-case | `PLATFORM_OUTPUT_SUPPORT_DATA_LOADER_DOCKERIZED` |
| Append `_BUILD` | `PLATFORM_OUTPUT_SUPPORT_DATA_LOADER_DOCKERIZED_BUILD` |
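
The steps in the table can be reproduced with a short Python sketch. The function name `build_variable_name` is hypothetical, joining the two parts with `_` is an assumption that happens to match the table's result, and build-context fragments of the form `#branch:subdir` are not handled here.

```python
import re
from urllib.parse import urlsplit


def build_variable_name(build_url: str) -> str:
    """Derive a *_BUILD variable name from a GitHub build-context URL."""
    parts = urlsplit(build_url)
    # Repo name: last path segment without the ".git" suffix.
    repo = parts.path.rstrip("/").rsplit("/", 1)[-1].removesuffix(".git")
    # Last branch segment: text after the final "/" in the URL fragment.
    branch_tail = parts.fragment.rsplit("/", 1)[-1] if parts.fragment else ""
    joined = "_".join(part for part in (repo, branch_tail) if part)
    # Replace non-word characters with "_", upper-case, append the suffix.
    return re.sub(r"\W", "_", joined).upper() + "_BUILD"


name = build_variable_name(
    "https://github.com/thehyve/platform-output-support.git#rm/data-loader-dockerized"
)
```

For the URL in the table this yields `PLATFORM_OUTPUT_SUPPORT_DATA_LOADER_DOCKERIZED_BUILD`.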

Example env file with both types:

```env
# Found in: docker-compose.yml
PLATFORM_OUTPUT_SUPPORT_DATA_LOADER_DOCKERIZED_BUILD=https://github.com/thehyve/platform-output-support.git#rm/data-loader-dockerized

# Found in: docker-compose.yml → https://raw.githubusercontent.com/thehyve/platform-output-support/rm/data-loader-dockerized/Dockerfile
PYTHON_IMAGE=python:3.13-slim

# Found in: docker-compose.yml
REDIS_IMAGE=redis:7
```

The `vuln-scanning` workflow handles both types automatically:
- `*_IMAGE` → pulled and scanned with `trivy image`
- `*_BUILD` → built with `docker build <url>` then scanned with `trivy image`
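
As an illustration of that split, the sketch below sorts the lines of an env file into the two groups and derives the local image tag used for built images. It is written in Python for readability; the workflow does the equivalent with `grep`, `cut`, `tr`, and `sed`. The function names `local_tag` and `classify_env` are hypothetical.

```python
def local_tag(var_name: str) -> str:
    """Turn FOO_BAR_BUILD into the local tag foo-bar:local-scan."""
    name = var_name.lower()
    if name.endswith("_build"):
        name = name[: -len("_build")]
    return name.replace("_", "-") + ":local-scan"


def classify_env(text: str) -> tuple[list[str], list[tuple[str, str]]]:
    """Return (image refs, [(local tag, build URL), ...]) from env-file text."""
    images: list[str] = []
    builds: list[tuple[str, str]] = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and anything that is not NAME=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if name.endswith("_IMAGE"):
            images.append(value)
        elif name.endswith("_BUILD"):
            builds.append((local_tag(name), value))
    return images, builds


env_text = """\
# Found in: docker-compose.yml
PLATFORM_OUTPUT_SUPPORT_DATA_LOADER_DOCKERIZED_BUILD=https://github.com/thehyve/platform-output-support.git#rm/data-loader-dockerized
PYTHON_IMAGE=python:3.13-slim
REDIS_IMAGE=redis:7
"""
images, builds = classify_env(env_text)
```

Each `(local tag, build URL)` pair corresponds to one `docker build <url> -t <local tag>` followed by a Trivy scan of the tag.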

---

## 🎫 Issue lifecycle

| Condition | Action |
|---|---|
| 🆕 Vulnerability found, no existing issue | New issue created with `HIGH` / `CRITICAL` label(s) and CVE list |
| 🔄 Vulnerability found, issue already exists | Comment added to the existing issue with updated scan timestamp and CVE list |
| 🔄 Vulnerability found, issue already exists | Existing assignees replaced with the assignee resolved for this run, and a comment added with the updated scan timestamp and CVE list |
| ✅ No HIGH or CRITICAL vulnerabilities for an image | No issue created or updated |

Issues are matched by title (`Vulnerability detected in <image>`) and the caller-supplied `tag` label, so scans from different projects do not interfere with each other.
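
The matching rule might look like the following sketch. It assumes the open issues have already been fetched and are dictionaries shaped like GitHub REST API issue objects (`title`, `number`, and `labels` with a `name` field); the function name `find_matching_issue` is hypothetical and the `create-issue` action may implement the lookup differently.

```python
from typing import Optional


def find_matching_issue(
    open_issues: list[dict], artifact_id: str, tag: str
) -> Optional[int]:
    """Return the number of the open issue for this image and tag, if any."""
    expected_title = f"Vulnerability detected in {artifact_id}"
    for issue in open_issues:
        label_names = {label["name"] for label in issue.get("labels", [])}
        # Both the title and the caller-supplied label must match.
        if issue.get("title") == expected_title and tag in label_names:
            return issue["number"]
    return None


issues = [
    {"number": 7, "title": "Vulnerability detected in redis:7", "labels": [{"name": "project-a"}]},
    {"number": 9, "title": "Vulnerability detected in redis:7", "labels": [{"name": "project-b"}]},
]
match = find_matching_issue(issues, "redis:7", "project-b")
```

Requiring the label as well as the title is what keeps two projects that scan the same image from updating each other's issues.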