
feat: extract CI lifecycle and Prow job management skills from openshift/release#25

Merged
durandom merged 27 commits into
redhat-developer:main from
zdrapela:extract-ci-skills
May 26, 2026

@zdrapela

Member

Summary

  • Extract 10 skills from openshift/release .claude/skills/ into this repository
  • Rename with concern-first naming (lifecycle-*, prow-*) and deduplicate shared scripts
  • Add dual-mode support (local checkout or GitHub API via gh CLI) so skills work from any directory

New skills

Lifecycle (portable, pure API): lifecycle-ocp, lifecycle-aks, lifecycle-eks, lifecycle-gke

Prow job management (openshift/release): prow-ocp-jobs, prow-ocp-pools, prow-ocp-coverage, prow-aks-jobs, prow-eks-jobs, prow-gke-jobs

Shared infrastructure: _shared/resolve-repo.sh, _shared/fetch-yaml.sh, _shared/list-k8s-test-configs.sh

Key changes from originals

  • 3x list-k8s-test-configs.sh duplication eliminated (was identical in AKS/EKS/GKE)
  • 2x print-configured-versions.sh duplication eliminated (consolidated into fetch-yaml.sh)
  • ocp-lifecycle.jq cross-skill reference is now a clean sibling path
  • All read-only scripts auto-detect local openshift/release checkout or fall back to gh CLI
  • rhdh-decommission-release intentionally left in openshift/release (destructive, local-only)

…ift/release

Extract 10 skills from openshift/release .claude/skills/ into this
repository, renaming them with concern-first naming (lifecycle-*,
prow-*) and adding dual-mode support for local/remote repo access.

Lifecycle skills (portable, pure API):
- lifecycle-ocp: OCP version lifecycle via Red Hat Product Life Cycles API
- lifecycle-aks: AKS K8s version lifecycle via AKS release status API
- lifecycle-eks: EKS K8s version lifecycle via AWS docs
- lifecycle-gke: GKE K8s version lifecycle via endoflife.date API

Prow job management skills (openshift/release):
- prow-ocp-jobs: list/generate OCP cluster-claim test entries
- prow-ocp-pools: list/generate Hive ClusterPool YAML
- prow-ocp-coverage: cross-reference pools, jobs, and lifecycle data
- prow-aks-jobs: list AKS MAPT test entries
- prow-eks-jobs: list EKS MAPT test entries
- prow-gke-jobs: list GKE test entries

Shared infrastructure (skills/_shared/):
- resolve-repo.sh: auto-detect local checkout or GitHub API fallback
- fetch-yaml.sh: dual-mode file listing/fetching helpers
- list-k8s-test-configs.sh: deduplicated from 3 identical copies

Key improvements over originals:
- 3x list-k8s-test-configs.sh duplication eliminated
- 2x print-configured-versions.sh duplication eliminated
- ocp-lifecycle.jq cross-reference is now a clean sibling path
- All read-only scripts support dual-mode (local or gh CLI)
- rhdh-decommission-release intentionally left in openshift/release

Assisted-by: OpenCode
@zdrapela zdrapela force-pushed the extract-ci-skills branch from 9e46d99 to a386e38 May 13, 2026 13:26
zdrapela added 9 commits May 13, 2026 15:56
The hook was tracked as 100644 (non-executable), causing git to skip it
with a warning on every commit. Fresh clones would never run the hook.

Assisted-by: OpenCode
…ycle

Add stdlib-only Python modules replacing resolve-repo.sh and
ocp-lifecycle.jq:
- resolve_repo.py: auto-detect local openshift/release checkout or
  remote mode
- ocp_lifecycle.py: classify OCP versions by lifecycle phase, replacing
  the jq filter with pure Python date/string processing

Assisted-by: OpenCode
PEP 723 scripts with ruamel.yaml for reading configured versions from
CI config YAML files. API calls use urllib (stdlib). Both import
shared fetch_yaml module for dual-mode repo access.

Assisted-by: OpenCode
PEP 723 shared script list_k8s_test_configs.py replaces 3 identical
copies of list-k8s-test-configs.sh. Thin wrappers for AKS/EKS/GKE
import and call the shared module with platform-specific patterns.

Assisted-by: OpenCode
PEP 723 scripts replacing list-ocp-test-configs.sh and
generate-test-entry.sh. Use ruamel.yaml for YAML parsing and
generation, shared fetch_yaml module for dual-mode access.

Assisted-by: OpenCode
PEP 723 scripts replacing list-cluster-pools.sh and
generate-cluster-pool.sh. Uses ruamel.yaml for round-trip YAML
processing (preserving comments/order in generated pool files).
generate_cluster_pool.py replaces the recursive grep with
pathlib.rglob for imageSetRef lookup.

Assisted-by: OpenCode
PEP 723 script replacing the 475-line bash analyze-coverage.sh.
Uses Python dicts instead of bash 4+ associative arrays, imports
shared ocp_lifecycle module instead of external jq filter, and
shared fetch_yaml module for dual-mode repo access.

Assisted-by: OpenCode
Replace bash invocations with python/uv run for the new .py scripts.
Update Prerequisites from yq/jq/curl to Python 3.9+.

Assisted-by: OpenCode
Delete all 16 bash (.sh) and jq (.jq) scripts that have been replaced
by Python equivalents in the previous commits.

Assisted-by: OpenCode
@zdrapela zdrapela force-pushed the extract-ci-skills branch from 324f75e to 09752d2 May 13, 2026 14:15
zdrapela added 9 commits May 13, 2026 16:19
Replace \${SKILL_DIR}/scripts/ with scripts/ in all SKILL.md files
to align with the Agent Skills specification ("use relative paths
from the skill root") and the existing repo convention used by
rhdh-jira, overlay, and create-plugin skills.

Assisted-by: OpenCode
lifecycle-ocp and lifecycle-gke used python while all other scripts
used uv run. Standardize on uv run everywhere so agents don't need
to distinguish between invocation methods.

Assisted-by: OpenCode
…kills

Split lifecycle-ocp into two focused skills:
- lifecycle-rhdh: RHDH release lifecycle (GA dates, support phases,
  OCP compatibility per release)
- lifecycle-ocp: OCP version lifecycle only (support phases, EUS)

Extract shared RHDH lifecycle parsing into _shared/rhdh_lifecycle.py
to eliminate duplication between lifecycle-rhdh, lifecycle-ocp, and
prow-ocp-coverage.

Assisted-by: OpenCode
…cycle.py

Extract shared API client into redhat_lifecycle.py with product alias
support (rhbk, quay, rhdh, ocp, etc.) and consistent return shape.
Refactor rhdh_lifecycle.py and ocp_lifecycle.py as thin wrappers to
preserve backward compatibility with existing consumers.

Assisted-by: OpenCode
…e checks

Generic skill that queries the Red Hat Product Life Cycles API for any
product (RHBK, Quay, RHDH, OCP, etc.) using aliases or full names.
Supports --group-major for RHBK major version summaries, --active-only,
--json, and --list-products.

Assisted-by: OpenCode
…oviders

Aggregates PostgreSQL version EOL dates from endoflife.date for three
providers: upstream PostgreSQL, Amazon RDS, and Azure Database.
Shows side-by-side comparison of support dates per major version.

Assisted-by: OpenCode
Both scripts now support --json for structured output, matching the
existing --json support in GKE and OCP lifecycle scripts. Enables
programmatic consumption by other skills like rhdh-test-plan-review.

Assisted-by: OpenCode
…ow packages

Reorganize flat modules into two proper Python packages under _shared/:

- rhdh_lifecycle/ (lifecycle data from external APIs):
  redhat.py, rhdh.py, ocp.py, pg.py
- rhdh_prow/ (openshift/release repo access):
  repo.py, yaml.py, k8s_configs.py

The two packages have zero cross-dependencies, matching the natural
boundary between lifecycle queries and CI config management.

All 15 consumer scripts updated with new import paths.

Assisted-by: OpenCode
Move pg.py from _shared/rhdh_lifecycle/ into lifecycle-pg/scripts/
since it has exactly one consumer and zero dependencies on other
shared modules. Remove unused repo_path() from rhdh_prow/repo.py.

Assisted-by: OpenCode
@zdrapela zdrapela marked this pull request as ready for review May 13, 2026 16:00
Comment thread skills/_shared/rhdh_lifecycle/redhat.py Outdated
Comment thread skills/_shared/rhdh_lifecycle/ocp.py Outdated
Comment thread skills/_shared/rhdh_prow/repo.py Outdated
Comment thread skills/lifecycle/scripts/rhdh_lifecycle/rhdh.py
Comment thread skills/lifecycle/scripts/check_aks_lifecycle.py Outdated
Comment thread skills/lifecycle/scripts/check_aks_lifecycle.py
Comment thread skills/lifecycle/scripts/rhdh_lifecycle/pg.py Outdated
Comment thread skills/lifecycle/scripts/check_aks_lifecycle.py Outdated
Comment thread skills/_shared/rhdh_prow/yaml.py Outdated
Comment thread skills/_shared/rhdh_lifecycle/rhdh.py Outdated
zdrapela and others added 5 commits May 15, 2026 18:12
- fetch_api returns None instead of sys.exit(1) (library code should
  not kill the caller's process). Callers now check for None.
- Rename _is_date/_to_date/_ver_sort_key to public (is_date/to_date/
  ver_sort_key) since they are imported across module boundaries.
- Replace __import__('sys').stderr with proper import sys.
- Extract _enrich_rhdh_version helper to deduplicate enrichment loop.
- Remove shebangs and PEP 723 blocks from library-only modules that
  have no __main__ block.

Assisted-by: OpenCode
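
The first item in the commit above (library code returning None rather than exiting the process) can be sketched as follows. This is a minimal illustration under stated assumptions: `fetch_json_or_none` and `main` are hypothetical names and do not reproduce the PR's actual `fetch_api`.

```python
import json
import sys
import urllib.error
import urllib.request
from typing import Any, Optional


def fetch_json_or_none(url: str, timeout: float = 10.0) -> Optional[Any]:
    """Return parsed JSON from url, or None if fetching or parsing fails.

    Hypothetical helper: the library layer reports the problem on stderr
    and leaves the decision about exiting to the caller.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.load(resp)
    except (urllib.error.URLError, OSError, ValueError) as exc:
        # ValueError covers both malformed URLs and invalid JSON.
        print(f"warning: could not fetch {url}: {exc}", file=sys.stderr)
        return None


def main(url: str) -> int:
    """Caller side: check for None and choose the exit code here."""
    data = fetch_json_or_none(url)
    if data is None:
        return 1
    print(json.dumps(data, indent=2))
    return 0
```

Only the script entry point would then turn the return value into a process exit, for example with `sys.exit(main(url))`.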
Replace 14 fine-grained lifecycle-* and prow-* skills with 2
consolidated skills using the router + workflow pattern (matching
overlay and rhdh-local conventions):

- lifecycle/ skill: router SKILL.md + 7 workflow files + all lifecycle
  scripts. rhdh_lifecycle package is self-contained under scripts/.
- prow/ skill: router SKILL.md + 5 workflow files + all prow scripts.
  rhdh_prow package is self-contained under scripts/.

Key changes:
- Eliminates _shared/ directory entirely. Each skill is fully
  self-contained per the Agent Skills spec.
- analyze_coverage.py breaks the cross-skill dependency by calling
  lifecycle scripts via subprocess --json (with direct API fallback).
- AKS/EKS configured version display inlined into lifecycle skill
  via configured_versions.py (no dependency on rhdh_prow).
- 14 symlinks reduce to 2.

Assisted-by: OpenCode
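
The subprocess boundary described in the commit above (calling another skill's script with `--json` and falling back when that fails) might look roughly like this. It is a sketch only: `run_json_script` and `load_lifecycle` are hypothetical names, and the real `analyze_coverage.py` may structure its fallback differently.

```python
import json
import subprocess
import sys
from typing import Any, Callable, List, Optional


def run_json_script(cmd: List[str], timeout: float = 60.0) -> Optional[Any]:
    """Run cmd and parse its stdout as JSON; return None on any failure."""
    try:
        proc = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout, check=True
        )
        return json.loads(proc.stdout)
    except (OSError, subprocess.SubprocessError, ValueError):
        # Missing executable, non-zero exit, timeout, or stdout that is not JSON.
        return None


def load_lifecycle(cmd: List[str], fallback: Callable[[], Any]) -> Any:
    """Prefer the sibling script's JSON output; otherwise call the fallback."""
    data = run_json_script(cmd)
    return data if data is not None else fallback()
```

Because the only contract between the two skills is the JSON printed on stdout, neither side needs to import the other's Python package.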
Python automatically adds the script's directory to sys.path[0]
when running a file. Since rhdh_lifecycle/ and rhdh_prow/ packages
are siblings of the scripts in the same directory, imports resolve
naturally without sys.path manipulation. Also removes unused Path
imports that were only needed for the sys.path line.

Aligns with existing repo conventions -- no other skill scripts
(overlay, rhdh-jira, create-plugin, rhdh-local) use sys.path.insert.

Assisted-by: OpenCode
- Add missing Action section to check-pg.md workflow
- Expand routing patterns in lifecycle SKILL.md (OpenShift EOL/support,
  PG alias, Red Hat Build of Keycloak full name)
- Expand routing patterns in prow SKILL.md (add OCP version, clean up
  old release)

Assisted-by: OpenCode
Move repo resolution, YAML I/O, configured-versions, and pg-lifecycle
into the rhdh_lifecycle package. Prow accesses them via a symlink and
thin re-export facades in rhdh_prow, requiring zero import changes in
prow consumer scripts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@zdrapela

zdrapela commented May 18, 2026

Member Author

Thank you @gustavolira for the review! I incorporated your suggestions. I don't use Python that often, and I appreciate them.

I then thought about the skill structure and it looks like a better idea to move them into two bigger skills with dedicated workflows.

@durandom durandom left a comment

Member

Review Summary

Well-structured PR — the consolidated 2-skill architecture with router + workflow pattern is clean, the dependency graph is acyclic, and the subprocess boundary between prow and lifecycle data is the right call. Two bugs to fix before merge.


🐛 Bug: check_aks_lifecycle.py — JSON output corruption

When both --test-pattern and --json are passed, print_configured_versions() writes human-readable text to stdout before the JSON blob, corrupting the output. The EKS script correctly guards this:

# check_eks_lifecycle.py (correct)
if args.test_pattern and not args.json_output:
    print_configured_versions(...)

The AKS script is missing the not args.json_output guard. This matters because analyze_coverage.py could call this script with --json.

Fix: Add and not args.json_output to the if args.test_pattern: condition in check_aks_lifecycle.py.
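
A minimal sketch of the guard described above, assuming an argparse-based CLI. The flag names follow the review (`--test-pattern`, and `--json` stored as `json_output`); the `main` function, the placeholder version list, and the printed text are illustrative assumptions rather than the actual `check_aks_lifecycle.py`.

```python
import argparse
import json
from typing import List, Optional


def main(argv: Optional[List[str]] = None) -> int:
    parser = argparse.ArgumentParser()
    parser.add_argument("--test-pattern")
    parser.add_argument("--json", dest="json_output", action="store_true")
    args = parser.parse_args(argv)

    versions = ["1.30", "1.31"]  # placeholder data, not real API output

    # Human-readable text goes to stdout only when JSON was NOT requested,
    # so that --json always yields a single parseable document.
    if args.test_pattern and not args.json_output:
        print(f"Configured versions matching {args.test_pattern}: ...")

    if args.json_output:
        print(json.dumps({"versions": versions}))
    else:
        for version in versions:
            print(version)
    return 0
```

With the guard in place, a consumer can pass the captured stdout straight to `json.loads` even when `--test-pattern` is also given.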


🔒 Security: redhat.py:fetch_api — Insufficient URL encoding

product_name.replace(' ', '+') is not proper URL encoding. Characters like &, =, #, ? pass through unescaped into the query string. Currently mitigated by PRODUCT_ALIASES, but resolve_product_name falls through to the raw user string for unknown aliases — so check_lifecycle.py --product "foo&bar=baz" would inject into the URL.

Fix: Replace with urllib.parse.quote_plus(product_name).
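
The difference between the two encodings can be checked with the standard library alone. In this sketch the `name` query parameter and the helper names are assumptions made for illustration; only `urllib.parse.quote_plus` and `parse_qs` are real library calls.

```python
from urllib.parse import parse_qs, quote_plus


def naive_encode(product_name: str) -> str:
    """The approach flagged in the review: only spaces are handled."""
    return product_name.replace(" ", "+")


def build_query(product_name: str, encode=quote_plus) -> str:
    """Build a `name=<value>` query string with the given encoder."""
    return "name=" + encode(product_name)


hostile = "foo&bar=baz"

# Naive encoding lets the input smuggle in a second parameter.
injected = parse_qs(build_query(hostile, naive_encode))

# quote_plus escapes & and =, so the whole string stays one value.
safe = parse_qs(build_query(hostile))
```

Here `injected` contains both a `name` and a `bar` key, while `safe` contains only `name` with the full original string as its value. Ordinary names are unaffected: `quote_plus("Red Hat Build of Keycloak")` still produces `Red+Hat+Build+of+Keycloak`.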


💡 Suggestion: Replace symlink with copied modules (Windows compatibility)

The symlink prow/scripts/rhdh_lifecycle → ../../lifecycle/scripts/rhdh_lifecycle breaks silently on Windows with core.symlinks = false: git checks the symlink out as a plain text file containing the target path and reports no error, and Python imports of the package then fail.

Only 2 facade files in prow actually touch the symlink (rhdh_prow/repo.py and rhdh_prow/yaml.py), re-exporting ~180 lines of infrastructure code. The lifecycle data functions are already accessed via subprocess --json in analyze_coverage.py.

Proposed alternative: copy repo.py (63 lines) and yaml.py (116 lines) directly into rhdh_prow/ as standalone modules. Delete the symlink. Each skill becomes truly self-contained, no cross-platform hazard, no sys.path manipulation needed.


Minor (non-blocking, good for follow-up)

  • ocp.py: api_data.get("data", [{}])[0] will IndexError if data is an empty list []. Guard like redhat.py does.
  • generate_test_entry.py: --reference format is not validated (only --version is). Malformed input crashes at split(".").
  • list_aks_jobs.py / list_eks_jobs.py / list_gke_jobs.py: sys.argv mutation is fragile — use main(["--pattern", ...] + sys.argv[1:]) instead.
  • check-redhat.md: Missing Output and Action sections. check-ocp.md and check-rhdh.md also lack Action sections.
  • Unresolved from prior review: duplicated fetch_json boilerplate (3 copies), copy-pasted EOL cross-verification block (AKS↔EKS), inline version sort lambdas vs. existing ver_sort_key.
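
The first minor item can be illustrated in a few lines. The default passed to `dict.get` only applies when the key is missing, so an explicit empty list still reaches `[0]`. The helper name `first_entry` is hypothetical; the guard in the PR's `redhat.py` may be written differently.

```python
from typing import Any, Dict


def first_entry(api_data: Dict[str, Any]) -> Dict[str, Any]:
    """Return the first item of api_data["data"], or {} if there is none.

    Covers a missing key, a None value, and an empty list alike.
    """
    entries = api_data.get("data") or []
    return entries[0] if entries else {}
```

Callers can then use `first_entry(api_data).get(...)` without a separate emptiness check.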

zdrapela added 3 commits May 20, 2026 15:53
- Fix AKS JSON output corruption: add missing `not args.json_output`
  guard to `print_configured_versions` call (matching EKS script)
- Fix URL encoding in redhat.py: use `urllib.parse.quote_plus` instead
  of naive `.replace(' ', '+')` to prevent query string injection
- Replace rhdh_lifecycle symlink with copied modules for Windows
  compatibility; add AGENTS.md sync rule for the shared files
- Guard ocp.py against empty API `data` list (prevent IndexError)
- Validate `--reference` format in generate_test_entry.py (X.Y)
- Fix sys.argv mutation in list_aks/eks/gke_jobs.py: pass explicit
  argv to main() instead of modifying global state
- Add missing Output/Action sections to check-redhat.md, check-ocp.md,
  and check-rhdh.md workflow docs

Assisted-by: OpenCode
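
The `sys.argv` item in the commit above relies on `argparse` accepting an explicit argument list. A small sketch, assuming a shared `main` that takes `--pattern`; the wrapper name `aks_wrapper` and the pattern value `"aks"` are hypothetical.

```python
import argparse
import sys
from typing import List, Optional


def main(argv: Optional[List[str]] = None) -> int:
    """Shared entry point; argv=None makes argparse read sys.argv[1:]."""
    parser = argparse.ArgumentParser(description="List test configs (illustrative).")
    parser.add_argument("--pattern", required=True)
    parser.add_argument("--json", dest="json_output", action="store_true")
    args = parser.parse_args(argv)
    print(f"pattern={args.pattern} json={args.json_output}")
    return 0


def aks_wrapper(cli_args: List[str]) -> int:
    """Thin wrapper: prepend the platform pattern without touching sys.argv."""
    return main(["--pattern", "aks"] + cli_args)


if __name__ == "__main__":
    sys.exit(aks_wrapper(sys.argv[1:]))
```

Since the wrapper builds a new list, the global `sys.argv` is left unchanged and `main` can be called from tests or other modules with any arguments.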
- Extract shared fetch_json into rhdh_lifecycle.redhat; remove local
  copies from check_aks_lifecycle.py, check_eks_lifecycle.py,
  check_gke_lifecycle.py, and pg.py
- Extract filter_supported_eol_entries helper to deduplicate the
  identical EOL cross-verification blocks in AKS and EKS scripts
- Replace all inline version sort lambdas with ver_sort_key from
  rhdh_lifecycle.redhat (lifecycle scripts) and rhdh_prow.__init__
  (prow scripts); add AGENTS.md sync rule for the prow copy

Assisted-by: OpenCode
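
The reason for a shared version sort key is that plain string sorting orders `4.10` before `4.2`. The sketch below uses a hypothetical stand-in named `numeric_version_key`; the PR's `ver_sort_key` is not shown in this conversation and may handle more input shapes.

```python
from typing import Tuple


def numeric_version_key(version: str) -> Tuple[int, ...]:
    """Turn "4.10" into (4, 10) so versions compare numerically.

    Assumes dot-separated integers; raises ValueError on anything else.
    """
    return tuple(int(part) for part in version.split("."))


versions = ["4.9", "4.10", "4.2"]
lexicographic = sorted(versions)                       # string order
numeric = sorted(versions, key=numeric_version_key)    # numeric order
```

Here `lexicographic` is `["4.10", "4.2", "4.9"]`, while `numeric` is `["4.2", "4.9", "4.10"]`.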
Move fetch_json, ver_sort_key, is_date, to_date, and
filter_supported_eol_entries from redhat.py into a new
rhdh_lifecycle/utils.py module. These functions are used by
AKS/EKS/GKE/PG scripts that have no relation to the Red Hat
Product Life Cycles API.

Create rhdh_prow/utils.py with ver_sort_key (subset of lifecycle
utils). Restore rhdh_prow/__init__.py to docstring-only.

Update all imports across lifecycle and prow scripts. redhat.py
re-exports is_date, to_date, ver_sort_key for backward
compatibility.

Update AGENTS.md shared modules sync rule to reference utils.py.

Assisted-by: OpenCode
@zdrapela

Member Author

Hi @durandom, I addressed your review and also the remaining items from @gustavolira.

About 💡 Suggestion: Replace symlink with copied modules (Windows compatibility): my only worry is that the copies could drift out of sync, with fixes or improvements to one not being applied to the other. Would a line in AGENTS.md about keeping them in sync do the trick?

@durandom durandom left a comment

Member

All previous feedback thoroughly addressed — nice work on the consolidation.

4 non-blocking items for follow-up:

  1. analyze_coverage.py fallback _fetch_api still uses .replace(' ', '+') — inconsistent with the quote_plus fix in redhat.py
  2. analyze_coverage.py fallback parsers use [{}][0] which will IndexError on empty data: [] — same guard as ocp.py/redhat.py should be applied
  3. generate_cluster_pool.py is missing --reference format validation (was fixed in generate_test_entry.py but not here)
  4. ruamel.yaml via PEP 723 inline metadata technically diverges from the stdlib-only rule — might warrant an ADR update or documented exception

@durandom durandom merged commit 4795808 into redhat-developer:main May 26, 2026
4 checks passed
durandom pushed a commit that referenced this pull request Jun 5, 2026
Replace .replace(' ', '+') with urllib.parse.quote_plus() for proper
URL encoding of product names in the fallback _fetch_api function.

Guard api_data.get('data', [{}])[0] against empty data arrays to
prevent IndexError when the API returns no results.

Addresses non-blocking review items from PR #25.

Assisted-by: OpenCode
durandom pushed a commit that referenced this pull request Jun 5, 2026
Validate that --reference matches X.Y format before splitting, matching
the existing validation in generate_test_entry.py. Previously, a
malformed --reference value would crash at split('.').

Addresses non-blocking review item from PR #25.

Assisted-by: OpenCode
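
Validating an X.Y value before splitting it could be done as in the sketch below. The function name `parse_xy` and the error message are assumptions; the actual check in `generate_cluster_pool.py` and `generate_test_entry.py` is not shown in this conversation.

```python
import re
from typing import Tuple

# Exactly two dot-separated groups of digits, e.g. "4.16".
_XY_PATTERN = re.compile(r"(\d+)\.(\d+)")


def parse_xy(value: str) -> Tuple[int, int]:
    """Return (major, minor) for an X.Y string, or raise ValueError."""
    match = _XY_PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f"expected X.Y (for example 4.16), got {value!r}")
    return int(match.group(1)), int(match.group(2))
```

Since `argparse` reports a `ValueError` raised by a `type=` callable as an invalid argument, a function like this could also be passed as `type=parse_xy` so that malformed input is rejected during argument parsing.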
durandom pushed a commit that referenced this pull request Jun 5, 2026
Document that prow skill scripts use ruamel.yaml via PEP 723 inline
script metadata for round-trip YAML fidelity. The dependency is managed
by uv run --script with no user-facing install step.

Addresses non-blocking review item from PR #25.

Assisted-by: OpenCode
durandom pushed a commit that referenced this pull request Jun 5, 2026
Add Platform Lifecycle and CI / Prow sections covering the three skills
added via PRs #25 and #26 that were missing from the README.

Assisted-by: OpenCode

3 participants