feat(telco-kpis): Add Gitea role for test report publishing by ccardenosa · Pull Request #451 · openshift-kni/eco-ci-cd

ccardenosa · 2026-05-06T14:04:03Z

Implement Gitea deployment and report publishing infrastructure for hosting Telco-KPIs test reports with vault integration and retention policies.

Problem

Telco-KPIs test reports need centralized hosting accessible to test engineers and stakeholders. Reports generated on bastion hosts need automated publishing to a Git-based repository system with retention management.

Solution

Created gitea Ansible role for deploying Gitea server and publishing test reports as Markdown files with compressed artifacts.

Role: playbooks/telco-kpis/roles/gitea/

Features

Deployment:

Podman-based Gitea server deployment on bastion
SQLite database with automatic migration handling
Firewall configuration (port 3000)
Checks that Gitea is reachable over HTTP instead of only checking that the container exists
Admin user creation with API token management

Report Publishing:

Creates organization and repositories automatically
Publishes Markdown reports via Gitea API
Uploads compressed tarball as release artifact
Updates repository README with latest report links
Retention policy: keeps last 15 reports, removes older ones

Vault Integration:

Gitea credentials stored in Ansible vault
Secure API token management
Credential validation before operations
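
The credential validation step might look roughly like the following. This is a minimal sketch, not the role's actual tasks: `gitea_vault_admin_user` and `gitea_vault_api_token` are hypothetical variable names (the description only names `gitea_vault_org` and `gitea_vault_repo`), and the token check assumes Gitea's `/api/v1/user` endpoint with a `token` authorization header.

```yaml
# Sketch only: variable names other than gitea_http_port are hypothetical.
- name: Assert that the vault-provided Gitea credentials are present
  ansible.builtin.assert:
    that:
      - gitea_vault_admin_user is defined
      - gitea_vault_admin_user | length > 0
      - gitea_vault_api_token is defined
      - gitea_vault_api_token | length > 0
    fail_msg: "Gitea credentials are missing from the vault"
    quiet: true
  no_log: true  # keep secrets out of the task output

- name: Confirm that Gitea accepts the API token
  ansible.builtin.uri:
    url: "http://localhost:{{ gitea_http_port }}/api/v1/user"
    method: GET
    headers:
      Authorization: "token {{ gitea_vault_api_token }}"
    status_code: 200
  no_log: true
```

Running the cheap `assert` first means a missing vault entry fails with a clear message before any HTTP request is made.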

Implementation Details

Role Structure:

tasks/main.yml - Entry point with operation dispatch
tasks/deploy.yml - Gitea server deployment
tasks/initialize.yml - Initial configuration and admin setup
tasks/validate-credentials.yml - Vault credential validation
tasks/create-repository.yml - Repository creation
tasks/publish-report.yml - Report publishing workflow
defaults/main.yml - Default variables
templates/README.md.j2 - Repository README template

Task Operations:

gitea_operation: deploy - Deploy and initialize Gitea server
gitea_operation: publish - Publish test report
gitea_operation: validate - Validate vault credentials
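
The dispatch in `tasks/main.yml` could be structured along these lines. This is a sketch under assumptions: the mapping of task files to operations is inferred from the file names listed above and is not confirmed by the PR.

```yaml
# tasks/main.yml (sketch): route to task files based on gitea_operation.
- name: Reject unknown operations early
  ansible.builtin.assert:
    that:
      - gitea_operation is defined
      - gitea_operation in ['deploy', 'publish', 'validate']
    fail_msg: "Unsupported gitea_operation: {{ gitea_operation | default('undefined') }}"

- name: Deploy and initialize the Gitea server
  when: gitea_operation == 'deploy'
  block:
    - name: Run deployment tasks
      ansible.builtin.include_tasks: deploy.yml

    - name: Run initialization tasks
      ansible.builtin.include_tasks: initialize.yml

# Assumption: publishing validates credentials first.
- name: Validate vault credentials
  ansible.builtin.include_tasks: validate-credentials.yml
  when: gitea_operation in ['validate', 'publish']

- name: Publish the test report
  when: gitea_operation == 'publish'
  block:
    - name: Ensure the organization and repository exist
      ansible.builtin.include_tasks: create-repository.yml

    - name: Publish the report and artifact
      ansible.builtin.include_tasks: publish-report.yml
```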

Usage

Deploy Gitea:

- name: Deploy Gitea server
  ansible.builtin.include_role:
    name: gitea
  vars:
    gitea_operation: deploy
    gitea_vault_org: telco-kpis

Publish report:

- name: Publish test report
  ansible.builtin.include_role:
    name: gitea
  vars:
    gitea_operation: publish
    gitea_vault_org: telco-kpis
    gitea_vault_repo: hlxcl7-reports
    gitea_report_file: /path/to/report.md
    gitea_artifact_file: /path/to/artifacts.tar.gz

Key Features

Firewall Management:

Detects firewalld vs. iptables
Adds port 3000 rule if not present
Handles both firewall backends
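
A possible shape for the firewall handling is sketched below. It assumes the `ansible.posix` collection is installed, that privilege escalation is available, and that "firewalld is running" is the criterion for choosing the backend; the role's real detection logic may differ.

```yaml
# Sketch only: backend detection via service facts is an assumption.
- name: Gather service facts to see which firewall backend is active
  ansible.builtin.service_facts:

- name: Open the Gitea port with firewalld when it is running
  ansible.posix.firewalld:
    port: "{{ gitea_http_port }}/tcp"
    permanent: true
    immediate: true
    state: enabled
  become: true
  when: ansible_facts.services.get('firewalld.service', {}).get('state') == 'running'

- name: Fall back to an iptables rule when firewalld is not running
  ansible.builtin.iptables:
    chain: INPUT
    protocol: tcp
    destination_port: "{{ gitea_http_port }}"
    jump: ACCEPT
    comment: Allow Gitea HTTP
  become: true
  when: ansible_facts.services.get('firewalld.service', {}).get('state') != 'running'
```

Both modules are idempotent, which matches the "adds the rule if not present" behaviour described above without a separate existence check.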

Database Migration:

Waits for database initialization on first run
Handles migration errors gracefully
Retries admin user creation after migration
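
The wait-then-retry behaviour could be expressed as follows. This is a hedged sketch: `gitea_admin_user`, `gitea_admin_password`, and `gitea_admin_email` are hypothetical variables, running the CLI as the `git` user inside the container is an assumption about the image, and the "already exists" string match is a guess at the CLI's error text.

```yaml
# Sketch only: admin variables and the error-text match are assumptions.
- name: Wait until Gitea answers HTTP requests after first start
  ansible.builtin.uri:
    url: "http://localhost:{{ gitea_http_port }}/"
    status_code: 200
  register: gitea_ready
  until: gitea_ready.status | default(-1) == 200
  retries: 30
  delay: 5

- name: Create the admin user, retrying while the database settles
  ansible.builtin.command:
    argv:
      - podman
      - exec
      - --user
      - git
      - "{{ gitea_container_name }}"
      - gitea
      - admin
      - user
      - create
      - --admin
      - --username
      - "{{ gitea_admin_user }}"
      - --password
      - "{{ gitea_admin_password }}"
      - --email
      - "{{ gitea_admin_email }}"
  register: gitea_admin_create
  changed_when: gitea_admin_create.rc == 0
  # Treat an existing user as success so reruns stay idempotent.
  failed_when:
    - gitea_admin_create.rc != 0
    - "'already exists' not in (gitea_admin_create.stdout ~ gitea_admin_create.stderr)"
  until: gitea_admin_create is not failed
  retries: 5
  delay: 10
  no_log: true
```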

Repository Retention:

Keeps last 15 reports per repository
Automatically deletes older reports
Prevents unbounded repository growth
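
One way the retention step could work is sketched below. It assumes reports are committed as files under a hypothetical `reports/` directory with names that sort chronologically (for example, a timestamp prefix), and that `gitea_vault_api_token` is a hypothetical token variable. It uses Gitea's repository contents API to list and delete files; cleanup of release artifacts is not covered.

```yaml
# Sketch only: the reports/ layout and token variable are assumptions.
- name: List published reports
  ansible.builtin.uri:
    url: "http://localhost:{{ gitea_http_port }}/api/v1/repos/{{ gitea_vault_org }}/{{ gitea_vault_repo }}/contents/reports"
    headers:
      Authorization: "token {{ gitea_vault_api_token }}"
    status_code: 200
  register: gitea_reports

- name: Select everything except the newest 15 reports
  ansible.builtin.set_fact:
    gitea_reports_to_delete: >-
      {{ (gitea_reports.json
          | selectattr('type', 'equalto', 'file')
          | sort(attribute='name', reverse=true))[15:] }}

- name: Delete reports beyond the retention limit
  ansible.builtin.uri:
    url: "http://localhost:{{ gitea_http_port }}/api/v1/repos/{{ gitea_vault_org }}/{{ gitea_vault_repo }}/contents/{{ item.path }}"
    method: DELETE
    headers:
      Authorization: "token {{ gitea_vault_api_token }}"
    body_format: json
    body:
      sha: "{{ item.sha }}"  # the contents API requires the blob SHA
      message: "Retention: remove {{ item.name }}"
    status_code: 200
  loop: "{{ gitea_reports_to_delete }}"
  loop_control:
    label: "{{ item.name }}"
```

Sorting newest-first and slicing from index 15 keeps the loop empty until a sixteenth report exists, so the tasks are safe to run on every publish.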

Error Handling:

Comprehensive API error checking
Retries for transient failures
Detailed error messages

Benefits

Centralized test report hosting
Automated report publishing workflow
Secure credential management via vault
Retention policy prevents storage bloat
Accessible web UI for stakeholders
Git-based versioning of reports

Integration

Used by playbooks/telco-kpis/generate-report.yml to publish aggregated test reports from all Telco-KPIs tests (node-info, BIOS validation, performance tests, deployment timeline).

Related: Telco-KPIs test infrastructure, report generation system

openshift-ci · 2026-05-06T14:04:13Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign shaior for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

eifrach · 2026-05-06T14:13:01Z

+- name: Check if Gitea is accessible via localhost
+  ansible.builtin.uri:
+    url: "http://localhost:{{ gitea_http_port }}/"
+    method: GET
+    status_code: 200
+    validate_certs: false
+  register: gitea_accessible
+  failed_when: false
+  changed_when: false


I think it would be better to add a small retry here in case there is some network latency or a transient issue.
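
One way to act on this suggestion is sketched below; the retry count and delay are arbitrary illustrative values. As far as I can tell, a task that exhausts its retries is marked failed even with `failed_when: false`, so this sketch swaps in `ignore_errors: true` to keep the check a non-fatal probe.

```yaml
# Sketch only: retry count and delay are illustrative values.
- name: Check if Gitea is accessible via localhost
  ansible.builtin.uri:
    url: "http://localhost:{{ gitea_http_port }}/"
    method: GET
    status_code: 200
    validate_certs: false
  register: gitea_accessible
  until: gitea_accessible.status | default(-1) == 200
  retries: 3
  delay: 5
  ignore_errors: true  # exhausted retries mark the task failed; do not stop the play
  changed_when: false
```

Later tasks can still branch on `gitea_accessible.status` to decide whether a deployment is needed.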

eifrach · 2026-05-06T14:15:40Z

+            podman run -d
+            --name {{ gitea_container_name }}


Suggested change

-            podman run -d
+            podman run -d --rm
             --name {{ gitea_container_name }}

Implement role-based container image mirroring system for internal registry management, supporting both mirror and removal operations with authentication.

## Problem

Telco-KPIs testing requires mirroring container images to internal registries for disconnected environments and test image management. Previous approach used inline playbook tasks without reusability.

## Solution

Created dedicated `container_image_mirror` Ansible role with playbooks for both mirroring and removal operations:

**Role: playbooks/roles/container_image_mirror/**

- Supports 'mirror' and 'remove' operations via parameter
- Uses skopeo for image operations
- Handles authentication with pull secrets
- Continues operation even if some images fail
- Comprehensive success/failure reporting with summary

**Playbooks:**

- `playbooks/mirror-images.yml` - Mirror images to internal registry
- `playbooks/remove-images.yml` - Remove images from registry storage

## Features

**Authentication:**

- Pull secret support for private registries (via pull_secret_string or pull_secret_path)
- System default auth when no pull secret provided
- Configurable auth file location (/tmp for bastion compatibility)
- use_pull_secret flag to control authentication method

**Registry Configuration:**

- Configurable registry host/port/namespace
- TLS verification control
- Source and destination registry support

**Operations:**

- Idempotent with existence checks
- Detailed mirror/removal summary
- Error handling continues operation on failures

## Usage

**Mirror images:**

```bash
ansible-playbook playbooks/mirror-images.yml \
  -e images='[{"source": "quay.io/image:tag", "dest": "registry.local/namespace/image:tag"}]' \
  -e registry_host=registry.local \
  -e pull_secret_string='{"auths": {...}}'
```

**Remove images:**

```bash
ansible-playbook playbooks/remove-images.yml \
  -e images='[{"dest": "registry.local/namespace/image:tag"}]' \
  -e registry_host=registry.local
```

## Implementation Details

**Role Structure:**

- `defaults/main.yaml` - Default variables
- `tasks/main.yaml` - Entry point with operation dispatch
- `tasks/mirror.yaml` - Mirror images using skopeo
- `tasks/remove.yaml` - Remove images from registry storage
- `meta/main.yaml` - Role metadata
- `README.md` - Comprehensive documentation

**Key Variables:**

- `container_image_mirror_operation`: "mirror" or "remove"
- `container_image_mirror_images`: List of image objects
- `container_image_mirror_registry_host`: Target registry hostname
- `container_image_mirror_pull_secret_string`: JSON pull secret
- `container_image_mirror_use_pull_secret`: Enable/disable authentication

## Benefits

- Reusable role for both mirror and removal operations
- Cleaner separation of concerns
- Easier to test and maintain
- Follows eco-ci-cd role patterns (like ocp_operator_mirror)
- Well-documented with examples
- Jenkins job compatible (uses same variable names)

## Jenkins Integration

Used by `telco-kpis-mirror-ran-test-images` Jenkins job for mirroring RAN test images to internal registries.

Related: Telco-KPIs test infrastructure

Signed-off-by: Carlos Cardenosa <ccardeno@redhat.com>
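
The "continue on failure, then summarize" behaviour of the mirror operation could be sketched like this. It is an illustration under assumptions, not the role's `tasks/mirror.yaml`: `mirror_auth_file` is a hypothetical variable, and TLS verification is simply switched off for the destination.

```yaml
# Sketch only: mirror_auth_file is hypothetical; TLS settings are illustrative.
- name: Mirror each image, recording failures instead of aborting
  ansible.builtin.command:
    argv:
      - skopeo
      - copy
      - --authfile
      - "{{ mirror_auth_file }}"
      - --dest-tls-verify=false
      - "docker://{{ item.source }}"
      - "docker://{{ item.dest }}"
  loop: "{{ container_image_mirror_images }}"
  loop_control:
    label: "{{ item.dest }}"
  register: mirror_results
  failed_when: false  # keep going when a single image fails
  changed_when: mirror_results.rc == 0

- name: Report which images were mirrored and which failed
  ansible.builtin.debug:
    msg:
      mirrored: "{{ mirror_results.results | selectattr('rc', 'equalto', 0) | map(attribute='item.dest') | list }}"
      failed: "{{ mirror_results.results | rejectattr('rc', 'equalto', 0) | map(attribute='item.dest') | list }}"
```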

Add parse-lockdown.yml playbook that extracts deployment parameters from lockdown JSON files, enabling decoupled parameter management for reproducible deployments.

## Problem

Telco-KPIs testing requires exact software versions (OCP releases, operator channels, catalogs) for reproducible deployments. Parsing lockdown JSON inline within deployment playbooks creates tight coupling and makes parameter reuse across multiple jobs difficult.

## Solution

Implement standalone parser playbook that runs before deployment jobs:

1. Downloads and parses lockdown JSON from URI
2. Auto-detects lockdown type (hub vs spoke) from JSON structure
3. Extracts deployment parameters
4. Outputs in shell env and JSON formats for downstream consumption

## Changes

**New playbook: playbooks/telco-kpis/parse-lockdown.yml**

- Auto-detection logic: checks for 'hub' vs 'deployment' key in JSON
- Hub parsing: extracts OCP_RELEASE_IMAGE, ACM_CHANNEL, MCE_CHANNEL, catalogs
- Spoke parsing: extracts OCP_PULL_SPEC, ZTP_PULL_SPEC, operator configurations
- SSL certificate bypass for internal GitLab instances (validate_certs: false)
- Dynamic artifact naming using lockdown filename from URI
- Outputs three artifacts per run:
  - `{lockdown-name}.json`: Original lockdown file
  - `{lockdown-name}-params.env`: Shell environment variables
  - `{lockdown-name}-params.json`: Structured JSON parameters

**New role: playbooks/telco-kpis/roles/lockdown_hub_config/**

- tasks/main.yml: Download, validate, and parse hub lockdown JSON
- defaults/main.yml: Default configuration values
- README.md: Comprehensive role documentation
- Used by both parse-lockdown.yml and deploy-ocp-operators.yml

## Usage Workflow

**Step 1: Parse lockdown file**

```bash
ansible-playbook playbooks/telco-kpis/parse-lockdown.yml \
  -e lockdown_uri=https://gitlab.cee.redhat.com/.../lockdown-hub-x86_64.json
```

**Step 2: Use extracted parameters**

```bash
# Source env file
source lockdown-hub-x86_64-params.env

# Use in deployment
ansible-playbook playbooks/deploy-ocp-sno.yml \
  -e release="${OCP_RELEASE_IMAGE}"
```

## Benefits

**Decoupling:**

- Parsing separated from deployment logic
- Parameters extracted once, reused across multiple jobs
- Easier debugging with explicit parameter artifacts

**Flexibility:**

- Supports multiple lockdown formats (hub, spoke, baseline)
- Self-documenting artifacts with actual lockdown names
- Both shell and JSON output formats

**Prow-ready:**

- Clean separation aligns with Prow step registry architecture
- Parser step can run independently, output shared via SHARED_DIR

## Key Features

**Auto-detection:**

```yaml
lockdown_type: "{{ 'hub' if ('hub' in lockdown_data) else 'spoke' }}"
```

**Dynamic artifact naming:**

```yaml
lockdown_filename: "{{ lockdown_uri | regex_replace('.*/', '') | regex_replace('.json$', '') }}"
# Result: lockdown-hub-x86_64-params.env (not generic lockdown-params.env)
```

**Hub channel transformations:**

```yaml
hub_acm_channel: "release-{{ lockdown_data.hub.acm.version_override }}"
hub_mce_channel: "{{ lockdown_data.hub.acm.mce_override | regex_replace('^v', 'stable-') | regex_replace('\\.\\d+$', '') }}"
```

## Example Artifacts

**lockdown-hub-x86_64-params.env:**

```bash
LOCKDOWN_TYPE=hub
OCP_RELEASE_IMAGE=quay.io/openshift-release-dev/ocp-release:4.20.4-x86_64
OCP_VERSION=4.20
ACM_CHANNEL=release-2.13
MCE_CHANNEL=stable-2.8
TALM_CATALOG=quay.io/.../talm-index:v4.20
GITOPS_CATALOG=quay.io/.../gitops-index:v1.15
```

**lockdown-hub-x86_64-params.json:**

```json
{
  "LOCKDOWN_TYPE": "hub",
  "OCP_RELEASE_IMAGE": "quay.io/openshift-release-dev/ocp-release:4.20.4-x86_64",
  "OCP_VERSION": "4.20",
  "ACM_CHANNEL": "release-2.13",
  "MCE_CHANNEL": "stable-2.8",
  "TALM_CATALOG": "quay.io/.../talm-index:v4.20",
  "GITOPS_CATALOG": "quay.io/.../gitops-index:v1.15"
}
```

## Verification

Tested with both lockdown types:

- Hub lockdown: Successfully extracted OCP 4.20.4 pull spec and operator channels
- Spoke lockdown: Successfully extracted spoke deployment parameters

Artifacts correctly named with lockdown filename and contain expected parameters.

Related: Telco-KPIs reproducible deployment system

Signed-off-by: Carlos Cardenosa <ccardeno@redhat.com>

Add comprehensive troubleshooting guides for common Telco-KPIs deployment and testing issues.

## Added Documentation

**playbooks/telco-kpis/docs/troubleshooting/**

1. **prometheus-pod-stuck-reboot-test-blocker.md**
   - Issue: Prometheus pod stuck in Init:0/1 causing reboot tests to skip
   - Root cause: Corrupted alertmanager-main-generated ConfigMap (OCPBUGS-65953, OCPBUGS-70352)
   - Impact: CNF-gotests BeforeEach health check fails
   - Workaround: Fix ConfigMap data and restart pod
   - Prevention: Automated fix option for post-deployment playbooks

2. **k8s-exec-ipv6-fallback-issue.md**
   - Issue: kubernetes.core.k8s_exec fails with "No route to host" in dual-stack environments
   - Root cause: Python websocket-client library doesn't fall back to IPv4
   - Impact: Blocks pod exec operations (BIOS/microcode collection, hardware info)
   - Solution: Use `oc exec` via ansible.builtin.shell instead
   - Verification: Tested on spree-02 cluster (2026-04-30)

## Benefits

- Faster troubleshooting with documented solutions
- Reduces repeated investigation of known issues
- Provides context (bug IDs, verification dates) for future reference
- Includes both workarounds and permanent solutions

Related: Telco-KPIs test infrastructure reliability
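
The `oc exec` workaround from k8s-exec-ipv6-fallback-issue.md could look roughly like the task below. It is a sketch: `debug_namespace`, `debug_pod`, and `kubeconfig_path` are hypothetical variables, and the microcode lookup is only an illustrative command.

```yaml
# Sketch only: variable names and the command being run are hypothetical.
- name: Read the microcode version with oc exec instead of k8s_exec
  ansible.builtin.shell:
    cmd: >-
      oc exec --namespace {{ debug_namespace | quote }} {{ debug_pod | quote }}
      -- grep -m1 microcode /proc/cpuinfo
  environment:
    KUBECONFIG: "{{ kubeconfig_path }}"
  register: microcode_result
  changed_when: false  # read-only command
```

Because `oc` opens its own connection to the API server, this path does not go through the Python websocket-client library named as the root cause.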


openshift-ci Bot requested review from cplacani and rdiscala May 6, 2026 14:04

eifrach reviewed May 6, 2026

ccardenosa force-pushed the refactor/ipa-telco-kpis-prow-migration-gitea-deployment branch from 60e08de to 1d48ee0 Compare May 6, 2026 14:16

ccardenosa added 4 commits May 6, 2026 20:20

ccardenosa force-pushed the refactor/ipa-telco-kpis-prow-migration-gitea-deployment branch from 1d48ee0 to 9a4b561 Compare May 6, 2026 18:55

ccardenosa wants to merge 4 commits into openshift-kni:main from ccardenosa:refactor/ipa-telco-kpis-prow-migration-gitea-deployment
