
MGMT-23874: fix plugin IDMS/ITMS overwrite during mirror deploy#305

Closed
danielerez wants to merge 1 commit into rh-ecosystem-edge:main from danielerez:mgmt-23874-plugin-mirror-merge

Conversation

@danielerez (Contributor) commented Apr 28, 2026

deploy_plugin.yaml previously overwrote internal IDMS/ITMS during each plugin deployment.
Because oc-mirror generates IDMS/ITMS with the same resource name for every plugin,
the latest plugin run replaced mirror mappings from earlier runs.
This change keeps plugin mirror entries merged so mappings accumulate across plugins.

Specifically, the change isolates plugin mirror resources under a dedicated "-plugins" suffix
and introduces a merge helper that combines mirror entries by source before apply.
The result is additive behavior: new plugin mirrors are appended without clobbering
existing IDMS/ITMS mappings, while non-plugin flows keep using the original
internal resource path unchanged.

Example resulting resources:
$ oc get idms
NAME
idms-operator-0
idms-operator-0-internal
idms-operator-0-plugins

Summary by CodeRabbit

  • New Features

    • CLI to merge oc-mirror-generated mirror entries into cluster mirror resources, preserving existing sources and appending missing entries.
    • Optionally merge plugin-provided mirror sets into consolidated manifests used during deployment.
    • Configurable destination naming/suffixing to avoid mirror name conflicts and select merged vs non-merged manifests.
  • Chores

    • Added deployment tasks to conditionally run merges and apply merged manifests with retry handling.

@github-actions Bot added the operators (Operator installation/config) and plugins labels on Apr 28, 2026
@coderabbitai Bot commented Apr 28, 2026

Warning

Rate limit exceeded

@danielerez has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 49 minutes and 29 seconds before requesting another review.

To keep reviews running without waiting, you can enable the usage-based add-on for your organization, which allows additional reviews beyond the hourly cap. Account admins can enable it under billing.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 21a20b28-7afd-417c-9242-541091537aa1

📥 Commits

Reviewing files that changed from the base of the PR and between 79d94c0 and 11bb8d9.

📒 Files selected for processing (5)
  • helpers/mirror/merge_mirror_sets.py
  • operators/quay-operator/quay_disconnected_mirrors.yaml
  • playbooks/tasks/deploy_plugin.yaml
  • playbooks/tasks/merge_mirror_manifests.yaml
  • playbooks/tasks/mirror_plugin.yaml

Walkthrough

Adds a new CLI tool that merges mirror entries from rendered oc-mirror manifests into live cluster ImageDigestMirrorSet/ImageTagMirrorSet resources, and integrates that merge step into Ansible playbooks so plugin-provided mirror manifests can be merged into landing-zone outputs before application.

Changes

  • Mirror merge CLI (helpers/mirror/merge_mirror_sets.py)
    New Python CLI that renders input manifests via oc create --dry-run=client -o json, fetches live resources with oc get <resource> <name> -o json, merges mirror entries by source (merge_by_source), writes merged List JSON to output, and supports --apply with retries. Adds run_json(), merge_by_source(), parse_args(), and main() entrypoints.
  • Ansible: merge task (playbooks/tasks/merge_mirror_manifests.yaml)
    New task that loops merge_mirror_items and invokes the CLI per item, passing the oc path, input/output manifests, API kind/resource, spec key, KUBECONFIG, and retry controls; runs only when merge_mirror_enabled and item.source_exists.
  • Ansible: plugin workflow (playbooks/tasks/mirror_plugin.yaml, playbooks/tasks/deploy_plugin.yaml)
    mirror_plugin.yaml refactored to compute landing-zone paths, conditionally call the merge task when landing-zone IDMS/ITMS manifests exist, then apply merged manifests. deploy_plugin.yaml now supplies quay_merge_plugin_mirrors: true to the plugin include.
  • Quay operator manifest generation (operators/quay-operator/quay_disconnected_mirrors.yaml)
    Generation now computes internal/plugin suffixes and destination filenames from quay_merge_plugin_mirrors, copies oc-mirror workspace files into computed destinations, renames IDMS/ITMS resources to avoid conflicts, and optionally emits merged *-merged.json artifacts for application.
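The merge-by-source behavior named above can be illustrated with a small sketch. The function name merge_by_source comes from the walkthrough, but the body below is a hypothetical reconstruction of the additive semantics (keep existing entries, append missing mirrors and sources), not the PR's actual code:

```python
def merge_by_source(existing, incoming):
    """Additively merge IDMS/ITMS-style mirror entries keyed by 'source'.

    Existing entries keep their position and gain any mirrors they were
    missing; sources not yet present are appended at the end.
    Hypothetical sketch -- not the actual merge_mirror_sets.py code.
    """
    merged = [{**e, "mirrors": list(e.get("mirrors", []))} for e in existing]
    by_source = {e["source"]: e for e in merged}
    for entry in incoming:
        current = by_source.get(entry["source"])
        if current is None:
            appended = {**entry, "mirrors": list(entry.get("mirrors", []))}
            merged.append(appended)
            by_source[appended["source"]] = appended
            continue
        for mirror in entry.get("mirrors", []):
            if mirror not in current["mirrors"]:
                current["mirrors"].append(mirror)
    return merged
```

With this shape, re-running the merge for each plugin accumulates mappings instead of replacing them, which is exactly the overwrite this PR fixes.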

Sequence Diagram(s)

sequenceDiagram
    participant Ansible as Ansible Playbook
    participant MergeTask as merge_mirror_manifests.yaml
    participant MergeScript as merge_mirror_sets.py
    participant OcCLI as oc CLI
    participant Cluster as Kubernetes Cluster

    Ansible->>MergeTask: iterate merge_mirror_items / call merge task
    MergeTask->>MergeScript: run_json(input_manifest, api_kind, spec_key, output_manifest, --apply?)
    MergeScript->>OcCLI: oc create --dry-run=client -f input_manifest -o json
    OcCLI-->>MergeScript: rendered manifest JSON
    loop for matching docs
        MergeScript->>OcCLI: oc get <resource> <name> -o json
        OcCLI->>Cluster: fetch live resource
        Cluster-->>OcCLI: live resource JSON / NotFound
        OcCLI-->>MergeScript: existing spec JSON / error
        MergeScript->>MergeScript: merge_by_source(existing.spec[spec_key], rendered.spec[spec_key])
    end
    MergeScript-->>MergeTask: write merged List JSON to output_manifest
    alt --apply set
        MergeTask->>OcCLI: oc apply -f output_manifest
        OcCLI->>Cluster: apply manifests
        Cluster-->>OcCLI: apply responses
        MergeScript->>OcCLI: oc get ... verify merged state (with retries)
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled.
Title check | ✅ Passed | The title accurately identifies the main fix: preventing plugin IDMS/ITMS resource overwrites by adding merging logic and resource name suffixes to isolate plugin mirrors.
Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@helpers/mirror/merge_mirror_sets.py`:
- Around lines 100-107: The current logic treats any non-zero return from
subprocess.run([... args.oc_bin, "get", args.oc_resource, name ...]) as "not
found" and falls back to appending doc to merged_docs. Change this so only a
real not-found error is treated as missing: inspect current.returncode and
current.stderr (or stdout), and if the oc error message indicates "NotFound"
(e.g. "Error from server (NotFound)" or "not found"), append doc and continue;
for any other error, raise/exit (or log and abort) so RBAC, API, and transient
failures are not downgraded into an unmerged apply. Keep referencing the same
symbols (current, subprocess.run, args.oc_bin, args.oc_resource, name,
merged_docs, doc) when implementing the check.

In `@playbooks/tasks/deploy_plugin.yaml`:
- Around lines 401-402: The earlier landing-zone apply that runs oc apply -f
.../cluster-resources/ still overwrites prior plugin mirror mappings. Update
that task to follow the same isolation/merge behavior as the Quay Enterprise
include by honoring the quay_merge_plugin_mirrors flag: either change the apply
target to the plugin-specific suffixed oc-mirror output (the per-plugin
cluster-resources path), or switch the apply to merge mode when
quay_merge_plugin_mirrors is true so it won't replace existing mirror mappings.
Ensure the task that performs the initial oc apply uses the same
conditional/merge logic as the later block that depends on
quay_merge_plugin_mirrors.
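The NotFound distinction requested in the first comment could be sketched like this. The helper name fetch_live_resource is invented for illustration; only the symbols the comment names (args.oc_bin, args.oc_resource, name) come from the review:

```python
import subprocess

def fetch_live_resource(oc_bin, oc_resource, name):
    """Fetch a live resource, separating 'not found' from real failures.

    Returns the completed process on success, None when the resource is
    genuinely absent (so the caller can append the rendered doc), and
    raises for any other failure so RBAC/API/transient errors are not
    silently downgraded into an unmerged apply. Illustrative sketch only.
    """
    current = subprocess.run(
        [oc_bin, "get", oc_resource, name, "-o", "json"],
        capture_output=True, text=True,
    )
    if current.returncode == 0:
        return current
    stderr = current.stderr or ""
    if "NotFound" in stderr or "not found" in stderr:
        return None
    raise RuntimeError(
        f"oc get {oc_resource}/{name} failed "
        f"(rc={current.returncode}): {stderr.strip()}"
    )
```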
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 999482e7-ab53-4283-8b0b-5d88485e9d59

📥 Commits

Reviewing files that changed from the base of the PR and between 45ae157 and ef1d2b4.

📒 Files selected for processing (3)
  • helpers/mirror/merge_mirror_sets.py
  • operators/quay-operator/quay_disconnected_mirrors.yaml
  • playbooks/tasks/deploy_plugin.yaml

Comment thread helpers/mirror/merge_mirror_sets.py Outdated
Comment thread playbooks/tasks/deploy_plugin.yaml
@danielerez force-pushed the mgmt-23874-plugin-mirror-merge branch from ef1d2b4 to 73e364c on April 28, 2026 at 08:31
@maorfr (Collaborator) commented Apr 28, 2026

Can we consider an approach where every plugin uses a custom catalog? That would remove the concern of overwriting completely. I made the same suggestion in #288 (comment).

As for naming, I've created #306, which may help if we adopt the above suggestion.

@danielerez force-pushed the mgmt-23874-plugin-mirror-merge branch from 73e364c to aa36049 on April 28, 2026 at 10:25
@coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@helpers/mirror/merge_mirror_sets.py`:
- Around lines 16-20: The subprocess calls can hang and currently lose
stderr/stdout on error. Update run_json(argv, check=False) to pass a sensible
timeout (e.g. timeout=30), catch subprocess.TimeoutExpired and
subprocess.CalledProcessError, and raise RuntimeError containing the command,
stdout, and stderr (plus timeout info) instead of returning or printing minimal
text. Similarly, replace or wrap the direct subprocess.run(...) usage in the
other block with the same timeout and error-handling pattern (catch
TimeoutExpired/CalledProcessError and re-raise RuntimeError with the full cmd,
stdout, and stderr) so callers get full context and hung calls are bounded.
- Around lines 100-120: The merge currently reads the live object via
subprocess.run([args.oc_bin, "get" ...]) and computes merged docs using
merge_by_source, which can be lost to concurrent deploys. Wrap the
read/merge/apply sequence in a retry loop that: 1) reads the current resource
(the existing = json.loads(current.stdout) path), 2) computes the merged spec
via merge_by_source(existing_list, new_list), 3) writes/applies the manifest
(the code that creates merged_docs and calls oc apply), then 4) re-reads the
live resource and compares the applied spec to the current live spec. If they
differ (another actor changed it), recompute the merge against the new live
spec and retry the apply, with a small backoff and a bounded maximum number of
attempts, surfacing a clear error if retries are exhausted. Reuse the same
symbols (merge_by_source, merged_docs, args.oc_bin/args.oc_resource) so the
loop updates the same doc entries and avoids the race.
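The timeout/error-context pattern the first comment asks for might look roughly like this (a sketch of run_json under the review's suggested behavior, not the PR's final code):

```python
import json
import subprocess

def run_json(argv, check=False, timeout=30):
    """Run a command and parse its stdout as JSON, with a bounded timeout.

    On timeout or command failure, raises RuntimeError carrying the full
    command, stdout, and stderr so callers get complete context instead
    of a hung call or a minimal message. Sketch of the suggested pattern.
    """
    try:
        result = subprocess.run(
            argv, capture_output=True, text=True, check=check, timeout=timeout
        )
    except subprocess.TimeoutExpired as exc:
        raise RuntimeError(
            f"command timed out after {timeout}s: {argv!r}; "
            f"stdout={exc.stdout!r} stderr={exc.stderr!r}"
        ) from exc
    except subprocess.CalledProcessError as exc:
        raise RuntimeError(
            f"command failed (rc={exc.returncode}): {argv!r}; "
            f"stdout={exc.stdout!r} stderr={exc.stderr!r}"
        ) from exc
    if not result.stdout:
        return {}
    return json.loads(result.stdout)
```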
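And the read/merge/apply/verify retry loop described in the second comment has roughly this shape. All the callables are injected placeholders here (the real code drives oc directly); only the retry structure is the point:

```python
import time

def merge_with_retries(read_live, merge_by_source, apply_merged, new_entries,
                       max_attempts=5, backoff=1.0):
    """Optimistic-concurrency merge loop: read, merge, apply, verify.

    If the post-apply re-read differs from what we applied (another actor
    updated the resource concurrently), recompute the merge against the
    fresh live spec and retry with linear backoff, up to max_attempts.
    Shape sketch only -- callables stand in for the oc interactions.
    """
    for attempt in range(1, max_attempts + 1):
        live = read_live()                           # 1) read current resource
        merged = merge_by_source(live, new_entries)  # 2) compute merged spec
        apply_merged(merged)                         # 3) write/apply manifest
        if read_live() == merged:                    # 4) verify nothing raced us
            return merged
        if attempt < max_attempts:
            time.sleep(backoff * attempt)
    raise RuntimeError(
        f"concurrent update detected; exhausted {max_attempts} attempts"
    )
```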

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 332b4c36-a8c9-46e9-b2b9-1013b9ef2ec1

📥 Commits

Reviewing files that changed from the base of the PR and between ef1d2b4 and aa36049.

📒 Files selected for processing (5)
  • helpers/mirror/merge_mirror_sets.py
  • operators/quay-operator/quay_disconnected_mirrors.yaml
  • playbooks/tasks/deploy_plugin.yaml
  • playbooks/tasks/merge_mirror_manifests.yaml
  • playbooks/tasks/mirror_plugin.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
  • playbooks/tasks/deploy_plugin.yaml

Comment thread helpers/mirror/merge_mirror_sets.py Outdated
Comment thread helpers/mirror/merge_mirror_sets.py Outdated
@danielerez force-pushed the mgmt-23874-plugin-mirror-merge branch 2 times, most recently from 4d6be85 to 79d94c0 on April 28, 2026 at 13:45
@coderabbitai Bot left a comment

🧹 Nitpick comments (2)
helpers/mirror/merge_mirror_sets.py (2)

201-223: Minor: Variable name retries_exhausted is misleading.

The variable at line 212 holds the list of resource names that failed to match, not a boolean indicating exhaustion. Consider renaming for clarity.

♻️ Suggested rename
-    retries_exhausted = []
+    mismatched_resources = []
     for attempt in range(1, args.max_attempts + 1):
         merged_docs, expected_specs = build_merged_docs(args, docs)
         payload = {"apiVersion": "v1", "kind": "List", "items": merged_docs}
         with open(args.output_manifest, "w", encoding="utf-8") as f:
             json.dump(payload, f, indent=2)

         if not args.apply:
             return

         apply_payload(args)
-        retries_exhausted = live_matches_expected(args, expected_specs)
-        if not retries_exhausted:
+        mismatched_resources = live_matches_expected(args, expected_specs)
+        if not mismatched_resources:
             return

         if attempt < args.max_attempts:
             time.sleep(args.retry_delay * attempt)

     raise RuntimeError(
         "concurrent update detected while applying merged mirrors for "
-        f"{args.oc_resource}: {', '.join(retries_exhausted)} "
+        f"{args.oc_resource}: {', '.join(mismatched_resources)} "
         f"(exhausted {args.max_attempts} attempts)"
     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@helpers/mirror/merge_mirror_sets.py` around lines 201 - 223, The name
retries_exhausted is misleading because it holds the list of resource names that
failed to match, not a boolean; rename this variable (e.g., failed_resources or
unmatched_resources) everywhere it's used in merge_mirror_sets.py (including
where it's assigned from live_matches_expected(...) and used in the final
RuntimeError message and loop logic) and ensure the error text still references
the renamed variable; keep all behavior the same (assignment from
live_matches_expected, truthiness checks, and join in the RuntimeError) and
update any nearby comments or log messages to use the new, clearer identifier.

66-70: Consider handling JSONDecodeError in run_json.

If oc returns malformed JSON (e.g., partial output on timeout or unexpected error messages), json.loads() will raise JSONDecodeError with a potentially cryptic message. Wrapping this could provide better context.

♻️ Optional improvement
 def run_json(argv, check=False):
     result = run_command(argv, check=check)
     if not result.stdout:
         return {}
-    return json.loads(result.stdout)
+    try:
+        return json.loads(result.stdout)
+    except json.JSONDecodeError as exc:
+        raise RuntimeError(
+            f"failed to parse JSON from command {_format_cmd(argv)!r}: {exc}"
+        ) from exc
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@helpers/mirror/merge_mirror_sets.py` around lines 66 - 70, The run_json
function currently calls json.loads(result.stdout) without handling malformed
output; wrap the json.loads call in a try/except that catches
json.JSONDecodeError, and when caught include the raw result.stdout and the
original exception in a new, clearer error (or log it) before re-raising or
returning a controlled value; update references in run_json and keep usage of
run_command/result to preserve behavior while surfacing the stdout content for
debugging.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 04018df2-c49b-420f-820e-e203b9a25ef9

📥 Commits

Reviewing files that changed from the base of the PR and between aa36049 and 79d94c0.

📒 Files selected for processing (5)
  • helpers/mirror/merge_mirror_sets.py
  • operators/quay-operator/quay_disconnected_mirrors.yaml
  • playbooks/tasks/deploy_plugin.yaml
  • playbooks/tasks/merge_mirror_manifests.yaml
  • playbooks/tasks/mirror_plugin.yaml
✅ Files skipped from review due to trivial changes (2)
  • playbooks/tasks/deploy_plugin.yaml
  • operators/quay-operator/quay_disconnected_mirrors.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
  • playbooks/tasks/merge_mirror_manifests.yaml

deploy_plugin.yaml previously overwrote internal IDMS/ITMS during
each plugin deployment. Because oc-mirror generates IDMS/ITMS with
the same resource name for every plugin, the latest plugin run
replaced mirror mappings from earlier runs.
This change keeps plugin mirror entries merged so mappings
accumulate across plugins.

Specifically, the change isolates plugin mirror resources under a
dedicated "-plugins" suffix and introduces a merge helper that
combines mirror entries by source before apply.
The result is additive behavior: new plugin mirrors are appended
without clobbering existing IDMS/ITMS mappings, while non-plugin
flows keep using the original internal resource path unchanged.

Example resulting resources:
$ oc get idms
NAME
idms-operator-0
idms-operator-0-internal
idms-operator-0-plugins

Labels

operators (Operator installation/config), plugins


2 participants