MGMT-23874: fix plugin IDMS/ITMS overwrite during mirror deploy #305
danielerez wants to merge 1 commit into rh-ecosystem-edge:main
Conversation
Walkthrough

Adds a new CLI tool that merges mirror entries from rendered oc-mirror manifests into live cluster ImageDigestMirrorSet/ImageTagMirrorSet resources, and integrates that merge step into Ansible playbooks so plugin-provided mirror manifests can be merged into landing-zone outputs before application.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Ansible as Ansible Playbook
    participant MergeTask as merge_mirror_manifests.yaml
    participant MergeScript as merge_mirror_sets.py
    participant OcCLI as oc CLI
    participant Cluster as Kubernetes Cluster

    Ansible->>MergeTask: iterate merge_mirror_items / call merge task
    MergeTask->>MergeScript: run_json(input_manifest, api_kind, spec_key, output_manifest, --apply?)
    MergeScript->>OcCLI: oc create --dry-run=client -f input_manifest -o json
    OcCLI-->>MergeScript: rendered manifest JSON
    loop for matching docs
        MergeScript->>OcCLI: oc get <resource> <name> -o json
        OcCLI->>Cluster: fetch live resource
        Cluster-->>OcCLI: live resource JSON / NotFound
        OcCLI-->>MergeScript: existing spec JSON / error
        MergeScript->>MergeScript: merge_by_source(existing.spec[spec_key], rendered.spec[spec_key])
    end
    MergeScript-->>MergeTask: write merged List JSON to output_manifest
    alt --apply set
        MergeTask->>OcCLI: oc apply -f output_manifest
        OcCLI->>Cluster: apply manifests
        Cluster-->>OcCLI: apply responses
        MergeScript->>OcCLI: oc get ... verify merged state (with retries)
    end
```
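The heart of that merge step is combining mirror entries keyed by their source. Below is a minimal sketch of the idea, assuming IDMS/ITMS-style entries of the form {"source": ..., "mirrors": [...]}; the function name matches the diagram, but the real helper's exact behavior may differ.

```python
def merge_by_source(existing, incoming):
    """Merge mirror entries keyed by their 'source' field.

    Entries whose source is new are appended; for sources that already
    exist, mirrors are unioned while preserving the original order.
    """
    merged = [dict(entry) for entry in existing]
    by_source = {entry.get("source"): entry for entry in merged}

    for entry in incoming:
        source = entry.get("source")
        if source not in by_source:
            copy = dict(entry)
            merged.append(copy)
            by_source[source] = copy
            continue
        current = by_source[source]
        seen = set(current.get("mirrors", []))
        for mirror in entry.get("mirrors", []):
            if mirror not in seen:
                current.setdefault("mirrors", []).append(mirror)
                seen.add(mirror)
    return merged
```

Keying on source is what makes repeated plugin deploys additive: re-applying the same manifest changes nothing, while a new plugin only appends entries.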
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 4 passed, ❌ 1 failed (1 warning)
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@helpers/mirror/merge_mirror_sets.py`:
- Around lines 100-107: The current logic treats any non-zero return from subprocess.run([... args.oc_bin, "get", args.oc_resource, name ...]) as "not found" and falls back to appending doc to merged_docs. Change this so only a real not-found error is treated as missing: inspect current.returncode and current.stderr (or stdout), and if the oc error message indicates "NotFound" (e.g. "Error from server (NotFound)" or "not found"), append doc and continue. For any other error from subprocess.run, raise/exit (or log and abort) so RBAC, API, or transient failures are not downgraded into an unmerged apply. Keep referencing the same symbols (current, subprocess.run, args.oc_bin, args.oc_resource, name, merged_docs, doc) when implementing the check. (A sketch of this distinction follows below these inline comments.)
In `@playbooks/tasks/deploy_plugin.yaml`:
- Around lines 401-402: The earlier landing-zone apply that runs oc apply -f .../cluster-resources/ still overwrites prior plugin mirror mappings. Update that task to follow the same isolation/merge behavior as the Quay Enterprise include by honoring the quay_merge_plugin_mirrors flag: either change the apply target to the plugin-specific suffixed oc-mirror output (the per-plugin cluster-resources path), or switch the apply to merge mode when quay_merge_plugin_mirrors is true so it does not replace existing mirror mappings. Ensure the task that performs the initial oc apply uses the same conditional/merge logic as the later block that depends on quay_merge_plugin_mirrors.
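For the first comment above ("only a real not-found error is treated as missing"), here is a hedged sketch of the distinction, assuming oc reports missing objects with "NotFound"/"not found" on stderr; the helper name is illustrative, not the PR's actual code.

```python
import json
import subprocess


def get_live_object_or_none(oc_bin, resource, name):
    """Return the live object's JSON, or None only on a genuine NotFound.

    Any other failure (RBAC, API errors, transient issues) raises instead of
    being silently treated as "resource missing".
    """
    result = subprocess.run(
        [oc_bin, "get", resource, name, "-o", "json"],
        capture_output=True, text=True,
    )
    if result.returncode == 0:
        return json.loads(result.stdout)
    stderr = result.stderr or ""
    if "NotFound" in stderr or "not found" in stderr:
        return None
    raise RuntimeError(
        f"{oc_bin} get {resource}/{name} failed "
        f"(rc={result.returncode}): {stderr.strip()}"
    )
```

With something like this in place, the caller appends the rendered doc to merged_docs only when the helper returns None.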
📒 Files selected for processing (3)
- helpers/mirror/merge_mirror_sets.py
- operators/quay-operator/quay_disconnected_mirrors.yaml
- playbooks/tasks/deploy_plugin.yaml
Force-pushed from ef1d2b4 to 73e364c
Can we consider an approach where every plugin uses a custom catalog? That would remove the concern of overwriting completely; I made the same suggestion in #288 (comment). As for naming, I've created #306, which may help if we adopt the above suggestion.
Force-pushed from 73e364c to aa36049
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@helpers/mirror/merge_mirror_sets.py`:
- Around lines 16-20: The subprocess calls can hang and currently lose stderr/stdout on error. Update run_json(argv, check=False) to pass a sensible timeout (e.g., timeout=30), catch subprocess.TimeoutExpired and subprocess.CalledProcessError, and raise RuntimeError containing the command, stdout, and stderr (and timeout info) instead of returning or printing minimal text. Similarly, replace or wrap the direct subprocess.run(...) usage in the other block (the explicit subprocess invocation in the current code) with the same timeout and error-handling pattern (catch TimeoutExpired/CalledProcessError and re-raise RuntimeError with the full cmd, stdout, and stderr) so callers get full context and hung calls are bounded. (A sketch of such a wrapper follows below these inline comments.)
- Around lines 100-120: The merge currently reads the live object via subprocess.run([args.oc_bin, "get" ...]) and computes merged docs using merge_by_source, which can be lost to concurrent deploys. Wrap the read/merge/apply sequence in a retry loop that: 1) reads the current resource (the existing = json.loads(current.stdout) path), 2) computes the merged spec via merge_by_source(existing_list, new_list), 3) writes/applies the manifest (the code that creates merged_docs and calls oc apply), then 4) re-reads the live resource and compares the applied spec to the current live spec. If they differ (another actor changed it), recompute the merge against the new live spec and retry the apply, with a small backoff and a bounded number of attempts, surfacing a clear error if retries are exhausted. Ensure the same symbols (merge_by_source, merged_docs, args.oc_bin/args.oc_resource) are reused so the loop updates the same doc entries and avoids the race.
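For the first comment above (bounded subprocess calls with full error context), a possible shape of the wrapper is sketched below; the 30-second default and the function body are assumptions, not the merged code.

```python
import subprocess


def run_command(argv, check=False, timeout=30):
    """Run a command with a bounded timeout, surfacing full context on failure."""
    try:
        return subprocess.run(
            argv, capture_output=True, text=True, check=check, timeout=timeout
        )
    except subprocess.TimeoutExpired as exc:
        raise RuntimeError(
            f"command timed out after {timeout}s: {' '.join(argv)}\n"
            f"stdout: {exc.stdout}\nstderr: {exc.stderr}"
        ) from exc
    except subprocess.CalledProcessError as exc:
        raise RuntimeError(
            f"command failed (rc={exc.returncode}): {' '.join(argv)}\n"
            f"stdout: {exc.stdout}\nstderr: {exc.stderr}"
        ) from exc
```

Routing both run_json and the direct oc invocations through one such wrapper keeps the timeout and error-reporting behavior consistent.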
📒 Files selected for processing (5)
- helpers/mirror/merge_mirror_sets.py
- operators/quay-operator/quay_disconnected_mirrors.yaml
- playbooks/tasks/deploy_plugin.yaml
- playbooks/tasks/merge_mirror_manifests.yaml
- playbooks/tasks/mirror_plugin.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
- playbooks/tasks/deploy_plugin.yaml
Force-pushed from 4d6be85 to 79d94c0
🧹 Nitpick comments (2)
helpers/mirror/merge_mirror_sets.py (2)
201-223: Minor: Variable name `retries_exhausted` is misleading. The variable at line 212 holds the list of resource names that failed to match, not a boolean indicating exhaustion. Consider renaming for clarity.

♻️ Suggested rename

```diff
-    retries_exhausted = []
+    mismatched_resources = []
     for attempt in range(1, args.max_attempts + 1):
         merged_docs, expected_specs = build_merged_docs(args, docs)
         payload = {"apiVersion": "v1", "kind": "List", "items": merged_docs}
         with open(args.output_manifest, "w", encoding="utf-8") as f:
             json.dump(payload, f, indent=2)
         if not args.apply:
             return
         apply_payload(args)
-        retries_exhausted = live_matches_expected(args, expected_specs)
-        if not retries_exhausted:
+        mismatched_resources = live_matches_expected(args, expected_specs)
+        if not mismatched_resources:
             return
         if attempt < args.max_attempts:
             time.sleep(args.retry_delay * attempt)
     raise RuntimeError(
         "concurrent update detected while applying merged mirrors for "
-        f"{args.oc_resource}: {', '.join(retries_exhausted)} "
+        f"{args.oc_resource}: {', '.join(mismatched_resources)} "
         f"(exhausted {args.max_attempts} attempts)"
     )
```
66-70: Consider handling `JSONDecodeError` in `run_json`. If `oc` returns malformed JSON (e.g., partial output on timeout or unexpected error messages), `json.loads()` will raise `JSONDecodeError` with a potentially cryptic message. Wrapping this could provide better context.

♻️ Optional improvement

```diff
 def run_json(argv, check=False):
     result = run_command(argv, check=check)
     if not result.stdout:
         return {}
-    return json.loads(result.stdout)
+    try:
+        return json.loads(result.stdout)
+    except json.JSONDecodeError as exc:
+        raise RuntimeError(
+            f"failed to parse JSON from command {_format_cmd(argv)!r}: {exc}"
+        ) from exc
```
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@helpers/mirror/merge_mirror_sets.py`:
- Around lines 201-223: The name retries_exhausted is misleading because it holds the list of resource names that failed to match, not a boolean. Rename this variable (e.g., failed_resources or unmatched_resources) everywhere it's used in merge_mirror_sets.py (including where it's assigned from live_matches_expected(...) and used in the final RuntimeError message and loop logic), and ensure the error text still references the renamed variable. Keep all behavior the same (assignment from live_matches_expected, truthiness checks, and join in the RuntimeError), and update any nearby comments or log messages to use the new, clearer identifier.
- Around lines 66-70: The run_json function currently calls json.loads(result.stdout) without handling malformed output. Wrap the json.loads call in a try/except that catches json.JSONDecodeError; when caught, include the raw result.stdout and the original exception in a new, clearer error (or log it) before re-raising or returning a controlled value. Update references in run_json and keep the usage of run_command/result to preserve behavior while surfacing the stdout content for debugging.
📒 Files selected for processing (5)
- helpers/mirror/merge_mirror_sets.py
- operators/quay-operator/quay_disconnected_mirrors.yaml
- playbooks/tasks/deploy_plugin.yaml
- playbooks/tasks/merge_mirror_manifests.yaml
- playbooks/tasks/mirror_plugin.yaml
✅ Files skipped from review due to trivial changes (2)
- playbooks/tasks/deploy_plugin.yaml
- operators/quay-operator/quay_disconnected_mirrors.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
- playbooks/tasks/merge_mirror_manifests.yaml
Force-pushed from 79d94c0 to 11bb8d9
deploy_plugin.yaml previously overwrote internal IDMS/ITMS during each plugin deployment.
Because oc-mirror generates IDMS/ITMS with the same resource name for every plugin,
the latest plugin run replaced mirror mappings from earlier runs.
This change keeps plugin mirror entries merged so mappings accumulate across plugins.
Specifically, this change isolates plugin mirror resources under a dedicated "-plugins" suffix
and introduces a merge helper that combines mirror entries by source before apply.
The result is an additive behavior: new plugin mirrors are appended without clobbering
existing IDMS/ITMS mappings, while non-plugin flows keep using the original
internal resource path unchanged.
Example resulting resources:
$ oc get idms
NAME
idms-operator-0
idms-operator-0-internal
idms-operator-0-plugins
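For illustration only (registry names invented, and a simplified merge rather than the helper's actual code), the difference between a plain re-apply and the merged behavior boils down to keying entries by source:

```python
# Two plugins whose oc-mirror output targets the same IDMS resource name.
plugin_a = [{"source": "registry.example.com/plugin-a",
             "mirrors": ["mirror.example.com/plugin-a"]}]
plugin_b = [{"source": "registry.example.com/plugin-b",
             "mirrors": ["mirror.example.com/plugin-b"]}]

# A plain "oc apply" of plugin_b's manifest would replace plugin_a's entry.
# Merging by source keeps both:
merged = {entry["source"]: entry for entry in plugin_a}
for entry in plugin_b:
    merged.setdefault(entry["source"], entry)

print(list(merged.values()))
# -> entries for both plugin-a and plugin-b survive the second deploy
```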