fix(KONFLUX-13012): clean up orphaned IRs before creating new ones #734
swickersh wants to merge 1 commit into
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #734      +/-   ##
==========================================
+ Coverage   95.37%   95.65%   +0.27%
==========================================
  Files          63       72       +9
  Lines        6318     7111     +793
==========================================
+ Hits         6026     6802     +776
- Misses        292      309      +17
|
|
/ok-to-test |
659455f to 0fc6ae5
|
/retest |
Review Summary by Qodo
Clean up orphaned InternalRequests before creating new ones
Walkthroughs
Description
• Clean up orphaned InternalRequests before creating new ones
• Prevents duplicate Pulp pushes and CGW errors on task retry
• Deletes existing IRs with matching PipelineRun UID label
• Includes comprehensive test coverage for cleanup logic
Diagram
flowchart LR
A["Task Retry Triggered"] --> B["Extract PipelineRun UID Label"]
B --> C["Query Existing InternalRequests"]
C --> D{"IRs Found?"}
D -->|Yes| E["Delete Orphaned IRs"]
E --> F["Wait 5 Seconds"]
F --> G["Create New InternalRequest"]
D -->|No| G
G --> H["PipelineRun Proceeds"]
File Changes
1. utils/internal-request
|
Code Review by Qodo
Context used
1. Cleanup ignores delete failures
|
47504be to 7f6eb7e
johnbieren
left a comment
What about two managed tasks that both create internalRequests running in parallel for the same plr? Shouldn't we also use the taskRunID label?
@johnbieren really good catch! Instead, I'm trying to have the internal-request script automatically stamp an internal-services.appstudio.openshift.io/pipeline-name label (from the --pipeline argument it already receives) onto every IR it creates. The cleanup selector then becomes pipelinerun-uid=X,pipeline-name=push-disk-images. Since each parallel task in a release pipeline calls a different internal pipeline, this scopes the cleanup to only the IRs from this specific task, and no changes are needed in the calling tasks. I'll run some tests to make sure this works. |
love it, good idea |
@johnbieren The orphan was deleted and a fresh IR created as expected. |
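For readers following the scoping discussion, here is a minimal sketch of how the two-label selector keeps parallel tasks from deleting each other's IRs. It is an illustration, not the script's actual code: the full key for the pipelinerun-uid label is an assumption (the thread only abbreviates it), and the helper names, IR names, and the second pipeline name are hypothetical.

```python
# Label key quoted in the comment above.
PIPELINE_NAME_LABEL = "internal-services.appstudio.openshift.io/pipeline-name"
# Assumed full key: the thread only abbreviates this label as "pipelinerun-uid".
PLR_UID_LABEL = "internal-services.appstudio.openshift.io/pipelinerun-uid"


def cleanup_labels(plr_uid: str, pipeline: str) -> dict[str, str]:
    """Labels that identify IRs created by one task of one PipelineRun."""
    return {PLR_UID_LABEL: plr_uid, PIPELINE_NAME_LABEL: pipeline}


def to_selector(labels: dict[str, str]) -> str:
    """Render labels as an equality-based Kubernetes label selector."""
    return ",".join(f"{key}={value}" for key, value in sorted(labels.items()))


def select_orphans(irs: list[dict], wanted: dict[str, str]) -> list[str]:
    """Return names of IRs whose labels match every wanted key/value pair."""
    return [
        ir["name"]
        for ir in irs
        if all(ir["labels"].get(k) == v for k, v in wanted.items())
    ]


# Two parallel tasks of the same PipelineRun, each calling a different
# internal pipeline (IR names and the second pipeline name are made up).
existing = [
    {"name": "ir-disk", "labels": cleanup_labels("uid-1", "push-disk-images")},
    {"name": "ir-other", "labels": cleanup_labels("uid-1", "some-other-pipeline")},
]
wanted = cleanup_labels("uid-1", "push-disk-images")
orphans = select_orphans(existing, wanted)
print(to_selector(wanted))
print(orphans)
```

Selecting on the PipelineRun UID alone would match both IRs here, which is the parallel-task problem raised in the review; adding the pipeline name narrows the match to the retried task's own IR.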
|
@swickersh the Qodo AI review flagged potential bugs or security concerns in this PR.
## Qodo Response
**<Finding title>** — <Your explanation (false positive / fixed in commit / accepted risk)>
**<Another finding>** — <Your response>
|
Production Approval Record
|
Qodo Response
Cleanup ignores delete failures - Fixed in current code. Removed
Fragile IR list parsing - Fixed in current code. Replaced
cc @ach912 |
|
@johnbieren since we reverted konflux-ci/internal-services#709 should we wait to merge this? |
|
@swickersh gitlint is saying you are at 73 chars instead of 72 (idk why it passes on the PR but not in merge queue) |
🤷‍♂️ you tell me. If you think it will break things, can you move it to draft? If you think it is a good change, we can merge it. It will need to be followed by a catalog update which then has to go through the promotion process |
This is because the merge queue commit appends 7 chars (a space plus "(#734)") to the original commit title. We'd need logic in the gitlint command to limit the original commit title to 65 chars.
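The arithmetic can be checked with a short sketch. It assumes the merge queue builds the title as the original title, a space, and the PR number in parentheses, as described above; the function names are hypothetical and this is not gitlint's own logic.

```python
MAX_TITLE_LENGTH = 72  # title limit gitlint enforces in this thread


def merge_queue_title(title: str, pr_number: int) -> str:
    """Title as it would appear on the merge queue commit (assumed format)."""
    return f"{title} (#{pr_number})"


def max_original_length(pr_number: int, limit: int = MAX_TITLE_LENGTH) -> int:
    """Longest original title that still fits once the suffix is appended."""
    return limit - len(f" (#{pr_number})")


title = "fix(KONFLUX-13012): clean up orphaned IRs before creating new ones"
queued = merge_queue_title(title, 734)

print(len(title))                # length checked on the PR
print(len(queued))               # length checked in the merge queue
print(max_original_length(734))  # budget for the original title
```

The suffix grows with the number of digits in the PR number, so a fixed 65-character limit only holds while PR numbers stay at three digits.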
yea, sorry. That's on me to check closer. I checked the logic and walked through the scenario with Cursor but it steered me wrong on the 709 change that broke stage so I'm overly cautious now. This is what cursor is telling me which seems to check out. It seems fine to merge prior to the finalizer change landing in internal-services |
good catch about the gitlint thing. how annoying... |
|
/retest |
When the managed task is retried (e.g. due to timeout), the previous InternalRequest and its PipelineRun on the internal cluster continue running. The retry creates a new InternalRequest, leading to concurrent operations that cause duplicate Pulp pushes and CGW "Record already present" errors.
Before creating a new InternalRequest, the script now deletes any existing InternalRequests with the same PipelineRun UID label. Combined with the finalizer change in internal-services (PR 709), this ensures the associated PipelineRun is cancelled before the new one starts.
Assisted-by: Cursor AI
Signed-off-by: Scott Wickersham <swickers@redhat.com>
ec666a3
|
Merge CI failed due to the Python docstring requirements that were added. |
When the managed task is retried (e.g. due to timeout), the previous InternalRequest and its PipelineRun on the internal cluster continue running. The retry creates a new InternalRequest, leading to concurrent operations that cause duplicate Pulp pushes and CGW "Record already present" errors.
Before creating a new InternalRequest, the script now deletes any existing InternalRequests with the same PipelineRun UID label. Combined with the finalizer change in internal-services (PR 709), this ensures the associated PipelineRun is cancelled before the new one starts.
Assisted-by: Cursor AI
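The delete-then-create flow described above can be sketched roughly as follows. This is not the script's implementation: the cluster operations are passed in as plain callables (all names here are hypothetical) so the control flow can be read and tested without a cluster, and the 5-second pause is taken from the Qodo flowchart. Delete errors are left to propagate, in line with the "Cleanup ignores delete failures" finding being reported as fixed.

```python
import time
from typing import Callable, Iterable


def recreate_internal_request(
    list_irs: Callable[[str], Iterable[str]],
    delete_ir: Callable[[str], None],
    create_ir: Callable[[], str],
    selector: str,
    settle_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Delete IRs matching the selector, pause if any existed, then create a new IR."""
    orphans = list(list_irs(selector))
    for name in orphans:
        # No try/except on purpose: a failed delete aborts before create_ir runs.
        delete_ir(name)
    if orphans:
        # Give the cluster a moment to process the deletions.
        sleep(settle_seconds)
    return create_ir()


# Demo with in-memory fakes standing in for the cluster.
SELECTOR = "pipelinerun-uid=X,pipeline-name=push-disk-images"
store = {"ir-old": SELECTOR}  # IR name -> selector it would match
calls = []


def fake_list(selector: str) -> list[str]:
    return [name for name, sel in store.items() if sel == selector]


def fake_delete(name: str) -> None:
    calls.append(("delete", name))
    del store[name]


def fake_create() -> str:
    calls.append(("create", "ir-new"))
    store["ir-new"] = SELECTOR
    return "ir-new"


created = recreate_internal_request(
    fake_list,
    fake_delete,
    fake_create,
    SELECTOR,
    sleep=lambda seconds: calls.append(("sleep", seconds)),
)
print(created, calls)
```

Injecting `sleep` keeps the demo instant; in a real run the default `time.sleep` would apply.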