feat: refactor cleanup cronjob to use pagination and reduce memory usage by davidmogar · Pull Request #411 · redhat-appstudio/infra-common-deployments

davidmogar · 2026-05-29T11:24:46Z

Replace AWK-based processing with kubectl pagination API to handle large resource sets more efficiently. Process resources in batches of 100 using continuation tokens, and stream through jq to minimize memory footprint. Increase memory limit from 256Mi to 512Mi to accommodate batch processing. Add deletion and error counters for better observability.

Generated-By: Claude Sonnet 4.5 noreply@anthropic.com

qodo-for-redhat-appstudio · 2026-05-29T11:25:08Z

Review Summary by Qodo

(Agentic_describe updated until commit `f49ddc7`)

Refactor cleanup cronjob with pagination and memory optimization

✨ Enhancement

Walkthroughs

Description

• Replace AWK-based processing with kubectl pagination API for efficient batch handling
• Process resources in batches of 100 using --chunk-size parameter
• Stream JSON output through jq to minimize memory footprint
• Increase memory limit from 256Mi to 512Mi for batch processing
• Add deletion and error counters for improved observability

Diagram

flowchart LR
  A["kubectl get with --chunk-size=100"] --> B["JSON output to temp file"]
  B --> C["jq stream processing"]
  C --> D["Parse name:namespace:completionTime"]
  D --> E["Date conversion to epoch"]
  E --> F["kubectl delete with counters"]
  F --> G["Summary with DELETED_COUNT and ERROR_COUNT"]

File Changes

1. components/internal-services/internal-staging/cronjob/cleanup-internal-requests-pipelineruns.yaml ✨ Enhancement +35/-26

Pagination-based cleanup with memory optimization

• Replace AWK-based date parsing with kubectl --chunk-size=100 pagination API
• Switch from template output to JSON format processed by jq for memory efficiency
• Implement streaming while loop with epoch date conversion for resource filtering
• Add DELETED_COUNT and ERROR_COUNT tracking with informational logging
• Increase memory limits from 256Mi to 512Mi for both requests and limits
• Improve error handling with stderr capture and cleanup of temporary files

components/internal-services/internal-staging/cronjob/cleanup-internal-requests-pipelineruns.yaml

qodo-for-redhat-appstudio · 2026-05-29T11:25:09Z

Code Review by Qodo

🐞 Bugs (6) 📘 Rule violations (0)

1. Tempfiles on read-only FS 🐞 Bug ☼ Reliability ⭐ New

Description

The script now uses mktemp without targeting the writable /var/tmp mount while
readOnlyRootFilesystem: true, so temp file creation (and the kubectl get redirect) can fail and
break the cleanup job. This is a regression vs the internal-production manifest which explicitly
places temp files under /var/tmp.

Code

components/internal-services/internal-staging/cronjob/cleanup-internal-requests-pipelineruns.yaml[R45-46]

Relevance

⭐⭐⭐ High
Cleanup cronjob historically used mktemp -p /var/tmp with readOnlyRootFilesystem, implying
/var/tmp is required.
PR-#242
PR-#326

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
In internal-staging, temp files are created with plain mktemp, but the pod mounts only /var/tmp
as writable while the root filesystem is read-only. The internal-production manifest in the same
repo explicitly uses mktemp -p /var/tmp, indicating /var/tmp is the intended scratch location.
components/internal-services/internal-staging/cronjob/cleanup-internal-requests-pipelineruns.yaml[44-56]
components/internal-services/internal-staging/cronjob/cleanup-internal-requests-pipelineruns.yaml[83-99]
components/internal-services/internal-production/cronjob/cleanup-internal-requests-pipelineruns.yaml[38-40]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The cronjob creates temp files via `mktemp` with no directory, which defaults to `/tmp`. This CronJob runs with `readOnlyRootFilesystem: true` and only mounts an `emptyDir` at `/var/tmp`, so creating temp files in `/tmp` is likely to fail.

### Issue Context
The manifest already mounts `/var/tmp` specifically for scratch space, and the internal-production variant uses `mktemp -p /var/tmp`.

### Fix Focus Areas
- components/internal-services/internal-staging/cronjob/cleanup-internal-requests-pipelineruns.yaml[36-56]
- components/internal-services/internal-staging/cronjob/cleanup-internal-requests-pipelineruns.yaml[80-99]

### Suggested change
- Set `TMPDIR=/var/tmp` or update both tempfiles to `mktemp -p /var/tmp` (for `KUBECTL_OUTPUT` and `KUBECTL_ERR`).
- (Optional hardening) Add a `trap` to delete temp files on exit.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

2. Stderr mixed into JSON 🐞 Bug ☼ Reliability

Description

kubectl get ... -o json redirects stderr into the same file as stdout (2>&1), so
warnings/deprecation messages can corrupt the JSON. This can make jq fail and/or cause the
pagination continue-token extraction to break, leading to incomplete cleanup without a clear
failure.

Code

components/internal-services/internal-production/cronjob/cleanup-internal-requests-pipelineruns.yaml[R57-84]

Relevance

⭐⭐⭐ High
Team repeatedly changed this cronjob for kubectl reliability; likely accept separating stderr to
keep JSON parseable.
PR-#242
PR-#244
PR-#246

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
Both manifests redirect kubectl stderr into the JSON output file and then immediately parse the
same file with jq for items and the continue token, so any stderr output will break JSON
parsing/pagination.
components/internal-services/internal-production/cronjob/cleanup-internal-requests-pipelineruns.yaml[56-66]
components/internal-services/internal-production/cronjob/cleanup-internal-requests-pipelineruns.yaml[82-84]
components/internal-services/internal-staging/cronjob/cleanup-internal-requests-pipelineruns.yaml[56-66]
components/internal-services/internal-staging/cronjob/cleanup-internal-requests-pipelineruns.yaml[82-84]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The script writes both stdout and stderr from `kubectl get ... -o json` into `$BATCH_OUTPUT`, then parses `$BATCH_OUTPUT` as JSON with `jq`. Any stderr output can corrupt the JSON and break processing/pagination.

### Issue Context
The file is later used for `.items[]` processing and `.metadata.continue` extraction.

### Fix
Write stdout (JSON) to `$BATCH_OUTPUT` and stderr to a separate file/stream. On failure, surface stderr in logs:

```bash
BATCH_OUTPUT=$(mktemp -p /var/tmp)
BATCH_ERR=$(mktemp -p /var/tmp)
if ! kubectl ... -o json >"$BATCH_OUTPUT" 2>"$BATCH_ERR"; then
 echo "ERROR: failed to list ${CR_TYPE} resources"
 cat "$BATCH_ERR" >&2
 rm -f "$BATCH_OUTPUT" "$BATCH_ERR"
 exit 1
fi
rm -f "$BATCH_ERR"
```

Also consider checking `jq` exit codes explicitly and failing the job if JSON parsing fails.

### Fix Focus Areas
- components/internal-services/internal-production/cronjob/cleanup-internal-requests-pipelineruns.yaml[56-85]
- components/internal-services/internal-staging/cronjob/cleanup-internal-requests-pipelineruns.yaml[56-85]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

3. Parse failure triggers deletion 🐞 Bug ≡ Correctness

Description

If date -d fails to parse a completionTime, the code forces completion_epoch to 0, which makes
the "old enough" check pass and can delete resources despite an unknown completion time. This turns
parsing glitches/data issues into unintended deletions.

Code

components/internal-services/internal-production/cronjob/cleanup-internal-requests-pipelineruns.yaml[R68-72]

Relevance

⭐⭐⭐ High
Prior cronjob work emphasized safety; not deleting on parse failure matches “avoid accidental
deletion” direction.
PR-#242

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
Both manifests explicitly map date parsing failures to epoch 0 and then use that value for the
deletion decision, which can cause deletions when completionTime is not parseable.
components/internal-services/internal-production/cronjob/cleanup-internal-requests-pipelineruns.yaml[66-72]
components/internal-services/internal-staging/cronjob/cleanup-internal-requests-pipelineruns.yaml[66-72]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
On date parsing failure, `completion_epoch` becomes `0`, which will almost always satisfy `SECONDS_BACK > completion_epoch` and incorrectly trigger deletion.

### Issue Context
This is especially risky in a cleanup job because it can delete resources you cannot confidently classify as "older than".

### Fix
Treat parse failures as errors and skip (or fail fast). Example:

```bash
if ! completion_epoch=$(date -d "$completion_time" +%s 2>/dev/null); then
 echo "ERROR: Unparseable completionTime for ${cr_name} in ${cr_namespace}: ${completion_time}"
 ERROR_COUNT=$((ERROR_COUNT + 1))
 continue
fi
```

### Fix Focus Areas
- components/internal-services/internal-production/cronjob/cleanup-internal-requests-pipelineruns.yaml[66-79]
- components/internal-services/internal-staging/cronjob/cleanup-internal-requests-pipelineruns.yaml[66-79]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

View more (1)

4. Eval command injection risk 🐞 Bug ⛨ Security

Description

The script constructs a shell command string from CR_TYPE/CR_NAMESPACE/LABELS/CONTINUE and executes
it with eval, so metacharacters in those values can execute arbitrary commands. This is an
avoidable injection risk in a privileged automation job.

Code

components/internal-services/internal-production/cronjob/cleanup-internal-requests-pipelineruns.yaml[R50-59]

Relevance

⭐⭐ Medium
No prior repo feedback on avoiding eval; some PRs flagged “Possible security concern” but not this
pattern.
PR-#384

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
Both cronjobs build CMD using unquoted variable expansions and execute it via eval, which will
interpret shell syntax present in those variable values.
components/internal-services/internal-production/cronjob/cleanup-internal-requests-pipelineruns.yaml[49-59]
components/internal-services/internal-staging/cronjob/cleanup-internal-requests-pipelineruns.yaml[49-59]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
A kubectl command is assembled into a string and executed via `eval`, which enables shell injection through variable contents.

### Issue Context
Variables are currently interpolated without robust quoting, and `eval` will interpret any metacharacters.

### Fix
Avoid `eval` entirely. Build arguments as a bash array and invoke `kubectl` directly:

```bash
kubectl_args=(get "$CR_TYPE" -n "$CR_NAMESPACE" -l "$LABELS" --limit="$BATCH_SIZE" -o json)
if [ -n "$CONTINUE" ]; then
 kubectl_args+=(--continue="$CONTINUE")
fi

if ! kubectl "${kubectl_args[@]}" >"$BATCH_OUTPUT" 2>"$BATCH_ERR"; then
 ...
fi
```

### Fix Focus Areas
- components/internal-services/internal-production/cronjob/cleanup-internal-requests-pipelineruns.yaml[49-62]
- components/internal-services/internal-staging/cronjob/cleanup-internal-requests-pipelineruns.yaml[49-62]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

5. jq errors not detected 🐞 Bug ☼ Reliability ⭐ New

Description

The cleanup loop consumes jq output via process substitution but never checks whether jq
succeeded, so missing jq / JSON parse failures can make the job report "Cleanup complete" with
zero deletions/errors while actually doing no work. This breaks the observability the PR is trying
to add and can leave resources unpruned without a failing exit code.

Code

components/internal-services/internal-staging/cronjob/cleanup-internal-requests-pipelineruns.yaml[R57-76]

Relevance

⭐⭐ Medium
Cleanup cronjob reliability fixes common, but no history about checking jq exit status in process
substitution loops.
PR-#242
PR-#246
PR-#244

ⓘ Recommendations generated based on similar findings in past PRs

Evidence

The script feeds the loop from jq via process substitution and has no conditional/exit-path that
handles jq failure, yet it always prints a completion message afterward. This means jq can fail
without being reflected in the job result or counters.

components/internal-services/internal-staging/cronjob/cleanup-internal-requests-pipelineruns.yaml[57-79]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The script uses `done < <(jq ...)` but never verifies `jq` executed successfully. If `jq` is missing or the JSON is malformed, the `while` loop will read nothing and the script will still print the final success line.

### Issue Context
This cronjob is intended to be an automated cleanup mechanism; silently doing nothing defeats its purpose and makes operational debugging hard.

### Fix Focus Areas
- components/internal-services/internal-staging/cronjob/cleanup-internal-requests-pipelineruns.yaml[35-79]

### Suggested change
- Add an explicit dependency check early: `command -v jq >/dev/null || { echo "ERROR: jq not found"; exit 1; }`.
- Capture and validate jq success before entering the loop, e.g.:
 - `JQ_OUTPUT=$(mktemp -p /var/tmp)`
 - `if ! jq -r '...' "$KUBECTL_OUTPUT" > "$JQ_OUTPUT"; then echo "ERROR: jq failed"; ERROR_COUNT=$((ERROR_COUNT+1)); exit 1; fi`
 - `while ...; do ...; done < "$JQ_OUTPUT"`
 - cleanup temp files via `trap`.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

6. Counters updated in subshell 🐞 Bug ≡ Correctness

Description

DELETED_COUNT and ERROR_COUNT are incremented inside a while loop that is fed by a pipe from jq,
which runs the loop in a bash subshell. The job will still delete resources, but the final "Cleanup
complete" log will always report 0 deletions/errors, breaking the observability this PR is adding.

Code

components/internal-services/internal-production/cronjob/cleanup-internal-requests-pipelineruns.yaml[R65-80]

Relevance

⭐⭐⭐ High
Cronjob hardening emphasized observability; fixing subshell counter bug aligns with prior
reliability-focused edits.
PR-#242

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
Both cronjobs pipe jq output into while ...; do ...; done and increment counters inside that
loop; in bash this loop runs in a subshell, so the final echo uses unchanged parent-shell variables.
components/internal-services/internal-production/cronjob/cleanup-internal-requests-pipelineruns.yaml[33-35]
components/internal-services/internal-production/cronjob/cleanup-internal-requests-pipelineruns.yaml[64-80]
components/internal-services/internal-production/cronjob/cleanup-internal-requests-pipelineruns.yaml[94-94]
components/internal-services/internal-staging/cronjob/cleanup-internal-requests-pipelineruns.yaml[64-80]
components/internal-services/internal-staging/cronjob/cleanup-internal-requests-pipelineruns.yaml[94-94]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The counters are incremented in a `while` loop that is part of a pipeline (`jq | while ...`), so the loop runs in a subshell and counter updates are lost.

### Issue Context
This manifests as `Deleted: 0, Errors: 0` even when deletions occurred.

### Fix
Use process substitution (or read from a temp file) so the `while` loop runs in the current shell:

```bash
while IFS=: read -r cr_name cr_namespace completion_time; do
 ...
done < <(jq -r '.items[] | select(.status.completionTime != null) | "\(.metadata.name):\(.metadata.namespace):\(.status.completionTime)"' "$BATCH_OUTPUT")
```

### Fix Focus Areas
- components/internal-services/internal-production/cronjob/cleanup-internal-requests-pipelineruns.yaml[64-80]
- components/internal-services/internal-staging/cronjob/cleanup-internal-requests-pipelineruns.yaml[64-80]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

Previous review results

Review updated until commit f49ddc7

Results up to commit 1742ffb

🐞 Bugs (4) 📘 Rule violations (0) 📎 Requirement gaps (0)

1. Stderr mixed into JSON 🐞 Bug ☼ Reliability

Description

kubectl get ... -o json redirects stderr into the same file as stdout (2>&1), so
warnings/deprecation messages can corrupt the JSON. This can make jq fail and/or cause the
pagination continue-token extraction to break, leading to incomplete cleanup without a clear
failure.

Code

components/internal-services/internal-production/cronjob/cleanup-internal-requests-pipelineruns.yaml[R57-84]

Relevance

⭐⭐⭐ High
Team repeatedly changed this cronjob for kubectl reliability; likely accept separating stderr to
keep JSON parseable.
PR-#242
PR-#244
PR-#246

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
Both manifests redirect kubectl stderr into the JSON output file and then immediately parse the
same file with jq for items and the continue token, so any stderr output will break JSON
parsing/pagination.
components/internal-services/internal-production/cronjob/cleanup-internal-requests-pipelineruns.yaml[56-66]
components/internal-services/internal-production/cronjob/cleanup-internal-requests-pipelineruns.yaml[82-84]
components/internal-services/internal-staging/cronjob/cleanup-internal-requests-pipelineruns.yaml[56-66]
components/internal-services/internal-staging/cronjob/cleanup-internal-requests-pipelineruns.yaml[82-84]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The script writes both stdout and stderr from `kubectl get ... -o json` into `$BATCH_OUTPUT`, then parses `$BATCH_OUTPUT` as JSON with `jq`. Any stderr output can corrupt the JSON and break processing/pagination.

### Issue Context
The file is later used for `.items[]` processing and `.metadata.continue` extraction.

### Fix
Write stdout (JSON) to `$BATCH_OUTPUT` and stderr to a separate file/stream. On failure, surface stderr in logs:

```bash
BATCH_OUTPUT=$(mktemp -p /var/tmp)
BATCH_ERR=$(mktemp -p /var/tmp)
if ! kubectl ... -o json >"$BATCH_OUTPUT" 2>"$BATCH_ERR"; then
 echo "ERROR: failed to list ${CR_TYPE} resources"
 cat "$BATCH_ERR" >&2
 rm -f "$BATCH_OUTPUT" "$BATCH_ERR"
 exit 1
fi
rm -f "$BATCH_ERR"
```

Also consider checking `jq` exit codes explicitly and failing the job if JSON parsing fails.

### Fix Focus Areas
- components/internal-services/internal-production/cronjob/cleanup-internal-requests-pipelineruns.yaml[56-85]
- components/internal-services/internal-staging/cronjob/cleanup-internal-requests-pipelineruns.yaml[56-85]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

2. Parse failure triggers deletion 🐞 Bug ≡ Correctness

Description

If date -d fails to parse a completionTime, the code forces completion_epoch to 0, which makes
the "old enough" check pass and can delete resources despite an unknown completion time. This turns
parsing glitches/data issues into unintended deletions.

Code

components/internal-services/internal-production/cronjob/cleanup-internal-requests-pipelineruns.yaml[R68-72]

Relevance

⭐⭐⭐ High
Prior cronjob work emphasized safety; not deleting on parse failure matches “avoid accidental
deletion” direction.
PR-#242

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
Both manifests explicitly map date parsing failures to epoch 0 and then use that value for the
deletion decision, which can cause deletions when completionTime is not parseable.
components/internal-services/internal-production/cronjob/cleanup-internal-requests-pipelineruns.yaml[66-72]
components/internal-services/internal-staging/cronjob/cleanup-internal-requests-pipelineruns.yaml[66-72]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
On date parsing failure, `completion_epoch` becomes `0`, which will almost always satisfy `SECONDS_BACK > completion_epoch` and incorrectly trigger deletion.

### Issue Context
This is especially risky in a cleanup job because it can delete resources you cannot confidently classify as "older than".

### Fix
Treat parse failures as errors and skip (or fail fast). Example:

```bash
if ! completion_epoch=$(date -d "$completion_time" +%s 2>/dev/null); then
 echo "ERROR: Unparseable completionTime for ${cr_name} in ${cr_namespace}: ${completion_time}"
 ERROR_COUNT=$((ERROR_COUNT + 1))
 continue
fi
```

### Fix Focus Areas
- components/internal-services/internal-production/cronjob/cleanup-internal-requests-pipelineruns.yaml[66-79]
- components/internal-services/internal-staging/cronjob/cleanup-internal-requests-pipelineruns.yaml[66-79]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

3. Eval command injection risk 🐞 Bug ⛨ Security

Description

The script constructs a shell command string from CR_TYPE/CR_NAMESPACE/LABELS/CONTINUE and executes
it with eval, so metacharacters in those values can execute arbitrary commands. This is an
avoidable injection risk in a privileged automation job.

Code

components/internal-services/internal-production/cronjob/cleanup-internal-requests-pipelineruns.yaml[R50-59]

Relevance

⭐⭐ Medium
No prior repo feedback on avoiding eval; some PRs flagged “Possible security concern” but not this
pattern.
PR-#384

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
Both cronjobs build CMD using unquoted variable expansions and execute it via eval, which will
interpret shell syntax present in those variable values.
components/internal-services/internal-production/cronjob/cleanup-internal-requests-pipelineruns.yaml[49-59]
components/internal-services/internal-staging/cronjob/cleanup-internal-requests-pipelineruns.yaml[49-59]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
A kubectl command is assembled into a string and executed via `eval`, which enables shell injection through variable contents.

### Issue Context
Variables are currently interpolated without robust quoting, and `eval` will interpret any metacharacters.

### Fix
Avoid `eval` entirely. Build arguments as a bash array and invoke `kubectl` directly:

```bash
kubectl_args=(get "$CR_TYPE" -n "$CR_NAMESPACE" -l "$LABELS" --limit="$BATCH_SIZE" -o json)
if [ -n "$CONTINUE" ]; then
 kubectl_args+=(--continue="$CONTINUE")
fi

if ! kubectl "${kubectl_args[@]}" >"$BATCH_OUTPUT" 2>"$BATCH_ERR"; then
 ...
fi
```

### Fix Focus Areas
- components/internal-services/internal-production/cronjob/cleanup-internal-requests-pipelineruns.yaml[49-62]
- components/internal-services/internal-staging/cronjob/cleanup-internal-requests-pipelineruns.yaml[49-62]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

4. Counters updated in subshell 🐞 Bug ≡ Correctness

Description

DELETED_COUNT and ERROR_COUNT are incremented inside a while loop that is fed by a pipe from jq,
which runs the loop in a bash subshell. The job will still delete resources, but the final "Cleanup
complete" log will always report 0 deletions/errors, breaking the observability this PR is adding.

Code

components/internal-services/internal-production/cronjob/cleanup-internal-requests-pipelineruns.yaml[R65-80]

Relevance

⭐⭐⭐ High
Cronjob hardening emphasized observability; fixing subshell counter bug aligns with prior
reliability-focused edits.
PR-#242

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
Both cronjobs pipe jq output into while ...; do ...; done and increment counters inside that
loop; in bash this loop runs in a subshell, so the final echo uses unchanged parent-shell variables.
components/internal-services/internal-production/cronjob/cleanup-internal-requests-pipelineruns.yaml[33-35]
components/internal-services/internal-production/cronjob/cleanup-internal-requests-pipelineruns.yaml[64-80]
components/internal-services/internal-production/cronjob/cleanup-internal-requests-pipelineruns.yaml[94-94]
components/internal-services/internal-staging/cronjob/cleanup-internal-requests-pipelineruns.yaml[64-80]
components/internal-services/internal-staging/cronjob/cleanup-internal-requests-pipelineruns.yaml[94-94]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The counters are incremented in a `while` loop that is part of a pipeline (`jq | while ...`), so the loop runs in a subshell and counter updates are lost.

### Issue Context
This manifests as `Deleted: 0, Errors: 0` even when deletions occurred.

### Fix
Use process substitution (or read from a temp file) so the `while` loop runs in the current shell:

```bash
while IFS=: read -r cr_name cr_namespace completion_time; do
 ...
done < <(jq -r '.items[] | select(.status.completionTime != null) | "\(.metadata.name):\(.metadata.namespace):\(.status.completionTime)"' "$BATCH_OUTPUT")
```

### Fix Focus Areas
- components/internal-services/internal-production/cronjob/cleanup-internal-requests-pipelineruns.yaml[64-80]
- components/internal-services/internal-staging/cronjob/cleanup-internal-requests-pipelineruns.yaml[64-80]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

mmalina · 2026-05-29T11:31:22Z

Where is your Assisted-by line? ;)

FilipNikolovski · 2026-05-29T11:41:02Z

+                  CONTINUE=""
+                  while true; do
+                    # Build kubectl command with limit and continue token
+                    CMD="kubectl get ${CR_TYPE} -n ${CR_NAMESPACE} -l ${LABELS} --limit=${BATCH_SIZE}"


kubectl get does not have a --limit flag unfortunately, it has a --chunk-size which still returns all items, but in chunks, which I don't think is what we want here.

Perhaps we can use something like:

kubectl get --raw "/api/v1/{cr_type}?limit=${BATCH_SIZE}"

and parse the raw response, to fetch the continue token and do the whole pagination manually.

I changed it to chunk-size.

Using chunk-size would still keep all items in memory in the $output var, so we would hit the same issue here I think.

so we would hit the same issue here

I guess that depends on what was eating the resources, right? the only improvement with --chunk-size is that kubectl itself should perform better, but the rest will be exactly the same.

I am not sure how chunk size works, if it loads everything and just display in chunks, probably the issue will continue. If this does not solve, we could try adding the completionTIme as a customfiled selector to the CR` so we might be able to "pre-filter" the internalrequests: and not need the mechanics we are doing now:
Reference: https://kubernetes.io/docs/concepts/overview/working-with-objects/field-selectors/#custom-resources-fields

The field selector supports ==, != so "maybe" we might be able to filter with the desired date (I did not test, though).

It seem chunk-size will solve it:

The Mechanism Instead of making one massive, stressful API call that requests all 100,000 items at once (which could time out or crash the API server), kubectl breaks the request into 100 smaller sequential batches. kubectl asks the API server for the first 1,000 items. The server sends them back along with a continue token. kubectl prints those 1,000 items to your terminal, then immediately uses the token to ask for the next 1,000. This loops automatically until all 100,000 items are printed. The Bottom Line: --chunk-size is a tool for performance and reliability, not for truncating your terminal output.

yeah, so this confirms what I said earlier:

I guess that depends on what was eating the resources, right? the only improvement with --chunk-size is that kubectl itself should perform better, but the rest will be exactly the same.

filariow · 2026-05-29T11:44:04Z

+                  CONTINUE=""
+                  while true; do
+                    # Build kubectl command with limit and continue token
+                    CMD="kubectl get ${CR_TYPE} -n ${CR_NAMESPACE} -l ${LABELS} --limit=${BATCH_SIZE}"


--limit seems to be an hallucination

➜ kubectl version Client Version: v1.35.5 Kustomize Version: v5.7.1 Server Version: v1.35.0 ➜ kubectl get pods --limit=4 error: unknown flag: --limit See 'kubectl get --help' for usage.

Totally! Moved to chunk-size.

mmalina · 2026-05-29T11:50:18Z

+                    BATCH_OUTPUT=$(mktemp -p /var/tmp)
+                    if ! eval "$CMD" > "$BATCH_OUTPUT" 2>&1; then
+                      echo "ERROR: failed to list ${CR_TYPE} resources"
+                      rm -f "$BATCH_OUTPUT"


why bother removing the file? it's a container.

The code changed but I'd say nothing wrong removing the file here. Happy to remove the cleanup lines if you prefer.

I'd say it's just redundant - we just never do that. but if you prefer to keep it, fine with me.

mmalina · 2026-05-29T11:50:28Z

+
+                    # Check for continuation token
+                    CONTINUE=$(jq -r '.metadata.continue // empty' "$BATCH_OUTPUT")
+                    rm -f "$BATCH_OUTPUT"


why bother removing the file? it's a container.

The code changed but I'd say nothing wrong removing the file here. Happy to remove the cleanup lines if you prefer.

I'd say it's just redundant - we just never do that. but if you prefer to keep it, fine with me.

seanconroy2021 · 2026-05-29T12:14:03Z

+                    fi
+
+                    # Process each item in the batch individually to minimize memory
+                    jq -r '.items[] | select(.status.completionTime != null) | "\(.metadata.name):\(.metadata.namespace):\(.status.completionTime)"' "$BATCH_OUTPUT" | \


jq | while runs the while loop. DELETED_COUNT and ERROR_COUNT will always be 0 in the final log, regardless of actual deletions. Since it runs in a subshell.
Guess something like:

while IFS=: read -r cr_name cr_namespace completion_time; do done < <(jq -r <> "$BATCH_OUTPUT")

Should be fixed.

seanconroy2021 · 2026-05-29T12:16:55Z

-                      echo "ERROR: namespace=${namespace} ${output}"
+                      # Check if old enough to delete
+                      if [ "$SECONDS_BACK" -gt "$completion_epoch" ]; then
+                        if kubectl delete "${CR_TYPE}" "${cr_name}" -n "${cr_namespace}" --wait=false --timeout=30s 2>&1; then


The old code outputs the delete log. Now we don't show the output, we won't see the reason it failed.

Now it's displayed again.

davidmogar · 2026-05-29T13:39:59Z

Where is your Assisted-by line? ;)

The PR can be garbage and I still need to work on it and test it, but the commit contains Generated-By ;)

qodo-for-redhat-appstudio · 2026-06-01T07:25:19Z

Code review by qodo was updated up to the latest commit f49ddc7

davidmogar · 2026-06-01T07:25:58Z

Sean kindly tested this PR (thanks Sean) and got a successful run:

╰─❯ bash /tmp/script.sh
INFO: Starting cleanup for pipelineruns in default older than now
INFO: namespace=default [pipelinerun.tekton.dev](http://pipelinerun.tekton.dev/) "dummy-cleanup-test-45dj7" deleted from default namespace
INFO: namespace=default [pipelinerun.tekton.dev](http://pipelinerun.tekton.dev/) "dummy-cleanup-test-4kjr6" deleted from default namespace
INFO: namespace=default [pipelinerun.tekton.dev](http://pipelinerun.tekton.dev/) "dummy-cleanup-test-6m9r5" deleted from default namespace
INFO: namespace=default [pipelinerun.tekton.dev](http://pipelinerun.tekton.dev/) "dummy-cleanup-test-b8bhw" deleted from default namespace
INFO: namespace=default [pipelinerun.tekton.dev](http://pipelinerun.tekton.dev/) "dummy-cleanup-test-d2hch" deleted from default namespace
INFO: namespace=default [pipelinerun.tekton.dev](http://pipelinerun.tekton.dev/) "dummy-cleanup-test-djtsg" deleted from default namespace
INFO: namespace=default [pipelinerun.tekton.dev](http://pipelinerun.tekton.dev/) "dummy-cleanup-test-vjpjj" deleted from default namespace
INFO: namespace=default [pipelinerun.tekton.dev](http://pipelinerun.tekton.dev/) "dummy-cleanup-test-wgh2s" deleted from default namespace
INFO: namespace=default [pipelinerun.tekton.dev](http://pipelinerun.tekton.dev/) "dummy-cleanup-test-whb5g" deleted from default namespace
INFO: namespace=default [pipelinerun.tekton.dev](http://pipelinerun.tekton.dev/) "dummy-cleanup-test-wlnm5" deleted from default namespace
INFO: Cleanup complete. Deleted: 10, Errors: 0

qodo-for-redhat-appstudio · 2026-06-01T07:30:11Z

+                  KUBECTL_OUTPUT=$(mktemp)
+                  KUBECTL_ERR=$(mktemp)


1. Tempfiles on read-only fs 🐞 Bug ☼ Reliability

The script now uses mktemp without targeting the writable /var/tmp mount while readOnlyRootFilesystem: true, so temp file creation (and the kubectl get redirect) can fail and break the cleanup job. This is a regression vs the internal-production manifest which explicitly places temp files under /var/tmp.

Agent Prompt

### Issue description The cronjob creates temp files via `mktemp` with no directory, which defaults to `/tmp`. This CronJob runs with `readOnlyRootFilesystem: true` and only mounts an `emptyDir` at `/var/tmp`, so creating temp files in `/tmp` is likely to fail. ### Issue Context The manifest already mounts `/var/tmp` specifically for scratch space, and the internal-production variant uses `mktemp -p /var/tmp`. ### Fix Focus Areas - components/internal-services/internal-staging/cronjob/cleanup-internal-requests-pipelineruns.yaml[36-56] - components/internal-services/internal-staging/cronjob/cleanup-internal-requests-pipelineruns.yaml[80-99] ### Suggested change - Set `TMPDIR=/var/tmp` or update both tempfiles to `mktemp -p /var/tmp` (for `KUBECTL_OUTPUT` and `KUBECTL_ERR`). - (Optional hardening) Add a `trap` to delete temp files on exit.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

seanconroy2021 · 2026-06-04T10:32:47Z

lgtm :)

FilipNikolovski · 2026-06-04T10:44:20Z

lgtm

mmalina · 2026-06-10T11:05:26Z

Where is your Assisted-by line? ;)

The PR can be garbage and I still need to work on it and test it, but the commit contains Generated-By ;)

I know, but I thought we were specifically required to use Assisted-by:

openshift-ci · 2026-06-24T16:00:03Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: davidmogar, theflockers

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~components/internal-services/OWNERS~~ [davidmogar,theflockers]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Replace AWK-based processing with kubectl pagination API to handle large resource sets more efficiently. Process resources in batches of 100 using continuation tokens, and stream through jq to minimize memory footprint. Increase memory limit from 256Mi to 512Mi to accommodate batch processing. Add deletion and error counters for better observability. Generated-By: Claude Sonnet 4.5 <noreply@anthropic.com> Signed-off-by: David Moreno García <damoreno@redhat.com>

openshift-ci · 2026-06-30T10:10:21Z

New changes are detected. LGTM label has been removed.

davidmogar requested review from happybhati, johnbieren, mmalina, release-service-bot and theflockers as code owners May 29, 2026 11:24

openshift-ci Bot requested review from FilipNikolovski and seanconroy2021 May 29, 2026 11:24

openshift-ci Bot added the approved label May 29, 2026

qodo-for-redhat-appstudio Bot reviewed May 29, 2026

View reviewed changes

davidmogar force-pushed the oomkill branch from 1742ffb to 692a149 Compare May 29, 2026 11:30

mmalina reviewed May 29, 2026

View reviewed changes

Comment thread ...nents/internal-services/internal-staging/cronjob/cleanup-internal-requests-pipelineruns.yaml Outdated

FilipNikolovski reviewed May 29, 2026

View reviewed changes

mmalina reviewed May 29, 2026

View reviewed changes

Comment thread ...nents/internal-services/internal-staging/cronjob/cleanup-internal-requests-pipelineruns.yaml Outdated

filariow requested changes May 29, 2026

View reviewed changes

openshift-ci Bot assigned filariow May 29, 2026

mmalina reviewed May 29, 2026

View reviewed changes

Comment thread ...nents/internal-services/internal-staging/cronjob/cleanup-internal-requests-pipelineruns.yaml Outdated

mmalina reviewed May 29, 2026

View reviewed changes

seanconroy2021 reviewed May 29, 2026

View reviewed changes

davidmogar marked this pull request as draft May 29, 2026 13:38

openshift-ci Bot added the do-not-merge/work-in-progress label May 29, 2026

davidmogar force-pushed the oomkill branch from 692a149 to f49ddc7 Compare May 29, 2026 13:45

davidmogar marked this pull request as ready for review June 1, 2026 07:24

openshift-ci Bot removed the do-not-merge/work-in-progress label Jun 1, 2026

openshift-ci Bot requested review from FilipNikolovski and mmalina June 1, 2026 07:24

qodo-for-redhat-appstudio Bot reviewed Jun 1, 2026

View reviewed changes

openshift-ci Bot added the needs-rebase label Jun 13, 2026

theflockers approved these changes Jun 24, 2026

View reviewed changes

openshift-ci Bot assigned theflockers Jun 24, 2026

openshift-ci Bot added the lgtm label Jun 24, 2026

davidmogar force-pushed the oomkill branch from f49ddc7 to 60848ad Compare June 30, 2026 10:10

openshift-ci Bot removed the lgtm label Jun 30, 2026

openshift-ci Bot removed the needs-rebase label Jun 30, 2026

davidmogar requested review from filariow and theflockers June 30, 2026 10:15

Uh oh!

Conversation

davidmogar commented May 29, 2026

Uh oh!

qodo-for-redhat-appstudio Bot commented May 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review Summary by Qodo

(Agentic_describe updated until commit f49ddc7)

Walkthroughs

File Changes

Uh oh!

qodo-for-redhat-appstudio Bot commented May 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review by Qodo

Previous review results

Uh oh!

Uh oh!

Uh oh!

Uh oh!

mmalina commented May 29, 2026

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

davidmogar commented May 29, 2026

Uh oh!

qodo-for-redhat-appstudio Bot commented Jun 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

davidmogar commented Jun 1, 2026

Uh oh!

qodo-for-redhat-appstudio Bot Jun 1, 2026

Choose a reason for hiding this comment

Uh oh!

seanconroy2021 commented Jun 4, 2026

Uh oh!

FilipNikolovski commented Jun 4, 2026

Uh oh!

mmalina commented Jun 10, 2026

Uh oh!

qodo-for-redhat-appstudio Bot commented May 29, 2026 •

edited

Loading

(Agentic_describe updated until commit `f49ddc7`)

qodo-for-redhat-appstudio Bot commented May 29, 2026 •

edited

Loading

qodo-for-redhat-appstudio Bot commented Jun 1, 2026 •

edited

Loading