Add iterate module for MR/PR lifecycle helpers#49
Conversation
📝 WalkthroughWalkthroughThis PR introduces Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@coderabbitai review |
✅ Action performedReview finished.
|
Provides reusable utilities for iterating on merge/pull requests: extracting MR URLs from comments, counting iterations, checking readiness, resolving review threads, checking MR state, and collecting feedback. These building blocks support any skill that creates an MR/PR and refines it through review cycles. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Emilien Macchi <emacchi@redhat.com>
519eb2c to
e942d13
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
tests/test_iterate.py (1)
137-147: ⚡ Quick winAdd a newest-first ordering test for readiness detection
This suite only verifies oldest→newest ordering. Add one case with reversed comment order to lock behavior and prevent regressions around Line 101–Line 107 in
src/agentic_ci/iterate.py.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_iterate.py` around lines 137 - 147, Add a new test that verifies readiness detection when comments are provided newest-first: create a test (e.g., test_last_comment_wins_newest_first) that builds the same two comment dicts used in test_last_comment_wins but in reverse order (newest first) and assert already_notified_ready(comments) is False; this ensures the logic in already_notified_ready still treats the last bot comment as authoritative regardless of input ordering.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/agentic_ci/iterate.py`:
- Around line 101-107: The current detection of the "last bot comment" is
order-dependent; fix the logic in iterate.py by explicitly finding the most
recent bot comment instead of assuming comments are oldest→newest: locate the
last bot comment by either iterating reversed(comments) to pick the first match
or by selecting the comment with the newest timestamp (e.g., max(comments,
key=lambda c: c.get("created_at"))) that matches bot_pattern, then set
last_bot_body from that comment and evaluate the existing readiness check using
last_bot_body.
- Around line 48-53: The regexes currently use \S+ which can capture trailing
punctuation; update the two re.search patterns in iterate.py (the calls
assigning to variable m) to use a stricter repo path pattern and a
boundary/lookahead so links don't include trailing punctuation — e.g. replace
\S+ with a pattern that matches repo paths like
[A-Za-z0-9_.-]+(?:/[A-Za-z0-9_.-]+)+ and append a lookahead such as
(?=$|\s|[)\]\>.,'"]) or a negative lookahead for common trailing punctuation;
ensure you update both the GitLab pattern (https?://gitlab\.com/...) and the
GitHub pattern (https?://github\.com/...) so returned m.group(0) never contains
trailing markdown punctuation.
---
Nitpick comments:
In `@tests/test_iterate.py`:
- Around line 137-147: Add a new test that verifies readiness detection when
comments are provided newest-first: create a test (e.g.,
test_last_comment_wins_newest_first) that builds the same two comment dicts used
in test_last_comment_wins but in reverse order (newest first) and assert
already_notified_ready(comments) is False; this ensures the logic in
already_notified_ready still treats the last bot comment as authoritative
regardless of input ordering.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 6b82da33-a768-40e6-bb80-e6b4bf06dcb4
📒 Files selected for processing (2)
src/agentic_ci/iterate.pytests/test_iterate.py
| m = re.search(r"https?://gitlab\.com/\S+/-/merge_requests/\d+", body) | ||
| if m: | ||
| return m.group(0) | ||
| m = re.search(r"https?://github\.com/\S+/pull/\d+", body) | ||
| if m: | ||
| return m.group(0) |
There was a problem hiding this comment.
Overly permissive URL regex can return malformed MR/PR links
At Line 48 and Line 51, \S+ may include trailing punctuation from markdown text (for example ) or .), which can break downstream forge detection/parsing.
Suggested fix
- m = re.search(r"https?://gitlab\.com/\S+/-/merge_requests/\d+", body)
+ m = re.search(
+ r"https?://gitlab\.com/[A-Za-z0-9._\-/]+/-/merge_requests/\d+\b",
+ body,
+ )
if m:
- return m.group(0)
- m = re.search(r"https?://github\.com/\S+/pull/\d+", body)
+ return m.group(0).rstrip(").,;:!?]>\"'")
+ m = re.search(
+ r"https?://github\.com/[A-Za-z0-9._-]+/[A-Za-z0-9._-]+/pull/\d+\b",
+ body,
+ )
if m:
- return m.group(0)
+ return m.group(0).rstrip(").,;:!?]>\"'")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| m = re.search(r"https?://gitlab\.com/\S+/-/merge_requests/\d+", body) | |
| if m: | |
| return m.group(0) | |
| m = re.search(r"https?://github\.com/\S+/pull/\d+", body) | |
| if m: | |
| return m.group(0) | |
| m = re.search( | |
| r"https?://gitlab\.com/[A-Za-z0-9._\-/]+/-/merge_requests/\d+\b", | |
| body, | |
| ) | |
| if m: | |
| return m.group(0).rstrip(").,;:!?]>\"'") | |
| m = re.search( | |
| r"https?://github\.com/[A-Za-z0-9._-]+/[A-Za-z0-9._-]+/pull/\d+\b", | |
| body, | |
| ) | |
| if m: | |
| return m.group(0).rstrip(").,;:!?]>\"'") |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/agentic_ci/iterate.py` around lines 48 - 53, The regexes currently use
\S+ which can capture trailing punctuation; update the two re.search patterns in
iterate.py (the calls assigning to variable m) to use a stricter repo path
pattern and a boundary/lookahead so links don't include trailing punctuation —
e.g. replace \S+ with a pattern that matches repo paths like
[A-Za-z0-9_.-]+(?:/[A-Za-z0-9_.-]+)+ and append a lookahead such as
(?=$|\s|[)\]\>.,'"]) or a negative lookahead for common trailing punctuation;
ensure you update both the GitLab pattern (https?://gitlab\.com/...) and the
GitHub pattern (https?://github\.com/...) so returned m.group(0) never contains
trailing markdown punctuation.
| for c in comments: | ||
| if bot_pattern.search(c.get("body", "")): | ||
| last_bot_body = c.get("body", "") | ||
| return ( | ||
| "ready for a maintainer to review and merge" in last_bot_body | ||
| or "ready for a maintainer to merge" in last_bot_body | ||
| ) |
There was a problem hiding this comment.
“Last bot comment” logic is order-dependent and can return wrong readiness state
At Line 101–Line 107, the function assumes comments are oldest→newest. If the caller provides newest-first (common API behavior), readiness detection flips.
Suggested fix
- last_bot_body = ""
- for c in comments:
- if bot_pattern.search(c.get("body", "")):
- last_bot_body = c.get("body", "")
+ bot_comments = [c for c in comments if bot_pattern.search(c.get("body", ""))]
+ if not bot_comments:
+ return False
+
+ # Prefer explicit timestamps when present; fallback to input order.
+ def _created_at(comment: dict) -> str:
+ return str(comment.get("created_at", ""))
+
+ last_bot_body = max(bot_comments, key=_created_at).get("body", "") if any(
+ "created_at" in c for c in bot_comments
+ ) else bot_comments[-1].get("body", "")🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/agentic_ci/iterate.py` around lines 101 - 107, The current detection of
the "last bot comment" is order-dependent; fix the logic in iterate.py by
explicitly finding the most recent bot comment instead of assuming comments are
oldest→newest: locate the last bot comment by either iterating
reversed(comments) to pick the first match or by selecting the comment with the
newest timestamp (e.g., max(comments, key=lambda c: c.get("created_at"))) that
matches bot_pattern, then set last_bot_body from that comment and evaluate the
existing readiness check using last_bot_body.
There was a problem hiding this comment.
Valid, we should sort the comments.
There was a problem hiding this comment.
@mprpic, here's a clean minimal patch for already_notified_ready that sorts by created_at when timestamps are present, and falls back safely to the last element in input order:
def already_notified_ready(
comments: list[dict],
bot_pattern: re.Pattern[str] | None = None,
) -> bool:
if bot_pattern is None:
bot_pattern = re.compile(
rf"^(h3\.\s+|#{{1,6}}\s+)?{re.escape(DEFAULT_BOT_IDENTIFIER)}",
re.MULTILINE,
)
bot_comments = [c for c in comments if bot_pattern.search(c.get("body", ""))]
if not bot_comments:
return False
if any("created_at" in c for c in bot_comments):
last = max(bot_comments, key=lambda c: c.get("created_at", ""))
else:
last = bot_comments[-1]
last_bot_body = last.get("body", "")
return (
"ready for a maintainer to review and merge" in last_bot_body
or "ready for a maintainer to merge" in last_bot_body
)Key changes:
- Collects all bot comments first; returns
Falseearly if none found (avoids a silent miss). - Uses
max(..., key=created_at)when timestamps exist — handles both oldest→newest and newest→first API responses correctly. - Falls back to
bot_comments[-1](last in input order) when nocreated_atkeys are present, preserving existing behaviour for callers that pre-sort.
🧠 Learnings used
Learnt from: mprpic
Repo: opendatahub-io/agentic-ci PR: 21
File: src/agentic_ci/log.py:24-26
Timestamp: 2026-05-23T02:11:19.002Z
Learning: When reviewing this repo’s code, don’t conflate `agentic_ci.log.info` with Python’s stdlib `logging.Logger.info`. `src/agentic_ci/log.py` defines an `info(msg: str)` helper that only accepts a single string, so calls to `agentic_ci.log.info` must pass a fully-formed message (no `%`/format placeholders with extra arguments). If you see `log.info` usage like `log.info("[%s] ...", arg1, ...)`, that pattern matches stdlib `logging.Logger`’s `%`-style formatting (i.e., multiple arguments), not the `agentic_ci.log.info` helper—review suggestions should respect which one is being called.
| return None | ||
|
|
||
|
|
||
| def count_iterations( |
There was a problem hiding this comment.
Is this Jira comments or MR/PR comments? I think Jira, in which case it should probably be a part of the jira module.
| Args: | ||
| comments: List of comment dicts with ``"body"`` keys. | ||
| bot_pattern: Compiled regex to identify bot comments. | ||
| Defaults to matching ``jira-autofix bot`` headers. |
There was a problem hiding this comment.
It defaults to DEFAULT_BOT_IDENTIFIER.
| Args: | ||
| comments: List of comment dicts with ``"body"`` keys. | ||
| bot_pattern: Compiled regex to identify bot comments. | ||
| Defaults to matching ``jira-autofix bot`` headers. |
| ) | ||
|
|
||
|
|
||
| def resolve_threads( |
There was a problem hiding this comment.
Should this be a method of the Forge class then? Same goes for the two methods below that require forge access.
| DEFAULT_BOT_IDENTIFIER = "agentic-ci bot" | ||
|
|
||
|
|
||
| def extract_mr_url( |
There was a problem hiding this comment.
I think this should just be a method in the Forge class; perhaps different per forge type since the logic here is slighly changed for github vs gitlab.
| 1 | ||
| for c in comments | ||
| if bot_identifier in c.get("body", "") and pattern.search(c.get("body", "")) | ||
| ) |
There was a problem hiding this comment.
If we have comments that have multiple iterations in a single comment, this needs to be updated to findall() since search() will return the first pattern only.
Summary
agentic_ci.iteratemodule with reusable MR/PR iteration lifecycle helpersextract_mr_url,count_iterations,already_notified_ready,resolve_threads,check_mr_state,collect_feedbackTest plan
pytest tests/test_iterate.py— 29 tests passruff checkpassesruff format --checkpasses🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Tests