feat(integration-tests): Add archive-manager action and verification infra. by quinntaylormitchell · Pull Request #2233 · y-scope/clp

quinntaylormitchell · 2026-04-29T20:35:18Z

Description

This PR adds the action and verification infrastructure for testing the package archive-manager.

Checklist

The PR satisfies the contribution guidelines.
This is a breaking change and that has been indicated in the PR title, OR this isn't a
breaking change.
Necessary docs have been updated, OR no docs need to be updated.

Validation performed

Tested on development branch; all tests pass.

Summary by CodeRabbit

New Features
- Added integration-test utilities for the archive manager to run and verify find, delete-by-IDs and delete-by-filter workflows with optional dataset and time-range options.
- Introduced structured command-argument models and a standardized VerificationResult to support consistent, verifiable test actions.
- ExternalAction now optionally carries structured args and exposes combined command output for verification.
Tests
- Extended test helpers to capture combined command output and validate post-operation states across scenarios.

coderabbitai · 2026-04-29T20:35:27Z

Walkthrough

Adds a new integration-test utility module for archive-manager that constructs CLI find/del actions, parses find output to extract archive UUIDs, and verifies deletions. Also introduces CmdArgs, attaches optional structured args to ExternalAction, adds get_output(), and adds a VerificationResult dataclass.

Changes

Cohort / File(s)	Summary
Archive Manager Integration Tests `integration-tests/tests/package_tests/utils/archive_manager.py`	New module providing `ArchiveManagerArgs` (implements `CmdArgs.to_cmd()`), enums for subcommands, action builders (`archive_manager_find`, `archive_manager_del_by_ids`, `archive_manager_del_by_filter`), a parser `_extract_archive_ids_from_find_output()`, and verification functions (`verify_archive_manager_find_action`, `verify_archive_manager_del_by_ids_action`, `verify_archive_manager_del_by_filter_action`). `del by-ids` action fails hard on empty ID list.
Base Testing Classes `integration-tests/tests/utils/classes.py`	Introduced abstract `CmdArgs` (Pydantic model + abstract `to_cmd()`), added frozen `VerificationResult` (with truthiness and `ok()`/`fail()`), extended `ExternalAction` with optional `args: CmdArgs

Sequence Diagram

sequenceDiagram
    participant Test as Test Code
    participant Action as ExternalAction
    participant Runner as Subprocess Runner
    participant AM as archive-manager CLI
    participant Parser as Output Parser

    Test->>Action: construct ExternalAction (args: ArchiveManagerArgs)
    Action->>Runner: run CLI command (from args.to_cmd())
    Runner->>AM: invoke archive-manager (find / del ...)
    AM-->>Runner: stdout/stderr ("Found N ..." + UUID lines)
    Runner-->>Action: return CompletedProcess (stdout/stderr, code)
    Action->>Parser: _extract_archive_ids_from_find_output(output)
    Parser-->>Action: list of UUIDs
    Action->>Test: verification function uses find calls and UUID sets -> VerificationResult

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title accurately and clearly describes the main addition: new archive-manager action and verification infrastructure for integration tests.
Docstring Coverage	✅ Passed	Docstring coverage is 94.12% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 Ruff (0.15.12)

integration-tests/tests/package_tests/utils/archive_manager.py

�[1;31mruff failed�[0m
�[1mCause:�[0m Failed to load extended configuration /tools/yscope-dev-utils/exports/lint-configs/python/ruff.toml (/integration-tests/pyproject.toml extends /tools/yscope-dev-utils/exports/lint-configs/python/ruff.toml)
�[1mCause:�[0m Failed to read /tools/yscope-dev-utils/exports/lint-configs/python/ruff.toml
�[1mCause:�[0m No such file or directory (os error 2)

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Review rate limit: 7/8 reviews remaining, refill in 7 minutes and 30 seconds.}

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

integration-tests/tests/package_tests/utils/archive_manager.py (1)

366-387: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Validate extracted UUID count against reported archive count.

The parser reads Found N archives... but never verifies that exactly N UUIDs were extracted. If output format drifts, verification can silently weaken and produce false positives.

Proposed fix

     for line in filtered_lines:
         match = uuid_pattern.search(line)
         if match:
             output_archive_id_list.append(match.group(0))
 
+    if len(output_archive_id_list) != num_archive_ids:
+        pytest.fail(
+            "archive-manager 'find' output parsing failure: expected "
+            f"{num_archive_ids} archive IDs, extracted {len(output_archive_id_list)}."
+        )
+
     return sorted(output_archive_id_list)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@integration-tests/tests/package_tests/utils/archive_manager.py` around lines
366 - 387, The parser currently counts reported archives in num_archive_ids and
extracts UUIDs into output_archive_id_list but never verifies they match; update
the logic after extracting UUIDs (using uuid_pattern, filtered_lines) to compare
len(output_archive_id_list) against num_archive_ids and fail fast if they
differ—raise a clear exception (e.g., ValueError/RuntimeError) or return an
error indicating "expected N archives, found M UUIDs" so callers know parsing
failed; keep returning the sorted list only when counts match.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@integration-tests/tests/package_tests/utils/archive_manager.py`:
- Around line 340-344: The pytest.fail message in the archive-manager 'del'
verification omits the nested failure details; update the pytest.fail call that
runs when not find_result to include find_result.failure_message (and optionally
a short label like "find failure:") alongside find_action.log_file_path so the
failure message contains both the subprocess log path and the underlying
find_result.failure_message for easier triage; locate the failing block that
references find_result and find_action.log_file_path and
append/find_result.failure_message into the formatted string.
- Around line 196-246: The comparison at the end of verify logic uses direct
list equality between current_archive_id_list and directories_in_archive_dir,
which can false-fail if the same IDs are present in different orders; before the
final equality check (where VerificationResult.ok() is returned) sort both
current_archive_id_list and directories_in_archive_dir (e.g., in-place or by
creating sorted copies) so the order is normalized; locate the assembly of
current_archive_id_list (from _extract_archive_ids_from_find_output and
archive_manager_find calls) and the final comparison block and apply sorting to
both lists prior to comparison.

---

Outside diff comments:
In `@integration-tests/tests/package_tests/utils/archive_manager.py`:
- Around line 366-387: The parser currently counts reported archives in
num_archive_ids and extracts UUIDs into output_archive_id_list but never
verifies they match; update the logic after extracting UUIDs (using
uuid_pattern, filtered_lines) to compare len(output_archive_id_list) against
num_archive_ids and fail fast if they differ—raise a clear exception (e.g.,
ValueError/RuntimeError) or return an error indicating "expected N archives,
found M UUIDs" so callers know parsing failed; keep returning the sorted list
only when counts match.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 9de8ed4d-dca4-419c-a4e8-0c79299cdeda

📥 Commits

Reviewing files that changed from the base of the PR and between 298e2e2 and 93c1c9b.

📒 Files selected for processing (2)

integration-tests/tests/package_tests/utils/archive_manager.py
integration-tests/tests/utils/classes.py

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@integration-tests/tests/package_tests/utils/archive_manager.py`:
- Around line 400-407: The UUID extraction is too permissive: replace
uuid_pattern.search(line) with a strict match against the entire stripped line
(use fullmatch on the same uuid_pattern and line.strip()) so only lines that are
exactly a UUID are accepted into output_archive_id_list; after collecting IDs,
parse the reported "Found N archives" output line (or the variable that contains
the reported count) and compare that integer to len(output_archive_id_list), and
raise/return a fast failure if they differ so tests fail when the output format
drifts. Ensure you update the code around uuid_pattern, filtered_lines, and
output_archive_id_list to use fullmatch and the count validation.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 4eff6982-f306-4274-b916-72429c126fcb

📥 Commits

Reviewing files that changed from the base of the PR and between 93c1c9b and aee0b86.

📒 Files selected for processing (1)

integration-tests/tests/package_tests/utils/archive_manager.py

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

integration-tests/tests/package_tests/utils/archive_manager.py (1)

414-420: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Tighten UUID parsing to exact-line matches.

search() will accept UUID substrings embedded in warnings or other log lines, so the parser can pick up the wrong archive IDs and make verification succeed or fail on the wrong data. Match the stripped line end-to-end and validate the parsed count before returning.

Suggested fix

     uuid_pattern = re.compile(
         r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
     )
     for line in filtered_lines:
-        match = uuid_pattern.search(line)
-        if match:
-            output_archive_id_list.append(match.group(0))
+        candidate = line.strip()
+        if uuid_pattern.fullmatch(candidate):
+            output_archive_id_list.append(candidate)
+
+    if len(output_archive_id_list) != num_archive_ids:
+        pytest.fail(
+            "Failed to parse archive-manager 'find' output: expected "
+            f"{num_archive_ids} archive IDs, got {len(output_archive_id_list)}."
+        )

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@integration-tests/tests/package_tests/utils/archive_manager.py` around lines
414 - 420, The UUID parsing currently uses uuid_pattern.search which can match
embedded UUIDs; change to match the whole stripped line by either adding anchors
to uuid_pattern (r"^[0-9a-fA-F]{8}-...{12}$") or use uuid_pattern.fullmatch on
line.strip() when iterating filtered_lines and appending to
output_archive_id_list; after collecting matches, validate the parsed count
before returning (e.g., assert or raise if len(output_archive_id_list) is
unexpected or zero) so we only accept exact-line UUIDs and fail fast on
mismatches.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@integration-tests/tests/package_tests/utils/archive_manager.py`:
- Around line 120-139: If ids is empty (either caller passed [] or find returned
no matches), fail fast instead of building a del-by-ids command; after computing
ids (after _extract_archive_ids_from_find_output and before constructing
ArchiveManagerArgs), add a guard that checks for empty list and calls
pytest.fail with a clear message indicating no archive IDs were provided/found
for deletion. Reference the variables/functions ids, archive_manager_find,
verify_archive_manager_find_action, _extract_archive_ids_from_find_output, and
the
ArchiveManagerArgs/ClpPackageArchiveManagerSubcommand/ClpPackageArchiveManagerDelSubcommand
block so the check is placed immediately before creating ArchiveManagerArgs.

---

Duplicate comments:
In `@integration-tests/tests/package_tests/utils/archive_manager.py`:
- Around line 414-420: The UUID parsing currently uses uuid_pattern.search which
can match embedded UUIDs; change to match the whole stripped line by either
adding anchors to uuid_pattern (r"^[0-9a-fA-F]{8}-...{12}$") or use
uuid_pattern.fullmatch on line.strip() when iterating filtered_lines and
appending to output_archive_id_list; after collecting matches, validate the
parsed count before returning (e.g., assert or raise if
len(output_archive_id_list) is unexpected or zero) so we only accept exact-line
UUIDs and fail fast on mismatches.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: a0da7ff7-519a-47f7-b1cb-87c6c75a9f1b

📥 Commits

Reviewing files that changed from the base of the PR and between aee0b86 and da52ec7.

📒 Files selected for processing (1)

integration-tests/tests/package_tests/utils/archive_manager.py

coderabbitai

♻️ Duplicate comments (1)

integration-tests/tests/package_tests/utils/archive_manager.py (1)

398-416: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Harden find-output parsing to prevent false verification results.

Line 412 uses permissive UUID extraction (search) and there’s no check that parsed UUID count matches the reported “Found N archives…”. This can mis-parse warning lines and let verification drift go undetected.

Proposed fix

 def _extract_archive_ids_from_find_output(
     action: ExternalAction,
 ) -> list[str]:
     output_archive_id_list: list[str] = []
     output_lines = action.get_output().splitlines()

-    num_archive_ids = 0
+    num_archive_ids = 0
     filtered_lines = []
     for line in output_lines:
         match = re.search(r"Found (\d+) archives within the specified time range", line)
         if match:
             num_archive_ids = int(match.group(1))
         else:
             filtered_lines.append(line)

     if num_archive_ids == 0:
         return output_archive_id_list

     uuid_pattern = re.compile(
         r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
     )
     for line in filtered_lines:
-        match = uuid_pattern.search(line)
-        if match:
-            output_archive_id_list.append(match.group(0))
+        candidate = line.strip()
+        if uuid_pattern.fullmatch(candidate):
+            output_archive_id_list.append(candidate)
+
+    if len(output_archive_id_list) != num_archive_ids:
+        pytest.fail(
+            "Failed to parse archive-manager 'find' output: expected "
+            f"{num_archive_ids} archive IDs, got {len(output_archive_id_list)}. "
+            f"Subprocess log: '{action.log_file_path}'"
+        )

     return sorted(output_archive_id_list)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@integration-tests/tests/package_tests/utils/archive_manager.py` around lines
398 - 416, The parsing loop can falsely extract UUIDs from warnings because
uuid_pattern.search is permissive and there's no verification against the
reported count (num_archive_ids); update the logic in the function that builds
output_archive_id_list to collect all UUIDs from filtered_lines using
uuid_pattern.findall (or by iterating and using full matches) into a temporary
list, then assert or handle if len(temp_list) != num_archive_ids (e.g.,
raise/return an error or empty list) before assigning to output_archive_id_list
and returning sorted(output_archive_id_list); reference the variables and
symbols num_archive_ids, uuid_pattern, filtered_lines, output_archive_id_list
and the final return to locate where to add this validation.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@integration-tests/tests/package_tests/utils/archive_manager.py`:
- Around line 398-416: The parsing loop can falsely extract UUIDs from warnings
because uuid_pattern.search is permissive and there's no verification against
the reported count (num_archive_ids); update the logic in the function that
builds output_archive_id_list to collect all UUIDs from filtered_lines using
uuid_pattern.findall (or by iterating and using full matches) into a temporary
list, then assert or handle if len(temp_list) != num_archive_ids (e.g.,
raise/return an error or empty list) before assigning to output_archive_id_list
and returning sorted(output_archive_id_list); reference the variables and
symbols num_archive_ids, uuid_pattern, filtered_lines, output_archive_id_list
and the final return to locate where to add this validation.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 2115dac8-9dd1-4fdc-9277-91a16c3f1f4d

📥 Commits

Reviewing files that changed from the base of the PR and between da52ec7 and 40c08b7.

📒 Files selected for processing (1)

integration-tests/tests/package_tests/utils/archive_manager.py

quinntaylormitchell added 2 commits April 29, 2026 20:26

Add archive-manager infra.

fd66aee

Lint.

ac86ef0

Docstrings.

93c1c9b

quinntaylormitchell changed the title ~~feat(integration-tests): Add archive-manager infra.~~ feat(integration-tests): Add archive-manager test utilities. Apr 29, 2026

quinntaylormitchell changed the title ~~feat(integration-tests): Add archive-manager test utilities.~~ feat(integration-tests): Add archive-manager action and verification infra. Apr 29, 2026

quinntaylormitchell marked this pull request as ready for review April 29, 2026 23:41

quinntaylormitchell requested a review from a team as a code owner April 29, 2026 23:41

quinntaylormitchell requested a review from Bill-hbrhbr April 29, 2026 23:41

coderabbitai Bot reviewed Apr 29, 2026

View reviewed changes

Comment thread integration-tests/tests/package_tests/utils/archive_manager.py Outdated

Comment thread integration-tests/tests/package_tests/utils/archive_manager.py

Rabbit.

aee0b86

coderabbitai Bot reviewed Apr 30, 2026

View reviewed changes

Comment thread integration-tests/tests/package_tests/utils/archive_manager.py

Rabbit.

da52ec7

coderabbitai Bot reviewed Apr 30, 2026

View reviewed changes

Comment thread integration-tests/tests/package_tests/utils/archive_manager.py Outdated

Reject empty delete-by-IDs.

40c08b7

coderabbitai Bot reviewed Apr 30, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(integration-tests): Add archive-manager action and verification infra.#2233

feat(integration-tests): Add archive-manager action and verification infra.#2233
quinntaylormitchell wants to merge 6 commits intoy-scope:mainfrom
quinntaylormitchell:testing-redesign-7

quinntaylormitchell commented Apr 29, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Apr 29, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

quinntaylormitchell commented Apr 29, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Checklist

Validation performed

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Apr 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

quinntaylormitchell commented Apr 29, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 29, 2026 •

edited

Loading