
Conversation

@Honny1 (Collaborator) commented Oct 29, 2025

Summary by CodeRabbit

  • New Features

    • Pull and cache configured container images as tarballs and optionally clean them up.
    • New shared benchmark tasks for image pull/load, artifacts directory creation, benchmark run, and holistic container resource cleanup.
  • Improvements

    • Reports and plots now emphasize 95th‑percentile execution time and show jitter; plotting visuals and grouping improved.
    • Remote helpers made directory/file management more reliable across platforms.
  • Refactor

    • Consolidated cross‑platform command/remote execution flow and moved per-benchmark steps into shared task templates.

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 29, 2025

openshift-ci bot commented Oct 29, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign kpouget for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


openshift-ci bot commented Oct 29, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all


coderabbitai bot commented Oct 29, 2025

Walkthrough

Adds configurable container-image archival (pull and save as tar with optional cleanup), centralizes Ansible benchmark flows into shared cross-platform tasks, refactors remote command builders and remote file APIs, and migrates visualization metrics to 95th‑percentile execution time with jitter and percentile-aware plotting.

Changes

Cohort / File(s) | Change Summary
Container Image Configuration
projects/container_bench/testing/config.yaml, projects/container_bench/testing/config_manager.py
New config keys and an accessor for container image handling (pull_images, images, dir), plus a container_images cleanup flag.
Container Engine
projects/container_bench/testing/container_engine.py
Added store_container_images_as_tar() to pull/save images as tarballs; renamed rm_image → rm_images (accepts a list); uses Path for tar paths.
Prepare & Utils Integration
projects/container_bench/testing/prepare.py, projects/container_bench/testing/utils.py
Invoke image archival in Docker/Podman prepare flows; added cleanup_container_images() and replaced inline remote file removals with remote_access.remove_remote_file.
Remote Access & Builders
projects/container_bench/testing/remote_access.py, projects/container_bench/testing/platform_builders.py
Consolidated platform command building via PlatformFactory; added create_remote_directory() and remove_remote_file(); extended PlatformCommandBuilder with chdir/rm/mkdir/exists/entrypoint/check implementations.
Ansible Shared Tasks Added
projects/container_bench/toolbox/shared_tasks/*.yml
New shared tasks: create_artifacts_dir.yml, map_variables.yml, pull_image.yml, cleanup_container_resources.yml, run_benchmark.yml, unix_default_benchmark_body.yml, windows_default_benchmark_body.yml.
Ansible Role Refactoring
projects/container_bench/toolbox/container_bench_*/tasks/*
Replaced inline per-role bootstraps with includes of shared tasks; mains now map variables and include unified benchmark bodies.
Benchmark Main Tasks
projects/container_bench/toolbox/container_bench_exec_benchmark/tasks/main.yml, .../helloworld.../tasks/main.yml, .../image_build_large_build_context.../tasks/main.yaml
Reworked mains to assert required vars (no-op guard), include map_variables.yml, set image_to_pull, and include shared benchmark body templates; moved per-OS prepare/cleanup setup into mains.
Role Task Removals
multiple projects/container_bench/toolbox/container_bench_*/*/tasks/{unix.yml,windows.yml}
Removed previous inline task bodies (artifact creation, prepare, benchmark, cleanup); replaced by shared includes.
Shared Cleanup Task
projects/container_bench/toolbox/shared_tasks/cleanup_container_resources.yml
New cross-platform container cleanup sequence (stop/rm images/volumes/prune) using binary_path, additional_args, and rootfull.
Visualization: Metrics & Plotting
projects/container_bench/visualizations/benchmark/plotting/metrics.py, .../comparison_report.py, .../report.py, .../utils/shared.py, .../utils/config.py
Switched from exec_time to execution_time_95th_percentile + jitter; added plotting color constants; percentile-aware traces; updated host/engine card signatures and Linux-specific field filtering.
Visualization: Parser Changes
projects/container_bench/visualizations/benchmark/store/parsers.py
Replaced scalar averages with percentile/jitter computations and per-series percentile summaries for CPU/memory/network/disk.
Visualization: Removed Module
projects/container_bench/visualizations/benchmark/plotting/utils/metrics.py
Deleted legacy metric-aggregation module and its functions/constants.
Helpers
projects/matrix_benchmarking/visualizations/helpers/plotting/units.py
Added guards returning "N/A" for None inputs in human_readable_size() and format_duration().
Small Task Edits
projects/container_bench/toolbox/container_bench_capture_container_engine_info/tasks/*, .../capture_system_state/tasks/*
Minor task renames and replaced inline artifact dir creation with include_tasks to shared create_artifacts_dir.yml.
Test Filter
projects/container_bench/testing/test_container_bench.py
Added runtime filtering to skip benchmarks for platforms listed in test_config['platforms_to_skip'].
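The parser cohort above replaces scalar averages with percentile and jitter summaries. A minimal sketch of what such calculations look like (hypothetical helpers for illustration, not the project's actual `_calculate_percentile`/`_calculate_jitter` implementations):

```python
import statistics

def percentile(samples, pct):
    """Nearest-rank percentile; None for empty input."""
    if not samples:
        return None
    ordered = sorted(samples)
    rank = max(0, min(len(ordered) - 1, round(pct / 100 * len(ordered)) - 1))
    return ordered[rank]

def jitter(samples):
    """Run-to-run spread, here taken as the sample standard deviation."""
    if len(samples) < 2:
        return 0.0
    return statistics.stdev(samples)

times = [1.02, 0.98, 1.10, 1.05, 0.99]
p95 = percentile(times, 95)      # the slow end of the distribution
spread = jitter(times)
```

Reporting the 95th percentile plus jitter captures both worst-typical latency and stability, which a single mean hides.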

Sequence Diagram(s)

sequenceDiagram
    participant Prepare as Prepare Flow
    participant ContainerEngine as Container Engine
    participant Remote as Remote Host
    participant Archive as Images Dir

    Prepare->>ContainerEngine: store_container_images_as_tar()
    ContainerEngine->>ContainerEngine: read container_images config
    ContainerEngine->>Remote: create/ensure images dir
    loop each image
        ContainerEngine->>Remote: pull or load image
        ContainerEngine->>Remote: save image as tar -> Archive
    end
    ContainerEngine->>ContainerEngine: rm_images(list_of_pulled_images)
    Prepare->>Prepare: continue prepare/cleanup flow
sequenceDiagram
    participant Main as role main.yml
    participant Map as map_variables.yml
    participant Body as unix/windows_default_benchmark_body.yml
    participant Shared as shared_tasks (create/pull/run/cleanup)

    Main->>Map: include map_variables.yml
    Map-->>Main: set binary_path, additional_args, exec_time_path, rootfull
    Main->>Body: include appropriate default benchmark body
    Body->>Shared: include create_artifacts_dir.yml
    Body->>Shared: include cleanup_container_resources.yml (pre)
    alt image_to_pull defined
        Body->>Shared: include pull_image.yml
    end
    Body->>Shared: execute prepare_commands (if any)
    Body->>Shared: include run_benchmark.yml
    Body->>Shared: execute cleanup_commands (if any)
    Body->>Shared: include cleanup_container_resources.yml (post)
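The archival loop in the first diagram can be sketched in plain Python; `pull`, `save`, and `sanitize` here are stand-ins for the real remote helpers, not the actual container_engine API:

```python
from pathlib import Path

def sanitize(image):
    """Turn an image reference into a filesystem-safe tar filename."""
    return "".join(c if c.isalnum() else "_" for c in image) + ".tar"

def store_images_as_tar(images, images_dir, pull, save):
    """Pull each image, save it as a tarball, and return the tar paths."""
    images_dir = Path(images_dir)
    tar_paths = []
    for image in images:
        pull(image)                              # pull (or load) the image
        tar_path = images_dir / sanitize(image)  # save image as tar -> archive dir
        save(image, tar_path)
        tar_paths.append(tar_path)
    return tar_paths

pulled, saved = [], []
tars = store_images_as_tar(
    ["quay.io/example/app:v1"], "/tmp/images",
    pull=pulled.append,
    save=lambda image, path: saved.append((image, path.name)),
)
```

Replacing every non-alphanumeric character with `_` sidesteps `/` and `:` in image references, which would otherwise break the tar filename.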

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

  • Areas needing extra attention:
    • container_engine.store_container_images_as_tar(): filename sanitization, remote quoting/escaping, idempotence and error handling.
    • shared_tasks/pull_image.yml: cross-OS tar existence detection and correct load vs pull branching.
    • Deletion of plotting/utils/metrics.py: ensure all callers updated and parser outputs match plotting inputs.
    • store/parsers.py percentile/jitter calculations: correctness for small samples and edge cases.
    • remote_access/platform_builders: entrypoint script generation, escaping/quoting semantics, and exists/check handling.
    • Ansible include paths and map_variables variable mapping across roles.

Possibly related PRs

Suggested labels

ok-to-test

Poem

🐇 I pulled the images, wrapped each in tar,
hopped through shared tasks both near and far,
percentiles dance with jittery cheer,
builders and remotes hop in good order here,
carrots for tests — the archive is clear! 🥕

Pre-merge checks

✅ Passed checks (2 passed)
  • Description Check: ✅ Passed. Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The PR title '[container_benchmark] Pre pull container images and rework reports' directly aligns with the primary changes in the changeset. The raw summary shows extensive container image pre-pulling functionality added across multiple files (config.yaml, config_manager.py, container_engine.py, prepare.py, utils.py) and significant report reworking in visualization files (comparison_report.py, metrics.py, shared.py, parsers.py). The title is concise, clear, and accurately captures the main intent of the pull request.

@Honny1 force-pushed the cont-bench-pull-images-time-repr branch from efa2dda to 55257a3 on October 29, 2025 at 20:51
@Honny1 (Collaborator, Author) commented Oct 29, 2025

Ansible Lint failure fix: ansible/ansible-lint#4829

@Honny1 Honny1 marked this pull request as ready for review October 29, 2025 21:08
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 29, 2025
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 11

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
projects/container_bench/testing/container_engine.py (1)

130-139: Quote image names in rm command.

The images are joined with spaces without proper quoting. If image names contain spaces or special characters, the command will fail or behave unexpectedly.

Apply this diff to properly quote image names:

     def rm_images(self, images):
+        import shlex
         ret = remote_access.run_with_ansible_ssh_conf(
             self.base_work_dir,
-            f"{self.get_command()} image rm {' '.join(images)}",
+            f"{self.get_command()} image rm {' '.join(shlex.quote(img) for img in images)}",
             check=False,
             capture_stdout=True,
             capture_stderr=True,
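For reference, shlex.quote leaves already-safe image references untouched and wraps anything shell-significant in single quotes, so an odd or malicious name stays a single argument:

```python
import shlex

images = ["quay.io/app:v1", "busy box", "evil;rm -rf ~"]
args = [shlex.quote(img) for img in images]

# safe names pass through unchanged; unsafe ones are single-quoted
command = "podman image rm " + " ".join(args)
```

This keeps the remote shell from interpreting spaces or `;` inside a name as command structure.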
projects/container_bench/visualizations/benchmark/plotting/utils/config.py (1)

173-178: Guard against missing percentile/jitter attributes

Accessing metrics.execution_time_95th_percentile and metrics.execution_time_jitter directly will raise AttributeError for every legacy results set produced before this PR (they only expose exec_time). That breaks report generation on existing archives. Please fall back to the old fields (and skip injecting the key when nothing is available) to keep backward compatibility.

Apply this diff:

-        data.update({
-            "execution_time_95th_percentile": metrics.execution_time_95th_percentile,
-            "jitter": metrics.execution_time_jitter,
-            "command": metrics.command,
-            "timestamp": metrics.timestamp,
-            "runs": entry.settings.__dict__.get("benchmark_runs", 1)
-        })
+        execution_time_95th = getattr(
+            metrics,
+            "execution_time_95th_percentile",
+            getattr(metrics, "exec_time", None),
+        )
+        execution_time_jitter = getattr(metrics, "execution_time_jitter", None)
+
+        payload = {
+            "command": metrics.command,
+            "timestamp": metrics.timestamp,
+            "runs": entry.settings.__dict__.get("benchmark_runs", 1),
+        }
+        if execution_time_95th is not None:
+            payload["execution_time_95th_percentile"] = execution_time_95th
+        if execution_time_jitter is not None:
+            payload["jitter"] = execution_time_jitter
+
+        data.update(payload)
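The nested getattr fallback in the suggestion behaves like this on old versus new result objects (SimpleNamespace stand-ins for the parsed metrics, not the real store types):

```python
from types import SimpleNamespace

legacy = SimpleNamespace(exec_time=1.23)                       # pre-PR archive
modern = SimpleNamespace(execution_time_95th_percentile=1.31)  # new parser output

def pick_exec_time(metrics):
    # prefer the new field, fall back to the legacy one, else None
    return getattr(metrics, "execution_time_95th_percentile",
                   getattr(metrics, "exec_time", None))
```

Either generation of archive then yields a usable number, and a truly empty record yields None instead of raising AttributeError.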
🧹 Nitpick comments (9)
projects/matrix_benchmarking/visualizations/helpers/plotting/units.py (2)

13-14: Good defensive guard, but update the docstring.

The None check prevents a TypeError crash and returns an appropriate "N/A" string for visualization. However, the docstring at line 8 documents the parameter as value (int) but doesn't mention that None is now a valid input.

Apply this diff to update the docstring:

     Args:
-        value (int): The byte value to convert.
+        value (int or None): The byte value to convert, or None for missing data.

37-38: Good defensive guard, but update the docstring.

The None check prevents a TypeError crash and returns an appropriate "N/A" string for visualization. However, the docstring at line 32 documents the parameter as seconds (float) but doesn't mention that None is now a valid input.

Apply this diff to update the docstring:

     Args:
-        seconds (float): The duration in seconds.
+        seconds (float or None): The duration in seconds, or None for missing data.
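A guard of the kind described looks like the sketch below; this is a hedged illustration, and the real matrix_benchmarking format_duration() has its own formatting rules:

```python
def format_duration(seconds):
    """Format a duration in seconds, tolerating missing data."""
    if seconds is None:
        return "N/A"          # missing data renders as N/A in reports
    if seconds < 60:
        return f"{seconds:.2f}s"
    minutes, secs = divmod(seconds, 60)
    return f"{int(minutes)}m {secs:.0f}s"
```

Returning a sentinel string instead of raising keeps a single missing sample from taking down the whole report.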
projects/container_bench/testing/config.yaml (1)

103-108: Consider adding image pull retry configuration.

The current configuration doesn't include retry logic for image pulls, which could fail due to transient network issues. Consider adding optional retry parameters.

projects/container_bench/visualizations/benchmark/plotting/metrics.py (1)

273-304: Verify the initial sort order.

Line 281 sorts the data by ["ts", "name"] with ascending=False (descending), but then lines 287 and 296 sort each group by "ts" in ascending order (default). The initial descending sort appears redundant or potentially incorrect, since the within-group ascending sort overrides it.

Consider whether line 281 should use ascending=True or if the descending sort serves a specific purpose not immediately apparent.

If ascending order is intended, apply this diff:

-    df = df.sort_values(by=["ts", "name"], ascending=False)
+    df = df.sort_values(by=["ts", "name"], ascending=True)

Otherwise, if the descending sort is intentional, please document its purpose to aid future maintainers.
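The redundancy is easy to see in miniature with plain lists instead of a DataFrame: whatever direction the initial sort uses, a per-group ascending re-sort produces the same result.

```python
rows = [(3, "a"), (1, "a"), (2, "b"), (5, "b")]

def group_and_sort(rows, initial_reverse):
    # initial whole-frame sort (analogous to df.sort_values(ascending=...))
    ordered_rows = sorted(rows, key=lambda r: r[0], reverse=initial_reverse)
    groups = {}
    for ts, name in ordered_rows:
        groups.setdefault(name, []).append(ts)
    # per-group ascending sort overrides whatever order came before
    return {name: sorted(ts_list) for name, ts_list in groups.items()}

descending_first = group_and_sort(rows, initial_reverse=True)
ascending_first = group_and_sort(rows, initial_reverse=False)
```

Both calls yield identical output, which is why the initial descending sort looks like dead weight.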

projects/container_bench/visualizations/benchmark/store/parsers.py (1)

90-110: Consider adding strict=True to zip() calls for safety.

In Python 3.10+, using strict=True ensures all iterables have the same length and helps catch data inconsistencies early. This is especially useful when processing parallel metric data.

Apply this diff to all zip() calls in the metric calculation section:

-    metric.cpu = [_calculate_usage_metric(cpu) for cpu in zip(*cpu_usages)]
+    metric.cpu = [_calculate_usage_metric(cpu) for cpu in zip(*cpu_usages, strict=True)]
     metric.execution_time_95th_percentile = _calculate_percentile(execution_times, 95)
     metric.execution_time_jitter = _calculate_jitter(execution_times)
-    metric.memory = [_calculate_usage_metric(memory) for memory in zip(*memory_usages)]
+    metric.memory = [_calculate_usage_metric(memory) for memory in zip(*memory_usages, strict=True)]
 
     network_send = [
-        _calculate_usage_metric(send) for send in zip(*network_send_usages)
+        _calculate_usage_metric(send) for send in zip(*network_send_usages, strict=True)
     ]
     network_recv = [
-        _calculate_usage_metric(recv) for recv in zip(*network_recv_usages)
+        _calculate_usage_metric(recv) for recv in zip(*network_recv_usages, strict=True)
     ]
     metric.network = dict(send=network_send, recv=network_recv)
 
     disk_read = [
-        _calculate_usage_metric(read) for read in zip(*disk_read_usages)
+        _calculate_usage_metric(read) for read in zip(*disk_read_usages, strict=True)
     ]
     disk_write = [
-        _calculate_usage_metric(write) for write in zip(*disk_write_usages)
+        _calculate_usage_metric(write) for write in zip(*disk_write_usages, strict=True)
     ]
projects/container_bench/toolbox/shared_tasks/pull_image.yml (3)

45-65: Consider simplifying tarfile existence check.

The current implementation works but is verbose with separate registration and normalization steps. Consider using a unified approach.

Alternative approach using set_fact with conditional:

- name: Check if image tarfile exists
  stat:
    path: "{{ image_tarfile }}"
  register: tarfile_stat
  when: ansible_os_family != 'Windows'

- name: Check if image tarfile exists (Windows)
  ansible.windows.win_stat:
    path: "{{ image_tarfile }}"
  register: tarfile_stat
  when: ansible_os_family == 'Windows'

- name: Set unified tarfile status
  set_fact:
    tarfile_exists: "{{ tarfile_stat.stat.exists | default(false) }}"

This reuses the same variable name and eliminates the OS-specific normalization tasks.


67-75: Document optional rootfull variable dependency.

The rootfull variable is used for privilege escalation but isn't listed in the required variables validation (lines 10-16). While default(omit) handles undefined cases, it would be clearer to document this as an optional variable in the header comments.

Update the usage comment at the top of the file:

 # Usage:
 # - include_tasks: "../../shared_tasks/pull_image.yml"
 #   vars:
 #     image_name: "quay.io/example/image:tag"
-# Requires: binary_path, additional_args, image_name variables
+# Requires: binary_path, image_name variables
+# Optional: additional_args, rootfull (for privilege escalation)
 # Note: Image names are sanitized by replacing all special characters with '_'

77-84: LGTM! Consider adding retry logic for registry pulls.

The pull logic correctly falls back to registry when tarfile is absent. Consider adding retry logic if network reliability is a concern.

Optional enhancement for resilience:

- name: Pull image from registry (Unix/Linux/macOS)
  become: "{{ rootfull | default(omit) }}"
  shell: "{{ binary_path }} {{ additional_args | default('') }} pull {{ image_name }}"
  when: ansible_os_family != 'Windows' and not tarfile_exists
  retries: 3
  delay: 5
  register: pull_result
  until: pull_result.rc == 0
projects/container_bench/visualizations/benchmark/plotting/comparison_report.py (1)

105-119: Consider using a more robust Linux detection method.

The Linux detection uses a simple substring check on OS_version. While this works, it could be more robust.

Consider using the existing detect_linux_system utility from shared.py:

+from .utils.shared import detect_linux_system
...
     # Filter out Client_version for Linux systems
     system_info = shared_info.get("system", {})
-    is_linux = "linux" in system_info.get("OS_version", "").lower()
+    is_linux = detect_linux_system(system_info)
     filtered_engine_mappings = ENGINE_INFO_FIELD_MAPPINGS.copy()

Note: This utility is available in shared.py at lines 232-233.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 538d3f6 and 55257a3.

📒 Files selected for processing (35)
  • projects/container_bench/testing/config.yaml (2 hunks)
  • projects/container_bench/testing/config_manager.py (3 hunks)
  • projects/container_bench/testing/container_engine.py (3 hunks)
  • projects/container_bench/testing/prepare.py (3 hunks)
  • projects/container_bench/testing/utils.py (1 hunks)
  • projects/container_bench/toolbox/container_bench_capture_container_engine_info/tasks/main.yaml (1 hunks)
  • projects/container_bench/toolbox/container_bench_capture_container_engine_info/tasks/unix.yml (1 hunks)
  • projects/container_bench/toolbox/container_bench_capture_container_engine_info/tasks/windows.yml (1 hunks)
  • projects/container_bench/toolbox/container_bench_capture_system_state/tasks/darwin.yml (1 hunks)
  • projects/container_bench/toolbox/container_bench_capture_system_state/tasks/linux.yml (1 hunks)
  • projects/container_bench/toolbox/container_bench_capture_system_state/tasks/windows.yml (1 hunks)
  • projects/container_bench/toolbox/container_bench_exec_benchmark/tasks/main.yml (1 hunks)
  • projects/container_bench/toolbox/container_bench_exec_benchmark/tasks/unix.yml (0 hunks)
  • projects/container_bench/toolbox/container_bench_exec_benchmark/tasks/windows.yml (0 hunks)
  • projects/container_bench/toolbox/container_bench_helloworld_benchmark/tasks/main.yml (1 hunks)
  • projects/container_bench/toolbox/container_bench_helloworld_benchmark/tasks/unix.yml (0 hunks)
  • projects/container_bench/toolbox/container_bench_helloworld_benchmark/tasks/windows.yml (0 hunks)
  • projects/container_bench/toolbox/container_bench_image_build_large_build_context_benchmark/tasks/main.yaml (1 hunks)
  • projects/container_bench/toolbox/container_bench_image_build_large_build_context_benchmark/tasks/unix.yml (0 hunks)
  • projects/container_bench/toolbox/container_bench_image_build_large_build_context_benchmark/tasks/windows.yml (0 hunks)
  • projects/container_bench/toolbox/shared_tasks/cleanup_container_resources.yml (1 hunks)
  • projects/container_bench/toolbox/shared_tasks/create_artifacts_dir.yml (1 hunks)
  • projects/container_bench/toolbox/shared_tasks/map_variables.yml (1 hunks)
  • projects/container_bench/toolbox/shared_tasks/pull_image.yml (1 hunks)
  • projects/container_bench/toolbox/shared_tasks/run_benchmark.yml (1 hunks)
  • projects/container_bench/toolbox/shared_tasks/unix_default_benchmark_body.yml (1 hunks)
  • projects/container_bench/toolbox/shared_tasks/windows_default_benchmark_body.yml (1 hunks)
  • projects/container_bench/visualizations/benchmark/plotting/comparison_report.py (4 hunks)
  • projects/container_bench/visualizations/benchmark/plotting/metrics.py (6 hunks)
  • projects/container_bench/visualizations/benchmark/plotting/report.py (2 hunks)
  • projects/container_bench/visualizations/benchmark/plotting/utils/config.py (2 hunks)
  • projects/container_bench/visualizations/benchmark/plotting/utils/metrics.py (0 hunks)
  • projects/container_bench/visualizations/benchmark/plotting/utils/shared.py (4 hunks)
  • projects/container_bench/visualizations/benchmark/store/parsers.py (3 hunks)
  • projects/matrix_benchmarking/visualizations/helpers/plotting/units.py (2 hunks)
💤 Files with no reviewable changes (7)
  • projects/container_bench/toolbox/container_bench_helloworld_benchmark/tasks/unix.yml
  • projects/container_bench/toolbox/container_bench_image_build_large_build_context_benchmark/tasks/unix.yml
  • projects/container_bench/toolbox/container_bench_exec_benchmark/tasks/unix.yml
  • projects/container_bench/toolbox/container_bench_helloworld_benchmark/tasks/windows.yml
  • projects/container_bench/toolbox/container_bench_image_build_large_build_context_benchmark/tasks/windows.yml
  • projects/container_bench/toolbox/container_bench_exec_benchmark/tasks/windows.yml
  • projects/container_bench/visualizations/benchmark/plotting/utils/metrics.py
🧰 Additional context used
🧬 Code graph analysis (7)
projects/container_bench/testing/utils.py (3)
projects/container_bench/testing/config_manager.py (2)
  • ConfigManager (80-253)
  • get_container_images_config (201-209)
projects/container_bench/testing/remote_access.py (2)
  • exists (300-304)
  • prepare (13-82)
projects/container_bench/testing/prepare.py (2)
  • prepare (92-110)
  • remove_remote_file (11-21)
projects/container_bench/visualizations/benchmark/plotting/report.py (1)
projects/container_bench/visualizations/benchmark/plotting/utils/shared.py (1)
  • create_engine_info_items (217-229)
projects/container_bench/testing/container_engine.py (2)
projects/container_bench/testing/config_manager.py (2)
  • ConfigManager (80-253)
  • get_container_images_config (201-209)
projects/container_bench/testing/remote_access.py (2)
  • exists (300-304)
  • run_with_ansible_ssh_conf (236-264)
projects/container_bench/testing/prepare.py (3)
projects/container_bench/testing/utils.py (1)
  • cleanup_container_images (99-106)
projects/container_bench/testing/remote_access.py (1)
  • prepare (13-82)
projects/container_bench/testing/container_engine.py (1)
  • store_container_images_as_tar (79-103)
projects/container_bench/visualizations/benchmark/plotting/metrics.py (1)
projects/container_bench/visualizations/benchmark/plotting/__init__.py (1)
  • register (6-9)
projects/container_bench/visualizations/benchmark/plotting/comparison_report.py (3)
projects/container_bench/visualizations/benchmark/plotting/utils/shared.py (4)
  • create_host_info_card (165-173)
  • create_engine_info_card (176-184)
  • create_table_header (65-67)
  • create_execution_time_content (87-109)
projects/container_bench/testing/config_manager.py (1)
  • is_linux (94-95)
projects/container_bench/visualizations/benchmark/plotting/utils/config.py (1)
  • generate_display_config_label (299-318)
projects/container_bench/visualizations/benchmark/plotting/utils/shared.py (2)
projects/matrix_benchmarking/visualizations/helpers/plotting/units.py (1)
  • format_duration (28-46)
projects/matrix_benchmarking/visualizations/helpers/plotting/html.py (1)
  • info_card (6-34)
🪛 Ruff (0.14.2)
projects/container_bench/visualizations/benchmark/store/parsers.py

Lines 90, 93, 96, 99, 104, 107, 108: zip() without an explicit strict= parameter; add an explicit value for parameter strict= (B905)
🔇 Additional comments (40)
projects/container_bench/testing/config.yaml (2)

159-159: Good default: cleanup disabled.

Setting container_images: false by default is a safe choice that prevents accidental deletion of pre-pulled images during development.


103-108: No cross-platform issue exists; the code already handles this correctly.

The config uses forward slashes in a string that is passed to pathlib.Path(), which is the correct and standard approach for cross-platform path handling. The pathlib module automatically converts path separators to the appropriate format for the host OS—forward slashes on Unix/Linux/macOS and backslashes on Windows. This pattern is consistently used throughout the codebase (e.g., utils.py:102, container_engine.py:84), ensuring proper cross-platform compatibility.

Likely an incorrect or invalid review comment.
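The pathlib behavior the comment relies on is easy to verify: a forward-slash string from config maps onto the host's separator automatically (PurePosixPath/PureWindowsPath used here so both renderings can be shown on any OS):

```python
from pathlib import PurePosixPath, PureWindowsPath

cfg_value = "images/cache"               # forward slashes, as written in config
posix = PurePosixPath(cfg_value)         # keeps '/' on Unix-likes
windows = PureWindowsPath(cfg_value)     # rendered with '\' on Windows
```

So a single forward-slash config string works unchanged across Linux, macOS, and Windows hosts.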

projects/container_bench/testing/utils.py (1)

99-107: LGTM! Consistent cleanup implementation.

The function follows the established pattern of other cleanup utilities in this file, uses proper path handling with Path, and includes appropriate logging.

projects/container_bench/testing/prepare.py (1)

52-54: LGTM! Proper cleanup integration.

The conditional cleanup for container images follows the same pattern as other cleanup operations in this function.

projects/container_bench/testing/config_manager.py (3)

60-62: LGTM! Config keys follow established conventions.

The new configuration keys for container images follow the same naming pattern as other configuration keys in this file.


200-209: LGTM! Accessor follows established pattern.

The get_container_images_config() method follows the same structure as other configuration accessors in this class.


223-224: LGTM! Cleanup config properly extended.

The container_images cleanup configuration is correctly integrated into the extended cleanup config dictionary.

projects/container_bench/testing/container_engine.py (2)

1-1: LGTM! Appropriate import for path handling.

The Path import from pathlib is correctly added to support cross-platform path operations.


130-139: All call sites verified as correct.

No remaining calls to rm_image exist in the codebase. The single call site at line 103 correctly invokes self.rm_images(images) with the plural parameter as expected. The new method definition at line 130 aligns with the call.

projects/container_bench/visualizations/benchmark/plotting/metrics.py (4)

1-1: LGTM!

The additional import and constants improve code organization and maintainability. The color constants with explicit RGBA values make the plotting code more readable.

Also applies to: 8-8, 33-41


73-74: LGTM!

Good refactoring to use the _to_mb_s helper function, eliminating code duplication and ensuring consistent conversion logic across network and disk metrics.

Also applies to: 98-99


232-271: Consistent refactoring for single metrics.

The refactoring follows the same pattern as _create_dual_metric_traces, properly separating trace creation and returning a list of traces. The same mixed-type concern from the dual metrics applies here (line 234).


173-231: The code is safe—mixed data types are not possible.

The concern about homogeneity is unfounded. The _to_mb_s function (lines 48–59) guarantees that all values returned are SimpleNamespace objects, regardless of input type. Every metric value in the dataframe columns (send, recv, read, write) is transformed through _to_mb_s before being added to the dataframe, ensuring all values are homogeneous SimpleNamespace objects with percentile_50th, percentile_75th, and percentile_25th attributes. The .any() check on line 189 will always evaluate to True, making the else branch (lines 217–228) unreachable dead code—though removing it is optional since it poses no safety risk.

Likely an incorrect or invalid review comment.

projects/container_bench/toolbox/container_bench_capture_container_engine_info/tasks/main.yaml (1)

2-4: Task rename LGTM

The updated task label now matches the command it prepares—looks good to me.

projects/container_bench/toolbox/shared_tasks/map_variables.yml (1)

8-13: Shared alias mapping looks solid

Centralizing the long→short variable aliases keeps the downstream tasks cleaner; no issues here.

projects/container_bench/toolbox/container_bench_capture_system_state/tasks/linux.yml (1)

2-3: Nice reuse of the shared artifacts task

Including the shared directory creator keeps the Linux flow aligned with the other platforms—looks good.

projects/container_bench/toolbox/container_bench_capture_container_engine_info/tasks/windows.yml (1)

2-3: Shared artifacts setup confirmed

Pulling in the shared artifacts creator on Windows keeps the workflow consistent—no concerns here.

projects/container_bench/toolbox/container_bench_capture_container_engine_info/tasks/unix.yml (1)

2-3: Consistent artifacts provisioning

Hooking into the shared artifacts task for Unix keeps setup DRY—looks good.

projects/container_bench/toolbox/shared_tasks/create_artifacts_dir.yml (1)

6-17: Shared artifacts helper looks good

The cross-platform directory creation with appropriate when guards is clean and idempotent—no issues spotted.

projects/container_bench/visualizations/benchmark/plotting/report.py (2)

50-59: Good call trimming client info on Linux

Skipping Client_version and suppressing provider info on Linux keeps the report focused on fields we actually collect—nice cleanup.


109-127: Plot gating matches the new percentile metric

Switching the plot threshold to execution_time_95th_percentile keeps the visuals in sync with the updated metrics pipeline—looks correct.

projects/container_bench/toolbox/container_bench_capture_system_state/tasks/darwin.yml (1)

2-3: LGTM!

The refactor to use a shared task for artifacts directory creation improves code reusability and maintainability.

projects/container_bench/toolbox/container_bench_image_build_large_build_context_benchmark/tasks/main.yaml (2)

2-9: LGTM!

The when: false pattern effectively documents required variables for IDE/linter support without impacting execution. This is a useful technique for maintainability.


27-69: LGTM!

The OS-specific cleanup and prepare commands are well-structured with appropriate platform-specific syntax (bash for Unix/Linux, PowerShell for Windows).

projects/container_bench/toolbox/shared_tasks/cleanup_container_resources.yml (2)

7-10: LGTM!

The validation correctly checks if binary_path is defined using when: binary_path is not defined.


12-101: LGTM!

The comprehensive cleanup logic properly handles both Unix/Linux and Windows platforms with appropriate error suppression (|| true, 2>/dev/null, 2>$null || exit 0) to ensure the playbook doesn't fail when there are no resources to clean.

projects/container_bench/toolbox/container_bench_capture_system_state/tasks/windows.yml (1)

2-3: LGTM!

Consistent with the darwin.yml refactor, this change improves code reusability by delegating to a shared artifacts directory creation task.

projects/container_bench/toolbox/shared_tasks/windows_default_benchmark_body.yml (1)

15-39: LGTM!

The benchmark orchestration flow (artifacts creation → pre-cleanup → optional image pull → prepare → benchmark → cleanup → post-cleanup) is well-structured and follows a logical sequence.

projects/container_bench/toolbox/shared_tasks/unix_default_benchmark_body.yml (1)

16-42: LGTM!

The Unix/Linux benchmark orchestration mirrors the Windows implementation with appropriate platform-specific commands and privilege elevation via the become directive.

projects/container_bench/visualizations/benchmark/store/parsers.py (1)

147-182: LGTM!

The new helper functions for calculating usage metrics, percentiles, and jitter are well-implemented with appropriate null/empty checks. The shift to percentile-based metrics (95th percentile, 75th/50th/25th percentiles) and jitter calculation provides more robust statistical analysis compared to simple averages.
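A minimal sketch of what such percentile and jitter helpers might look like. The interpolation method and the jitter definition (interquartile range here) are assumptions for illustration; the actual helpers in parsers.py may compute these differently.

```python
import math


def percentile(values, p):
    """Linear-interpolation percentile over a non-empty sequence."""
    ordered = sorted(values)
    k = (len(ordered) - 1) * p / 100
    lo, hi = math.floor(k), math.ceil(k)
    if lo == hi:
        return ordered[lo]
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (k - lo)


def jitter(values):
    """One plausible jitter definition: the interquartile range of the
    per-run execution times. Returns None for empty input."""
    if not values:
        return None
    return percentile(values, 75) - percentile(values, 25)
```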

projects/container_bench/toolbox/container_bench_exec_benchmark/tasks/main.yml (2)

2-9: Static assertion pattern for documentation.

The when: false assertion serves as documentation for required variables and can be caught by static analysis tools like ansible-lint. This pattern is acceptable for variable documentation.


11-13: No issues found — the referenced file exists and properly maps variables.

Verification confirms:

  • map_variables.yml exists at the correct path: projects/container_bench/toolbox/shared_tasks/map_variables.yml
  • It correctly maps all long variable names (container_bench_exec_benchmark_*) to short aliases (binary_path, additional_args, exec_time_path, rootfull)
  • All long variable names are defined in defaults/main/config.yml
  • The variable mapping chain works end-to-end as intended
projects/container_bench/toolbox/shared_tasks/pull_image.yml (2)

18-33: LGTM! OS-aware directory creation.

The directory creation properly handles both Unix-like systems and Windows with appropriate modules and permissions.


35-43: LGTM! Proper image name sanitization.

The regex sanitization correctly handles special characters in image names (e.g., registry.io/image:tag becomes registry_io_image_tag), making safe filenames for both Unix and Windows.
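The documented transformation (registry.io/image:tag → registry_io_image_tag) corresponds to replacing every non-alphanumeric character with an underscore. A Python equivalent, assuming the task's Jinja regex_replace uses a comparable pattern:

```python
import re


def sanitize_image_name(image):
    # Replace every character outside [A-Za-z0-9] with an underscore so the
    # result is a safe tarball filename on both Unix and Windows.
    return re.sub(r"[^A-Za-z0-9]", "_", image)
```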

projects/container_bench/visualizations/benchmark/plotting/comparison_report.py (3)

32-53: LGTM! Proper migration to 95th percentile metrics.

The performance delta calculation correctly uses execution_time_95th_percentile with appropriate filtering for zero/missing values. This provides more meaningful performance comparisons than raw execution time.


127-179: LGTM! Consistent migration to 95th percentile metrics throughout the table.

The performance comparison table correctly uses execution_time_95th_percentile and jitter throughout, maintaining internal consistency for sorting, fastest detection, and display.


224-262: LGTM! Plot gating uses consistent timing metrics.

The system monitoring plots are correctly gated on execution_time_95th_percentile, ensuring plots are only shown for benchmarks that run long enough for monitoring data to be meaningful.

projects/container_bench/visualizations/benchmark/plotting/utils/shared.py (3)

125-184: LGTM! Backward-compatible API improvements.

The info card functions now support customization via title and field_mappings parameters while maintaining backward compatibility with sensible defaults. The None guard in create_summary_info_card is a good defensive addition.


191-192: LGTM! Consistent metric usage.

The function correctly uses execution_time_95th_percentile to determine which benchmarks ran long enough for meaningful plots.


87-109: LGTM! Clear improvement in execution time display.

The updated function signature and implementation better represent statistical timing metrics with the 95th percentile and jitter. Verification confirms the single call site at comparison_report.py:158 uses the new signature correctly.

Honny1 force-pushed the cont-bench-pull-images-time-repr branch from 55257a3 to cb81cc6 on October 29, 2025 at 21:31

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

♻️ Duplicate comments (1)
projects/container_bench/toolbox/container_bench_helloworld_benchmark/tasks/main.yml (1)

15-21: Use the image_to_pull variable in the benchmark command.

The image_to_pull variable is defined (line 17) but not used in the benchmark_command (line 21), which hardcodes the image name. This creates a maintenance burden if the image needs to change and was previously flagged in review.

Apply this diff to reference the variable consistently:

     image_to_pull: "quay.io/podman/hello"
     prepare_commands: "" # Image is pulled using shared pull_image task below
     cleanup_commands: "" # Cleanup handled by shared tasks
     benchmark_command: |
-      {{ binary_path }} {{ additional_args }} run --rm quay.io/podman/hello
+      {{ binary_path }} {{ additional_args }} run --rm {{ image_to_pull }}
🧹 Nitpick comments (1)
projects/matrix_benchmarking/visualizations/helpers/plotting/units.py (1)

13-14: LGTM! Consider updating docstrings to reflect None handling.

The None guards are a good defensive addition for visualization code, ensuring graceful handling of missing data with consistent "N/A" formatting across both functions.

Consider updating the docstrings to document that None is now an acceptable input value for both functions.

For example:

 def human_readable_size(value):
     """Convert a byte value to a human-readable string.
 
     Args:
-        value (int): The byte value to convert.
+        value (int | None): The byte value to convert, or None for missing data.
 
     Returns:
-        str: The human-readable string representation of the byte value.
+        str: The human-readable string representation, or "N/A" if value is None.
     """
 def format_duration(seconds):
     """Convert a duration in seconds to a human-readable string.
 
     Args:
-        seconds (float): The duration in seconds.
+        seconds (float | None): The duration in seconds, or None for missing data.
 
     Returns:
-        str: The human-readable string representation of the duration.
+        str: The human-readable string representation, or "N/A" if seconds is None.
     """

Also applies to: 37-38
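The guard pattern under discussion can be sketched like this. Only the None → "N/A" branch mirrors the change under review; the thresholds and formatting below are illustrative, not the helper's actual output format.

```python
def format_duration(seconds):
    """Convert a duration in seconds to a human-readable string,
    returning "N/A" for missing data."""
    if seconds is None:
        return "N/A"
    if seconds < 60:
        return f"{seconds:.2f}s"
    minutes, secs = divmod(seconds, 60)
    return f"{int(minutes)}m {secs:.2f}s"
```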

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 55257a3 and cb81cc6.

📒 Files selected for processing (36)
  • projects/container_bench/testing/config.yaml (2 hunks)
  • projects/container_bench/testing/config_manager.py (3 hunks)
  • projects/container_bench/testing/container_engine.py (3 hunks)
  • projects/container_bench/testing/prepare.py (3 hunks)
  • projects/container_bench/testing/remote_access.py (1 hunks)
  • projects/container_bench/testing/utils.py (1 hunks)
  • projects/container_bench/toolbox/container_bench_capture_container_engine_info/tasks/main.yaml (1 hunks)
  • projects/container_bench/toolbox/container_bench_capture_container_engine_info/tasks/unix.yml (1 hunks)
  • projects/container_bench/toolbox/container_bench_capture_container_engine_info/tasks/windows.yml (1 hunks)
  • projects/container_bench/toolbox/container_bench_capture_system_state/tasks/darwin.yml (1 hunks)
  • projects/container_bench/toolbox/container_bench_capture_system_state/tasks/linux.yml (1 hunks)
  • projects/container_bench/toolbox/container_bench_capture_system_state/tasks/windows.yml (1 hunks)
  • projects/container_bench/toolbox/container_bench_exec_benchmark/tasks/main.yml (1 hunks)
  • projects/container_bench/toolbox/container_bench_exec_benchmark/tasks/unix.yml (0 hunks)
  • projects/container_bench/toolbox/container_bench_exec_benchmark/tasks/windows.yml (0 hunks)
  • projects/container_bench/toolbox/container_bench_helloworld_benchmark/tasks/main.yml (1 hunks)
  • projects/container_bench/toolbox/container_bench_helloworld_benchmark/tasks/unix.yml (0 hunks)
  • projects/container_bench/toolbox/container_bench_helloworld_benchmark/tasks/windows.yml (0 hunks)
  • projects/container_bench/toolbox/container_bench_image_build_large_build_context_benchmark/tasks/main.yaml (1 hunks)
  • projects/container_bench/toolbox/container_bench_image_build_large_build_context_benchmark/tasks/unix.yml (0 hunks)
  • projects/container_bench/toolbox/container_bench_image_build_large_build_context_benchmark/tasks/windows.yml (0 hunks)
  • projects/container_bench/toolbox/shared_tasks/cleanup_container_resources.yml (1 hunks)
  • projects/container_bench/toolbox/shared_tasks/create_artifacts_dir.yml (1 hunks)
  • projects/container_bench/toolbox/shared_tasks/map_variables.yml (1 hunks)
  • projects/container_bench/toolbox/shared_tasks/pull_image.yml (1 hunks)
  • projects/container_bench/toolbox/shared_tasks/run_benchmark.yml (1 hunks)
  • projects/container_bench/toolbox/shared_tasks/unix_default_benchmark_body.yml (1 hunks)
  • projects/container_bench/toolbox/shared_tasks/windows_default_benchmark_body.yml (1 hunks)
  • projects/container_bench/visualizations/benchmark/plotting/comparison_report.py (4 hunks)
  • projects/container_bench/visualizations/benchmark/plotting/metrics.py (6 hunks)
  • projects/container_bench/visualizations/benchmark/plotting/report.py (2 hunks)
  • projects/container_bench/visualizations/benchmark/plotting/utils/config.py (2 hunks)
  • projects/container_bench/visualizations/benchmark/plotting/utils/metrics.py (0 hunks)
  • projects/container_bench/visualizations/benchmark/plotting/utils/shared.py (4 hunks)
  • projects/container_bench/visualizations/benchmark/store/parsers.py (3 hunks)
  • projects/matrix_benchmarking/visualizations/helpers/plotting/units.py (2 hunks)
💤 Files with no reviewable changes (7)
  • projects/container_bench/toolbox/container_bench_exec_benchmark/tasks/unix.yml
  • projects/container_bench/toolbox/container_bench_image_build_large_build_context_benchmark/tasks/unix.yml
  • projects/container_bench/visualizations/benchmark/plotting/utils/metrics.py
  • projects/container_bench/toolbox/container_bench_helloworld_benchmark/tasks/unix.yml
  • projects/container_bench/toolbox/container_bench_helloworld_benchmark/tasks/windows.yml
  • projects/container_bench/toolbox/container_bench_exec_benchmark/tasks/windows.yml
  • projects/container_bench/toolbox/container_bench_image_build_large_build_context_benchmark/tasks/windows.yml
🚧 Files skipped from review as they are similar to previous changes (14)
  • projects/container_bench/toolbox/container_bench_capture_system_state/tasks/linux.yml
  • projects/container_bench/testing/config.yaml
  • projects/container_bench/toolbox/container_bench_capture_system_state/tasks/darwin.yml
  • projects/container_bench/toolbox/shared_tasks/unix_default_benchmark_body.yml
  • projects/container_bench/toolbox/container_bench_exec_benchmark/tasks/main.yml
  • projects/container_bench/visualizations/benchmark/plotting/report.py
  • projects/container_bench/toolbox/shared_tasks/pull_image.yml
  • projects/container_bench/testing/prepare.py
  • projects/container_bench/toolbox/shared_tasks/run_benchmark.yml
  • projects/container_bench/toolbox/shared_tasks/map_variables.yml
  • projects/container_bench/toolbox/shared_tasks/cleanup_container_resources.yml
  • projects/container_bench/testing/container_engine.py
  • projects/container_bench/toolbox/container_bench_capture_container_engine_info/tasks/main.yaml
  • projects/container_bench/toolbox/container_bench_capture_container_engine_info/tasks/windows.yml
🧰 Additional context used
🧬 Code graph analysis (5)
projects/container_bench/testing/utils.py (3)
projects/container_bench/testing/config_manager.py (2)
  • ConfigManager (80-253)
  • get_container_images_config (201-209)
projects/container_bench/testing/remote_access.py (2)
  • exists (300-304)
  • prepare (13-82)
projects/container_bench/testing/prepare.py (2)
  • prepare (92-110)
  • remove_remote_file (11-21)
projects/container_bench/testing/remote_access.py (3)
projects/remote/lib/remote_access.py (1)
  • mkdir (211-220)
projects/container_bench/testing/prepare.py (1)
  • prepare (92-110)
projects/container_bench/testing/config_manager.py (2)
  • ConfigManager (80-253)
  • is_windows (90-91)
projects/container_bench/visualizations/benchmark/plotting/comparison_report.py (2)
projects/container_bench/visualizations/benchmark/plotting/utils/shared.py (4)
  • create_host_info_card (165-173)
  • create_engine_info_card (176-184)
  • create_table_header (65-67)
  • create_execution_time_content (87-109)
projects/container_bench/visualizations/benchmark/plotting/utils/config.py (1)
  • generate_display_config_label (299-318)
projects/container_bench/visualizations/benchmark/plotting/utils/shared.py (2)
projects/matrix_benchmarking/visualizations/helpers/plotting/units.py (1)
  • format_duration (28-46)
projects/matrix_benchmarking/visualizations/helpers/plotting/html.py (1)
  • info_card (6-34)
projects/container_bench/visualizations/benchmark/plotting/metrics.py (1)
projects/container_bench/visualizations/benchmark/plotting/__init__.py (1)
  • register (6-9)
🪛 Ruff (0.14.2)
projects/container_bench/visualizations/benchmark/store/parsers.py

90-90: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


93-93: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


96-96: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


99-99: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


104-104: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


107-107: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)

🔇 Additional comments (16)
projects/container_bench/toolbox/shared_tasks/create_artifacts_dir.yml (1)

1-17: Well-structured shared task for cross-platform artifact directory creation.

The task appropriately isolates platform-specific path handling and module choice. The when conditions correctly route to Unix and Windows implementations.

projects/container_bench/toolbox/container_bench_capture_container_engine_info/tasks/unix.yml (1)

2-3: Good refactoring to shared task.

Consolidating artifact directory creation into a shared task improves maintainability and reduces duplication.

projects/container_bench/toolbox/shared_tasks/windows_default_benchmark_body.yml (2)

6-13: Validation logic correctly fixed.

The condition uses vars[item] is not defined to properly check for variable existence by name, resolving the past review comment. The loop correctly validates all three required variables.


15-39: Well-orchestrated benchmark lifecycle.

The task progression through artifact creation, cleanup, optional image pull, preparation, benchmark execution, and final cleanup follows a logical and maintainable pattern via shared task includes.

projects/container_bench/toolbox/container_bench_helloworld_benchmark/tasks/main.yml (1)

2-9: Good addition of validation assertion.

The assert task with when: false provides type-checking documentation without blocking execution, which is a cleaner pattern than runtime validation.

projects/container_bench/toolbox/container_bench_capture_system_state/tasks/windows.yml (1)

2-3: Good consolidation of artifact directory creation.

This change aligns with the shared-task pattern introduced elsewhere and eliminates duplication across platform-specific capture tasks.

projects/container_bench/toolbox/container_bench_image_build_large_build_context_benchmark/tasks/main.yaml (4)

2-9: Consistent validation pattern.

The assert task with when: false matches the approach in other benchmark files and provides good type-checking documentation.


15-21: Platform-aware path construction is well-handled.

The conditional expression for build_context_path correctly routes to OS-appropriate paths and will be used consistently in subsequent commands.


23-25: Good practice: benchmark_command uses variable substitution.

Unlike the helloworld benchmark, this correctly references {{ binary_path }} and {{ additional_args }} variables rather than hardcoding values.


42-69: Platform-specific prepare and cleanup commands are well-structured.

Both Unix/Linux and Windows implementations generate equivalent 10×1GB random files using appropriate OS-specific tools (dd vs. fsutil). The Dockerfile content and build context paths are correctly parameterized.

projects/container_bench/visualizations/benchmark/plotting/metrics.py (2)

48-60: Past issue resolved!

The previous concern about setting percentile_75th=0 and percentile_25th=0 for raw numeric inputs has been correctly fixed. All three percentiles now equal val for single-value inputs, ensuring no incorrect jitter visualization.


286-299: Grouping and sorting logic is correct.

The approach of grouping by name and sorting by timestamp within each group ensures proper trace creation and chronological ordering. This correctly handles multiple series in the same plot.

projects/container_bench/visualizations/benchmark/plotting/utils/shared.py (4)

87-109: API change is clear and consistent.

The function signature and implementation correctly reflect the shift to 95th-percentile metrics with jitter display. The formatting is clear and the conditional fastest indicator works as expected.


156-157: Good defensive guard added.

Returning None when info is falsy prevents downstream errors and makes the function more robust.


165-184: Enhanced flexibility with optional field mappings.

The addition of optional field_mappings parameters with sensible defaults improves reusability without breaking existing callers. The default mappings are appropriately scoped as module-level constants.


134-146: Accurate labeling for 95th-percentile metric.

The label "Execution Time (95th Percentile)" accurately reflects the metric being displayed and aligns with the PR objectives. The jitter display provides useful context about variability.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (1)
projects/container_bench/testing/platform_builders.py (1)

57-57: Consider whether escaping environment variable keys is necessary.

While escaping the key using escape_powershell_single_quote(k) is defensive, PowerShell environment variable names typically consist of alphanumeric characters and underscores. Escaping keys is harmless but may be unnecessary unless you expect special characters in variable names.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cb81cc6 and 714bbdd.

📒 Files selected for processing (6)
  • projects/container_bench/testing/platform_builders.py (4 hunks)
  • projects/container_bench/testing/prepare.py (4 hunks)
  • projects/container_bench/testing/remote_access.py (4 hunks)
  • projects/container_bench/testing/test_container_bench.py (1 hunks)
  • projects/container_bench/toolbox/shared_tasks/cleanup_container_resources.yml (1 hunks)
  • projects/container_bench/toolbox/shared_tasks/pull_image.yml (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • projects/container_bench/toolbox/shared_tasks/cleanup_container_resources.yml
  • projects/container_bench/testing/prepare.py
  • projects/container_bench/toolbox/shared_tasks/pull_image.yml
🧰 Additional context used
🧬 Code graph analysis (2)
projects/container_bench/testing/test_container_bench.py (1)
projects/container_bench/testing/config_manager.py (1)
  • get_test_config (128-143)
projects/container_bench/testing/remote_access.py (4)
projects/container_bench/testing/platform_builders.py (27)
  • PlatformFactory (188-194)
  • create_command_builder (190-194)
  • build_env_command (8-9)
  • build_env_command (49-58)
  • build_env_command (139-144)
  • build_env_command (197-199)
  • build_chdir_command (16-17)
  • build_chdir_command (101-104)
  • build_chdir_command (149-152)
  • build_entrypoint_script (36-37)
  • build_entrypoint_script (121-132)
  • build_entrypoint_script (167-182)
  • get_shell_command (32-33)
  • get_shell_command (118-119)
  • get_shell_command (164-165)
  • build_exists_command (28-29)
  • build_exists_command (115-116)
  • build_exists_command (161-162)
  • check_exists_result (40-41)
  • check_exists_result (134-135)
  • check_exists_result (184-185)
  • build_mkdir_command (24-25)
  • build_mkdir_command (112-113)
  • build_mkdir_command (158-159)
  • build_rm_command (20-21)
  • build_rm_command (106-110)
  • build_rm_command (154-156)
projects/container_bench/testing/config_manager.py (2)
  • ConfigManager (80-253)
  • is_windows (90-91)
projects/container_bench/testing/utils.py (1)
  • get_tmp_fd (16-28)
projects/remote/lib/remote_access.py (1)
  • mkdir (211-220)
🪛 Ruff (0.14.2)
projects/container_bench/testing/platform_builders.py

146-146: Unused method argument: service_name

(ARG002)


146-146: Unused method argument: binary_path

(ARG002)

🔇 Additional comments (10)
projects/container_bench/testing/platform_builders.py (6)

3-3: LGTM!

The shlex import is appropriately added to support secure command construction in the Unix command builder implementations.


15-41: LGTM!

The new abstract methods provide a comprehensive cross-platform command construction interface covering directory operations, file removal, existence checks, and shell selection.


44-45: LGTM!

The PowerShell single-quote escaping implementation is correct—PowerShell escapes single quotes by doubling them.


101-135: LGTM!

The Windows command builder implementations are correct:

  • build_chdir_command: Properly uses Set-Location with escaped paths
  • build_rm_command: Correctly uses Remove-Item with appropriate flags
  • build_mkdir_command: Uses New-Item with -Force for idempotency
  • build_exists_command: Uses Test-Path which returns "True" or "False"
  • get_shell_command: Returns powershell.exe -Command -
  • build_entrypoint_script: Sets $ErrorActionPreference = "Stop" for fail-fast behavior
  • check_exists_result: Correctly parses the "True"/"False" stdout from Test-Path

146-147: Unused method arguments are acceptable for interface compliance.

The static analysis tool flags service_name and binary_path as unused. This is expected—the Unix implementation doesn't require the complex service-start logic needed on Windows (lines 60-99). The parameters are part of the abstract interface and must be present for polymorphism.


149-185: LGTM!

The Unix command builder implementations are correct:

  • build_chdir_command: Uses cd with proper quoting
  • build_rm_command: Uses rm -f or rm -rf with shlex.quote for safe paths
  • build_mkdir_command: Uses mkdir -p with shlex.quote
  • build_exists_command: Uses test -e with shlex.quote
  • get_shell_command: Returns "bash"
  • build_entrypoint_script: Sets proper bash options (pipefail, errexit, nounset, errtrace)
  • check_exists_result: Correctly checks return code 0 for test -e
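The shlex-based quoting pattern praised above can be sketched as follows; function names and flags are modeled on the review summary, not copied from platform_builders.py.

```python
import shlex


def build_rm_command(path, recursive=False):
    # shlex.quote only adds quotes when the path contains characters
    # that the shell would otherwise interpret.
    flag = "-rf" if recursive else "-f"
    return f"rm {flag} {shlex.quote(path)}"


def build_mkdir_command(path):
    return f"mkdir -p {shlex.quote(path)}"
```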
projects/container_bench/testing/remote_access.py (4)

10-10: LGTM!

The PlatformFactory import enables the builder-based command construction pattern used throughout this refactor.


85-142: LGTM!

The refactored run_with_ansible_ssh_conf correctly:

  • Uses PlatformFactory.create_command_builder() for platform-agnostic command construction
  • Constructs environment, chdir, and entrypoint scripts via the builder
  • Handles local execution with platform-specific logging (lines 105-107)
  • Uses builder.get_shell_command() to select the appropriate shell (line 138)

The builder-based approach cleanly abstracts Windows/Unix differences.


145-159: LGTM!

The refactored exists function correctly:

  • Uses the builder's build_exists_command to generate platform-specific existence checks
  • Delegates result interpretation to builder.check_exists_result, which handles Windows's "True"/"False" stdout vs. Unix's return code 0

162-174: LGTM!

The new create_remote_directory function correctly:

  • Handles local execution with path.mkdir(parents=True, exist_ok=True)
  • Uses the builder's build_mkdir_command for remote execution
  • Ensures idempotent directory creation on both platforms

Honny1 force-pushed the cont-bench-pull-images-time-repr branch from 714bbdd to e65fb3b on October 31, 2025 at 19:21

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
projects/container_bench/testing/remote_access.py (1)

105-142: Local runs ignore extra_env and chdir.

When remote_host.run_locally is true we bypass the builder logic, so commands run without the requested environment variables or working-directory change. That diverges from remote execution semantics and breaks callers that rely on those options (e.g., commands expecting HOME overrides). Please mirror the remote behavior by applying extra_env and chdir locally (e.g., set run_kwargs["env"]/run_kwargs["cwd"] or reuse the builder). (docs.python.org)
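A minimal sketch of the suggested fix, assuming subprocess is used for local execution; the function name and parameters mirror the review's wording, not the project's actual run helper.

```python
import os
import subprocess


def run_locally(command, extra_env=None, chdir=None):
    # Mirror the remote semantics locally: merge extra_env over the
    # current environment and honor the requested working directory.
    env = {**os.environ, **(extra_env or {})}
    return subprocess.run(
        command, shell=True, env=env, cwd=chdir,
        capture_output=True, text=True, check=False,
    )
```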

♻️ Duplicate comments (1)
projects/container_bench/visualizations/benchmark/plotting/comparison_report.py (1)

105-117: Duplicate: Linux filtering logic.

This Client_version filtering for Linux systems duplicates the logic in report.py (lines 50-58). See the earlier comment for a refactoring suggestion to extract this to a shared helper.

🧹 Nitpick comments (6)
projects/container_bench/visualizations/benchmark/plotting/metrics.py (3)

189-192: Good fix: Using .all() prevents mixed-type issues.

Changing from .any() (in previous versions) to .all() correctly ensures that percentile bounds are only extracted when all values are SimpleNamespace objects. This addresses the previous review concern about incorrect fill areas.

However, the isinstance checks within the lambdas (lines 190-192) are now redundant. Since .all() guarantees every value is a SimpleNamespace, the else branches can never execute.

Consider simplifying the lambdas to remove redundant checks:

 if df[key].apply(lambda v: isinstance(v, types.SimpleNamespace)).all():
-    values = df[key].apply(lambda v: v.percentile_50th if isinstance(v, types.SimpleNamespace) else v)
-    upper = df[key].apply(lambda v: v.percentile_75th if isinstance(v, types.SimpleNamespace) else 0)
-    lower = df[key].apply(lambda v: v.percentile_25th if isinstance(v, types.SimpleNamespace) else 0)
+    values = df[key].apply(lambda v: v.percentile_50th)
+    upper = df[key].apply(lambda v: v.percentile_75th)
+    lower = df[key].apply(lambda v: v.percentile_25th)

234-237: Same redundancy in single metric traces.

The .all() check is correct, but the isinstance checks in the lambdas (lines 236-237) are redundant for the same reason as in the dual metric traces.

Apply the same simplification here:

 if df["usage"].apply(lambda v: isinstance(v, types.SimpleNamespace)).all():
-    values = df["usage"].apply(lambda v: v.percentile_50th if isinstance(v, types.SimpleNamespace) else v)
-    upper = df["usage"].apply(lambda v: v.percentile_75th if isinstance(v, types.SimpleNamespace) else 0)
-    lower = df["usage"].apply(lambda v: v.percentile_25th if isinstance(v, types.SimpleNamespace) else 0)
+    values = df["usage"].apply(lambda v: v.percentile_50th)
+    upper = df["usage"].apply(lambda v: v.percentile_75th)
+    lower = df["usage"].apply(lambda v: v.percentile_25th)

273-304: LGTM: Plotting logic correctly handles grouped time series.

The method properly:

  • Groups data by name to handle multiple series
  • Sorts each group chronologically before creating traces
  • Adds percentile range fills and median lines for percentile data
  • Falls back to simple line traces for raw data

The grouping and per-group sorting ensures each series is plotted with correct time ordering.

Minor optimization: Line 281 sorts the entire DataFrame by ["ts", "name"] descending, but lines 287 and 296 re-sort each group by ["ts"] ascending anyway, so the initial sort is redundant and can be removed:

-df = df.sort_values(by=["ts", "name"], ascending=False)
 fig = go.Figure()
projects/container_bench/visualizations/benchmark/plotting/report.py (1)

50-58: Consider extracting Linux filtering logic to a shared helper.

The same Client_version filtering logic for Linux systems appears in both this file and comparison_report.py (lines 105-110). While the duplication is minor, extracting it to a helper in shared.py would improve maintainability.

Example helper:

def filter_engine_mappings_for_linux(system_info):
    """Filter out Client_version for Linux systems."""
    is_linux = detect_linux_system(system_info)
    filtered = ENGINE_INFO_FIELD_MAPPINGS.copy()
    if is_linux:
        filtered = {k: v for k, v in ENGINE_INFO_FIELD_MAPPINGS.items() if k != "Client_version"}
    return filtered, is_linux
projects/container_bench/visualizations/benchmark/store/parsers.py (1)

90-109: Consider adding strict=True to zip() calls for data integrity.

The zip() calls that combine usage metrics from multiple runs (CPU, memory, network, disk) do not specify strict=True. If different runs produce time-series data of varying lengths (e.g., due to early termination or inconsistent sampling), zip() will silently truncate to the shortest sequence, potentially hiding data quality issues and producing misleading aggregate metrics.

If your project targets Python 3.10+, apply this pattern to all zip() calls:

-    metric.cpu = [_calculate_usage_metric(cpu) for cpu in zip(*cpu_usages)]
+    metric.cpu = [_calculate_usage_metric(cpu) for cpu in zip(*cpu_usages, strict=True)]

Repeat for lines 93, 96, 99, 104, and 107. This will raise a ValueError if runs have mismatched data lengths, making the issue explicit rather than silently computing incorrect percentiles.

projects/container_bench/visualizations/benchmark/plotting/comparison_report.py (1)

148-159: Consider preserving None semantics for missing jitter data.

Line 150 converts None jitter values to 0, but this conflates "no jitter data" with "zero jitter." Since format_duration already handles None gracefully (returning "N/A"), preserving the None value would more accurately represent missing data.

Apply this diff to preserve None semantics:

-        jitter = config.get("jitter", 0)
-        jitter = jitter if jitter is not None else 0
+        jitter = config.get("jitter")

Then create_execution_time_content will receive None, and format_duration(None) will return "N/A" for display, which is semantically more accurate than displaying "0" for missing data.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 714bbdd and e65fb3b.

📒 Files selected for processing (38)
  • projects/container_bench/testing/config.yaml (2 hunks)
  • projects/container_bench/testing/config_manager.py (3 hunks)
  • projects/container_bench/testing/container_engine.py (3 hunks)
  • projects/container_bench/testing/platform_builders.py (4 hunks)
  • projects/container_bench/testing/prepare.py (4 hunks)
  • projects/container_bench/testing/remote_access.py (4 hunks)
  • projects/container_bench/testing/test_container_bench.py (1 hunks)
  • projects/container_bench/testing/utils.py (1 hunks)
  • projects/container_bench/toolbox/container_bench_capture_container_engine_info/tasks/main.yaml (1 hunks)
  • projects/container_bench/toolbox/container_bench_capture_container_engine_info/tasks/unix.yml (1 hunks)
  • projects/container_bench/toolbox/container_bench_capture_container_engine_info/tasks/windows.yml (1 hunks)
  • projects/container_bench/toolbox/container_bench_capture_system_state/tasks/darwin.yml (1 hunks)
  • projects/container_bench/toolbox/container_bench_capture_system_state/tasks/linux.yml (1 hunks)
  • projects/container_bench/toolbox/container_bench_capture_system_state/tasks/windows.yml (1 hunks)
  • projects/container_bench/toolbox/container_bench_exec_benchmark/tasks/main.yml (1 hunks)
  • projects/container_bench/toolbox/container_bench_exec_benchmark/tasks/unix.yml (0 hunks)
  • projects/container_bench/toolbox/container_bench_exec_benchmark/tasks/windows.yml (0 hunks)
  • projects/container_bench/toolbox/container_bench_helloworld_benchmark/tasks/main.yml (1 hunks)
  • projects/container_bench/toolbox/container_bench_helloworld_benchmark/tasks/unix.yml (0 hunks)
  • projects/container_bench/toolbox/container_bench_helloworld_benchmark/tasks/windows.yml (0 hunks)
  • projects/container_bench/toolbox/container_bench_image_build_large_build_context_benchmark/tasks/main.yaml (1 hunks)
  • projects/container_bench/toolbox/container_bench_image_build_large_build_context_benchmark/tasks/unix.yml (0 hunks)
  • projects/container_bench/toolbox/container_bench_image_build_large_build_context_benchmark/tasks/windows.yml (0 hunks)
  • projects/container_bench/toolbox/shared_tasks/cleanup_container_resources.yml (1 hunks)
  • projects/container_bench/toolbox/shared_tasks/create_artifacts_dir.yml (1 hunks)
  • projects/container_bench/toolbox/shared_tasks/map_variables.yml (1 hunks)
  • projects/container_bench/toolbox/shared_tasks/pull_image.yml (1 hunks)
  • projects/container_bench/toolbox/shared_tasks/run_benchmark.yml (1 hunks)
  • projects/container_bench/toolbox/shared_tasks/unix_default_benchmark_body.yml (1 hunks)
  • projects/container_bench/toolbox/shared_tasks/windows_default_benchmark_body.yml (1 hunks)
  • projects/container_bench/visualizations/benchmark/plotting/comparison_report.py (4 hunks)
  • projects/container_bench/visualizations/benchmark/plotting/metrics.py (6 hunks)
  • projects/container_bench/visualizations/benchmark/plotting/report.py (2 hunks)
  • projects/container_bench/visualizations/benchmark/plotting/utils/config.py (2 hunks)
  • projects/container_bench/visualizations/benchmark/plotting/utils/metrics.py (0 hunks)
  • projects/container_bench/visualizations/benchmark/plotting/utils/shared.py (4 hunks)
  • projects/container_bench/visualizations/benchmark/store/parsers.py (3 hunks)
  • projects/matrix_benchmarking/visualizations/helpers/plotting/units.py (2 hunks)
💤 Files with no reviewable changes (7)
  • projects/container_bench/toolbox/container_bench_image_build_large_build_context_benchmark/tasks/windows.yml
  • projects/container_bench/toolbox/container_bench_exec_benchmark/tasks/windows.yml
  • projects/container_bench/toolbox/container_bench_image_build_large_build_context_benchmark/tasks/unix.yml
  • projects/container_bench/toolbox/container_bench_helloworld_benchmark/tasks/windows.yml
  • projects/container_bench/visualizations/benchmark/plotting/utils/metrics.py
  • projects/container_bench/toolbox/container_bench_exec_benchmark/tasks/unix.yml
  • projects/container_bench/toolbox/container_bench_helloworld_benchmark/tasks/unix.yml
🚧 Files skipped from review as they are similar to previous changes (19)
  • projects/matrix_benchmarking/visualizations/helpers/plotting/units.py
  • projects/container_bench/toolbox/shared_tasks/pull_image.yml
  • projects/container_bench/toolbox/container_bench_exec_benchmark/tasks/main.yml
  • projects/container_bench/toolbox/shared_tasks/windows_default_benchmark_body.yml
  • projects/container_bench/toolbox/container_bench_capture_container_engine_info/tasks/main.yaml
  • projects/container_bench/testing/container_engine.py
  • projects/container_bench/toolbox/shared_tasks/run_benchmark.yml
  • projects/container_bench/testing/config_manager.py
  • projects/container_bench/toolbox/shared_tasks/unix_default_benchmark_body.yml
  • projects/container_bench/testing/test_container_bench.py
  • projects/container_bench/toolbox/shared_tasks/create_artifacts_dir.yml
  • projects/container_bench/toolbox/shared_tasks/map_variables.yml
  • projects/container_bench/testing/config.yaml
  • projects/container_bench/testing/prepare.py
  • projects/container_bench/toolbox/shared_tasks/cleanup_container_resources.yml
  • projects/container_bench/toolbox/container_bench_capture_system_state/tasks/darwin.yml
  • projects/container_bench/toolbox/container_bench_helloworld_benchmark/tasks/main.yml
  • projects/container_bench/testing/utils.py
  • projects/container_bench/toolbox/container_bench_capture_container_engine_info/tasks/unix.yml
🧰 Additional context used
🧬 Code graph analysis (5)
projects/container_bench/visualizations/benchmark/plotting/report.py (1)
projects/container_bench/visualizations/benchmark/plotting/utils/shared.py (1)
  • create_engine_info_items (217-229)
projects/container_bench/visualizations/benchmark/plotting/comparison_report.py (2)
projects/container_bench/visualizations/benchmark/plotting/utils/shared.py (4)
  • create_host_info_card (165-173)
  • create_engine_info_card (176-184)
  • create_table_header (65-67)
  • create_execution_time_content (87-109)
projects/container_bench/visualizations/benchmark/plotting/utils/config.py (1)
  • generate_display_config_label (299-318)
projects/container_bench/visualizations/benchmark/plotting/metrics.py (3)
projects/container_bench/visualizations/benchmark/plotting/comparison_report.py (1)
  • register (56-57)
projects/container_bench/visualizations/benchmark/plotting/report.py (1)
  • register (23-24)
projects/container_bench/visualizations/benchmark/plotting/__init__.py (1)
  • register (6-9)
projects/container_bench/testing/remote_access.py (4)
projects/container_bench/testing/config_manager.py (2)
  • ConfigManager (80-253)
  • is_windows (90-91)
projects/container_bench/testing/platform_builders.py (27)
  • PlatformFactory (188-194)
  • create_command_builder (190-194)
  • build_env_command (8-9)
  • build_env_command (49-58)
  • build_env_command (139-144)
  • build_env_command (197-199)
  • build_chdir_command (16-17)
  • build_chdir_command (101-104)
  • build_chdir_command (149-152)
  • build_entrypoint_script (36-37)
  • build_entrypoint_script (121-132)
  • build_entrypoint_script (167-182)
  • get_shell_command (32-33)
  • get_shell_command (118-119)
  • get_shell_command (164-165)
  • build_exists_command (28-29)
  • build_exists_command (115-116)
  • build_exists_command (161-162)
  • check_exists_result (40-41)
  • check_exists_result (134-135)
  • check_exists_result (184-185)
  • build_mkdir_command (24-25)
  • build_mkdir_command (112-113)
  • build_mkdir_command (158-159)
  • build_rm_command (20-21)
  • build_rm_command (106-110)
  • build_rm_command (154-156)
projects/container_bench/testing/utils.py (1)
  • get_tmp_fd (16-28)
projects/remote/lib/remote_access.py (1)
  • mkdir (211-220)
projects/container_bench/visualizations/benchmark/plotting/utils/shared.py (2)
projects/matrix_benchmarking/visualizations/helpers/plotting/units.py (1)
  • format_duration (28-46)
projects/matrix_benchmarking/visualizations/helpers/plotting/html.py (1)
  • info_card (6-34)
🪛 Ruff (0.14.2)
projects/container_bench/visualizations/benchmark/store/parsers.py

90-90: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


93-93: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


96-96: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


99-99: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


104-104: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


107-107: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)

projects/container_bench/testing/platform_builders.py

146-146: Unused method argument: service_name

(ARG002)


146-146: Unused method argument: binary_path

(ARG002)

🔇 Additional comments (20)
projects/container_bench/toolbox/container_bench_capture_system_state/tasks/linux.yml (2)

5-77: Verify directory creation occurs before shell task execution.

The shell task starting at line 5 immediately attempts to write to {{ artifact_extra_logs_dir }}/artifacts/system_info.txt without explicitly checking that the directory exists. Ensure the included shared task (lines 2-3) successfully creates this directory structure before the shell task executes.


2-3: No issues found—include path and shared task are correctly configured.

The shared task file exists and creates the {{ artifact_extra_logs_dir }}/artifacts directory as expected by subsequent tasks. YAML syntax is valid, and the task properly handles both Unix/Linux/macOS and Windows platforms.

projects/container_bench/toolbox/container_bench_capture_system_state/tasks/windows.yml (1)

2-3: No issues found—the implementation is correct.

The verification confirms:

  1. The shared task file exists and correctly creates {{ artifact_extra_logs_dir }}\artifacts on Windows via ansible.windows.win_file
  2. Downstream code in lines 6–46 consistently uses paths under {{ artifact_extra_logs_dir }}\artifacts
  3. The variable artifact_extra_logs_dir is documented as a required external input (see shared_tasks/run_benchmark.yml line 5) and is properly inherited by included tasks—a standard Ansible pattern
  4. Path structure is consistent across all related task files (run_benchmark.yml, create_artifacts_dir.yml, capture_container_engine_info/*)

The refactoring consolidates artifact directory creation correctly without breaking variable scope or path handling.

projects/container_bench/toolbox/container_bench_image_build_large_build_context_benchmark/tasks/main.yaml (3)

71-77: Shared benchmark body templates verified — files exist at referenced paths.

Confirmed that both unix_default_benchmark_body.yml and windows_default_benchmark_body.yml exist at projects/container_bench/toolbox/shared_tasks/, correctly matching the relative paths (../../shared_tasks/) from the current file. The includes are properly configured.


11-13: No issues found; the variable mapping task is correctly configured and produces the expected aliases.

The verification confirms that:

  • map_variables.yml exists at projects/container_bench/toolbox/shared_tasks/map_variables.yml with proper structure
  • The included task correctly creates the aliases: binary_path, additional_args, exec_time_path, and rootfull using the provided role_name parameter
  • The aliases are properly used in the main task file (lines 25, 30, 37)

The code is correct and requires no changes.


2-9: Assertion with when: false prevents validation, but appears to be intentional design in this codebase.

The assertion task never runs due to when: false. This pattern appears consistently across three related container_bench benchmark files (exec, helloworld, and this large_build_context benchmark), each with the same comment about validating variable definitions without runtime enforcement. If the goal is documentation only, the assertion serves no functional purpose. If runtime validation is intended, remove when: false:

 - name: Validate required variables are defined
   assert:
     that:
       - container_bench_image_build_large_build_context_benchmark_exec_time_path is defined
       - container_bench_image_build_large_build_context_benchmark_binary_path is defined
       - container_bench_image_build_large_build_context_benchmark_rootfull is defined
       - container_bench_image_build_large_build_context_benchmark_additional_args is defined
-  when: false  # This task never runs but validates variable definitions

Confirm whether this is intended as documentation or if enabling the assertion was the goal.

projects/container_bench/toolbox/container_bench_capture_container_engine_info/tasks/windows.yml (1)

2-3: The review comment's concern is based on incorrect assumptions about variable definition.

The shared task create_artifacts_dir.yml correctly uses artifact_extra_logs_dir as an external prerequisite variable (not something it should define). This is consistent with the project's architectural pattern where artifact_extra_logs_dir is provided by the calling playbook. The shared task path resolves correctly, and the refactoring to consolidate directory creation is appropriate.

Likely an incorrect or invalid review comment.

projects/container_bench/visualizations/benchmark/plotting/metrics.py (3)

1-40: LGTM: Clean constant definitions and imports.

The module-level constants (BYTES_TO_MEGABYTES, color definitions, METRIC_TYPES) are well-organized and the rgba color values with appropriate alpha channels for fills are correctly specified.


48-60: Good fix: Raw numeric values now have consistent percentiles.

The helper correctly handles both SimpleNamespace (with percentile data) and raw numeric values. Setting all three percentiles to the same value (line 59) for raw numeric inputs ensures no jitter range is displayed, which is the correct behavior for single-value measurements.

This addresses the previous review concern about incorrect jitter visualization.


62-151: LGTM: Data processing logic is sound.

The three processing functions correctly handle their respective metric types:

  • Network and disk metrics are consistently transformed through _to_mb_s, ensuring all values are SimpleNamespace objects
  • Single metrics (CPU, power, memory) preserve the raw structure from the data source

The orchestration in generate_usage_data appropriately delegates to the correct processor based on metric type.

projects/container_bench/visualizations/benchmark/plotting/report.py (1)

109-109: LGTM! Metric migration is correct.

The switch from exec_time to execution_time_95th_percentile aligns with the PR's migration to percentile-based metrics and is consistent with changes across the visualization layer.

projects/container_bench/visualizations/benchmark/plotting/utils/config.py (2)

12-16: LGTM! Configuration exclusions are appropriate.

Adding test.podman.repo_version and test.docker.repo_version to the exclusion set is reasonable for filtering platform-specific implementation details from configuration labels.


173-174: LGTM! Metric extraction updated correctly.

The switch to execution_time_95th_percentile and addition of jitter align with the parser's new metric calculations and the broader migration to percentile-based reporting.

projects/container_bench/visualizations/benchmark/store/parsers.py (2)

146-175: LGTM! Percentile calculation is correct.

The _calculate_percentile function correctly implements the linear interpolation method for percentile calculation, which is a standard approach. The logic handles both integer and fractional ranks appropriately, and edge cases (None, empty data) are properly addressed.
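For reference, the linear-interpolation (closest-ranks) method described above can be sketched as below; the function name and shape are illustrative, not the actual parsers.py implementation.

```python
def percentile(data, p):
    """Return the p-th percentile of data via linear interpolation
    between the two closest ranks; None for missing/empty input."""
    if not data:
        return None
    values = sorted(data)
    rank = (len(values) - 1) * p / 100.0
    lower = int(rank)
    frac = rank - lower
    if frac == 0:
        return values[lower]
    # Interpolate between the bracketing values.
    return values[lower] + frac * (values[lower + 1] - values[lower])

assert percentile([], 95) is None
assert percentile([1.0, 2.0, 3.0, 4.0], 50) == 2.5
assert abs(percentile([1.0, 2.0, 3.0, 4.0, 5.0], 95) - 4.8) < 1e-9
```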


177-181: LGTM! Jitter calculation is appropriate.

The _calculate_jitter function correctly computes execution time variability as the average of absolute consecutive differences. The None return for insufficient data is properly handled by downstream consumers (format_duration returns "N/A" for None, and comparison_report.py line 150 provides a defensive zero default).
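The metric described above (mean absolute difference between consecutive execution times, None for insufficient data) can be sketched as follows; the name is illustrative, not the actual parsers.py code.

```python
def jitter(times):
    """Average absolute difference between consecutive samples."""
    if times is None or len(times) < 2:
        return None  # not enough data to compute variability
    diffs = [abs(b - a) for a, b in zip(times, times[1:])]
    return sum(diffs) / len(diffs)

assert jitter([10.0]) is None
assert jitter([10.0, 12.0, 11.0]) == 1.5  # (|2| + |-1|) / 2
```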

projects/container_bench/visualizations/benchmark/plotting/comparison_report.py (2)

33-37: LGTM! Delta calculation updated correctly.

The performance delta calculation properly migrates to execution_time_95th_percentile and correctly filters out invalid (zero or negative) values before computing the delta.


131-144: LGTM! Performance comparison table updated correctly.

The table construction properly migrates to execution_time_95th_percentile for sorting, fastest-time detection, and header labeling. The logic is consistent and correct.

projects/container_bench/visualizations/benchmark/plotting/utils/shared.py (3)

87-109: LGTM! Execution time content helper updated correctly.

The signature change from (exec_time, runs) to (exec_time_95th_percentile, jitter) is a breaking change, but all callers in the codebase have been updated consistently. The new display format with explicit jitter information is clearer and more informative.


165-184: LGTM! Optional parameters improve API flexibility.

Making title and field_mappings optional with sensible defaults improves the API ergonomics. The changes are used consistently across the codebase, and the default values are appropriate.


134-146: LGTM! Summary items correctly reflect new metrics.

The migration to execution_time_95th_percentile and jitter in summary items is correct, and the updated label "Execution Time (95th Percentile)" clearly communicates the metric to users.

@Honny1 force-pushed the cont-bench-pull-images-time-repr branch from e65fb3b to f2da82e on October 31, 2025 19:48

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (5)
projects/container_bench/testing/container_engine.py (1)

79-103: Address remaining issues from previous review and add error handling.

While the platform-specific mkdir concern has been resolved by using remote_access.create_remote_directory(), two critical issues from the previous review remain unaddressed:

  1. Line 96: Image filename sanitization is still incomplete and can cause issues:

    • Doesn't handle @ characters in digest references (e.g., image@sha256:...)
    • Can produce filename collisions, since /, :, and . all map to _ (e.g., registry.io/app:v1.0 → registry_io_app_v1_0.tar)
    • Doesn't sanitize all special characters that may be invalid in filenames
  2. Lines 99-102: No error handling for pull/save operations. If these commands fail, the function continues silently, potentially leaving the system in an inconsistent state.

  3. Line 103: The return value of rm_images() is not checked. If image removal fails, there's no indication to the caller.

Consider applying the sanitization approach suggested in the previous review (hash-based with readable prefix) and adding proper error handling:

+        import hashlib
+        
         for image in images:
             logging.info(f"Pulling container image: {image} ...")
-            image_filename = image.replace("/", "_").replace(":", "_").replace(".", "_") + ".tar"
+            # Use hash to avoid collisions and handle all special characters
+            image_hash = hashlib.sha256(image.encode()).hexdigest()[:12]
+            # Keep a readable prefix
+            image_prefix = image.split("/")[-1].split(":")[0][:20]
+            safe_prefix = "".join(c if c.isalnum() else "_" for c in image_prefix)
+            image_filename = f"{safe_prefix}_{image_hash}.tar"
+            
             if remote_access.exists(dest / image_filename):
                 continue
+                
             cmd = f"{self.get_command()} pull {image}"
-            remote_access.run_with_ansible_ssh_conf(self.base_work_dir, cmd)
+            try:
+                remote_access.run_with_ansible_ssh_conf(self.base_work_dir, cmd)
+            except Exception as e:
+                logging.error(f"Failed to pull image {image}: {e}")
+                raise
+                
             cmd = f"{self.get_command()} save -o {dest / image_filename} {image}"
-            remote_access.run_with_ansible_ssh_conf(self.base_work_dir, cmd)
-        self.rm_images(images)
+            try:
+                remote_access.run_with_ansible_ssh_conf(self.base_work_dir, cmd)
+            except Exception as e:
+                logging.error(f"Failed to save image {image}: {e}")
+                raise
+                
+        if not self.rm_images(images):
+            logging.warning(f"Failed to remove images after saving: {images}")
projects/container_bench/toolbox/container_bench_exec_benchmark/tasks/main.yml (2)

2-9: Clarify the purpose of the assert with when: false.

Same as the helloworld benchmark: this assertion will never execute at runtime. If this is intended as documentation or static validation for IDEs/linters, consider adding a comment or using comments instead of this task.


17-24: Use image_to_pull variable in prepare commands.

The image_to_pull variable is defined on line 17 but prepare_commands on line 19 hardcodes the image. This creates inconsistency and maintenance burden.

Apply this diff to reference the variable consistently:

     prepare_commands: |
-      {{ binary_path }} {{ additional_args }} run --name benchmark_exec -d registry.fedoraproject.org/fedora:latest sleep infinity
+      {{ binary_path }} {{ additional_args }} run --name benchmark_exec -d {{ image_to_pull }} sleep infinity
projects/container_bench/toolbox/container_bench_helloworld_benchmark/tasks/main.yml (1)

17-21: Use image_to_pull variable in benchmark command.

The image_to_pull variable is defined on line 17 but the benchmark_command on line 21 hardcodes the image name. This creates maintenance burden and inconsistency.

Apply this diff to reference the variable consistently:

-      {{ binary_path }} {{ additional_args }} run --rm quay.io/podman/hello
+      {{ binary_path }} {{ additional_args }} run --rm {{ image_to_pull }}
projects/container_bench/testing/platform_builders.py (1)

149-152: Stop double-quoting the Unix chdir path

shlex.quote() already returns a fully shell-escaped token; wrapping it in additional single quotes turns /tmp/foo bar into cd ''/tmp/foo bar'', which the shell parses as two arguments, so chdir fails on any path needing quoting. Drop the outer quotes and use the quoted token directly.

-        return f"cd '{shlex.quote(str(chdir))}'"
+        return f"cd {shlex.quote(str(chdir))}"
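The breakage is easy to demonstrate with a path containing a space:

```python
import shlex

path = "/tmp/foo bar"

# Double-quoted form from the original code: the outer single quotes
# cancel the quoting shlex.quote() already applied.
broken = f"cd '{shlex.quote(path)}'"
assert broken == "cd ''/tmp/foo bar''"
assert shlex.split(broken) == ["cd", "/tmp/foo", "bar"]  # path split in two

# Fixed form: shlex.quote() alone yields one properly quoted token.
fixed = f"cd {shlex.quote(path)}"
assert shlex.split(fixed) == ["cd", "/tmp/foo bar"]
```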
🧹 Nitpick comments (1)
projects/container_bench/toolbox/container_bench_helloworld_benchmark/tasks/main.yml (1)

2-9: Clarify the purpose of the assert with when: false.

The assertion task will never execute at runtime (due to when: false), so it serves no runtime validation purpose. If this is intended as documentation/static validation for the IDE or linter, consider adding a comment explaining this pattern. Otherwise, this task should be removed.

If this task is meant to be documentation only, consider using comments instead:

-- name: Validate required variables are defined
-  assert:
-    that:
-      - container_bench_helloworld_benchmark_exec_time_path is defined
-      - container_bench_helloworld_benchmark_binary_path is defined
-      - container_bench_helloworld_benchmark_rootfull is defined
-      - container_bench_helloworld_benchmark_additional_args is defined
-  when: false  # This task never runs but validates variable definitions
+# Requires: container_bench_helloworld_benchmark_exec_time_path,
+#           container_bench_helloworld_benchmark_binary_path,
+#           container_bench_helloworld_benchmark_rootfull,
+#           container_bench_helloworld_benchmark_additional_args

Alternatively, remove this task if there's no enforcement need.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e65fb3b and f2da82e.

📒 Files selected for processing (30)
  • projects/container_bench/testing/config.yaml (2 hunks)
  • projects/container_bench/testing/config_manager.py (3 hunks)
  • projects/container_bench/testing/container_engine.py (3 hunks)
  • projects/container_bench/testing/platform_builders.py (4 hunks)
  • projects/container_bench/testing/prepare.py (4 hunks)
  • projects/container_bench/testing/remote_access.py (4 hunks)
  • projects/container_bench/testing/test_container_bench.py (1 hunks)
  • projects/container_bench/testing/utils.py (1 hunks)
  • projects/container_bench/toolbox/container_bench_capture_container_engine_info/tasks/main.yaml (1 hunks)
  • projects/container_bench/toolbox/container_bench_capture_container_engine_info/tasks/unix.yml (1 hunks)
  • projects/container_bench/toolbox/container_bench_capture_container_engine_info/tasks/windows.yml (1 hunks)
  • projects/container_bench/toolbox/container_bench_capture_system_state/tasks/darwin.yml (1 hunks)
  • projects/container_bench/toolbox/container_bench_capture_system_state/tasks/linux.yml (1 hunks)
  • projects/container_bench/toolbox/container_bench_capture_system_state/tasks/windows.yml (1 hunks)
  • projects/container_bench/toolbox/container_bench_exec_benchmark/tasks/main.yml (1 hunks)
  • projects/container_bench/toolbox/container_bench_exec_benchmark/tasks/unix.yml (0 hunks)
  • projects/container_bench/toolbox/container_bench_exec_benchmark/tasks/windows.yml (0 hunks)
  • projects/container_bench/toolbox/container_bench_helloworld_benchmark/tasks/main.yml (1 hunks)
  • projects/container_bench/toolbox/container_bench_helloworld_benchmark/tasks/unix.yml (0 hunks)
  • projects/container_bench/toolbox/container_bench_helloworld_benchmark/tasks/windows.yml (0 hunks)
  • projects/container_bench/toolbox/container_bench_image_build_large_build_context_benchmark/tasks/main.yaml (1 hunks)
  • projects/container_bench/toolbox/container_bench_image_build_large_build_context_benchmark/tasks/unix.yml (0 hunks)
  • projects/container_bench/toolbox/container_bench_image_build_large_build_context_benchmark/tasks/windows.yml (0 hunks)
  • projects/container_bench/toolbox/shared_tasks/cleanup_container_resources.yml (1 hunks)
  • projects/container_bench/toolbox/shared_tasks/create_artifacts_dir.yml (1 hunks)
  • projects/container_bench/toolbox/shared_tasks/map_variables.yml (1 hunks)
  • projects/container_bench/toolbox/shared_tasks/pull_image.yml (1 hunks)
  • projects/container_bench/toolbox/shared_tasks/run_benchmark.yml (1 hunks)
  • projects/container_bench/toolbox/shared_tasks/unix_default_benchmark_body.yml (1 hunks)
  • projects/container_bench/toolbox/shared_tasks/windows_default_benchmark_body.yml (1 hunks)
💤 Files with no reviewable changes (6)
  • projects/container_bench/toolbox/container_bench_image_build_large_build_context_benchmark/tasks/windows.yml
  • projects/container_bench/toolbox/container_bench_exec_benchmark/tasks/windows.yml
  • projects/container_bench/toolbox/container_bench_image_build_large_build_context_benchmark/tasks/unix.yml
  • projects/container_bench/toolbox/container_bench_exec_benchmark/tasks/unix.yml
  • projects/container_bench/toolbox/container_bench_helloworld_benchmark/tasks/windows.yml
  • projects/container_bench/toolbox/container_bench_helloworld_benchmark/tasks/unix.yml
🚧 Files skipped from review as they are similar to previous changes (10)
  • projects/container_bench/toolbox/container_bench_capture_system_state/tasks/linux.yml
  • projects/container_bench/toolbox/shared_tasks/cleanup_container_resources.yml
  • projects/container_bench/toolbox/container_bench_capture_system_state/tasks/darwin.yml
  • projects/container_bench/toolbox/shared_tasks/unix_default_benchmark_body.yml
  • projects/container_bench/testing/prepare.py
  • projects/container_bench/toolbox/shared_tasks/windows_default_benchmark_body.yml
  • projects/container_bench/testing/config_manager.py
  • projects/container_bench/testing/utils.py
  • projects/container_bench/testing/test_container_bench.py
  • projects/container_bench/toolbox/container_bench_capture_container_engine_info/tasks/windows.yml
🧰 Additional context used
🧬 Code graph analysis (2)
projects/container_bench/testing/container_engine.py (2)
projects/container_bench/testing/config_manager.py (2)
  • ConfigManager (80-253)
  • get_container_images_config (201-209)
projects/container_bench/testing/remote_access.py (3)
  • exists (146-160)
  • create_remote_directory (163-175)
  • run_with_ansible_ssh_conf (86-143)
projects/container_bench/testing/remote_access.py (4)
projects/container_bench/testing/config_manager.py (2)
  • ConfigManager (80-253)
  • is_windows (90-91)
projects/container_bench/testing/platform_builders.py (27)
  • PlatformFactory (188-194)
  • create_command_builder (190-194)
  • build_env_command (8-9)
  • build_env_command (49-58)
  • build_env_command (139-144)
  • build_env_command (197-199)
  • build_chdir_command (16-17)
  • build_chdir_command (101-104)
  • build_chdir_command (149-152)
  • build_entrypoint_script (36-37)
  • build_entrypoint_script (121-132)
  • build_entrypoint_script (167-182)
  • get_shell_command (32-33)
  • get_shell_command (118-119)
  • get_shell_command (164-165)
  • build_exists_command (28-29)
  • build_exists_command (115-116)
  • build_exists_command (161-162)
  • check_exists_result (40-41)
  • check_exists_result (134-135)
  • check_exists_result (184-185)
  • build_mkdir_command (24-25)
  • build_mkdir_command (112-113)
  • build_mkdir_command (158-159)
  • build_rm_command (20-21)
  • build_rm_command (106-110)
  • build_rm_command (154-156)
projects/container_bench/testing/utils.py (1)
  • get_tmp_fd (16-28)
projects/remote/lib/remote_access.py (1)
  • mkdir (211-220)
🪛 Ruff (0.14.2)
projects/container_bench/testing/platform_builders.py

146-146: Unused method argument: service_name

(ARG002)


146-146: Unused method argument: binary_path

(ARG002)

🔇 Additional comments (15)
projects/container_bench/testing/config.yaml (2)

103-108: LGTM! Configuration aligns with the new image preparation flow.

The container_images configuration block is well-structured and provides sensible defaults for image pre-pulling functionality.


159-159: LGTM! Sensible default for image caching.

Defaulting to false allows images to be reused across test runs, improving performance while still providing the option to clean up when needed.

projects/container_bench/testing/container_engine.py (2)

1-1: LGTM! Import is necessary for the new path operations.

The Path import is properly utilized in the store_container_images_as_tar method.


130-139: LGTM! Sensible refactor for batch image removal.

Renaming from rm_image to rm_images and accepting a list enables more efficient batch operations by executing a single command instead of multiple sequential calls.

projects/container_bench/toolbox/shared_tasks/run_benchmark.yml (1)

16-33: Verify platform-specific path handling and command execution.

The Unix shell command uses {{ exec_time_path | dirname }}/.venv/bin/python to locate the Python interpreter, while Windows uses {{ exec_time_path | dirname }}\.venv\Scripts\python.exe. Ensure the venv structure is consistent across both platforms before execution. Additionally, verify that artifact_extra_logs_dir is a valid base path on both platforms, as it's used to construct output file paths.

projects/container_bench/toolbox/container_bench_capture_container_engine_info/tasks/unix.yml (1)

2-3: Refactoring looks good.

Properly consolidates artifact directory creation into the shared task, improving maintainability.

projects/container_bench/toolbox/container_bench_capture_container_engine_info/tasks/main.yaml (1)

2-4: LGTM.

The rename improves task naming clarity without changing functionality.

projects/container_bench/toolbox/shared_tasks/create_artifacts_dir.yml (1)

6-17: LGTM.

Clean platform-aware implementation with appropriate permission modes for Unix-like systems and proper Windows path handling.

projects/container_bench/toolbox/shared_tasks/pull_image.yml (2)

36-44: LGTM.

The image name sanitization using regex_replace('[^a-zA-Z0-9_-]', '_') is appropriate for creating valid filenames while preserving readability. The OS-specific path separator handling is correct.
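For reference, Jinja2's `regex_replace` filter is backed by Python's `re.sub`, so the sanitization can be checked standalone (the helper name here is illustrative, not part of the PR):

```python
import re

def sanitize_image_name(image_name: str) -> str:
    # Same pattern as regex_replace('[^a-zA-Z0-9_-]', '_'):
    # every character outside [a-zA-Z0-9_-] becomes an underscore.
    return re.sub(r"[^a-zA-Z0-9_-]", "_", image_name)

print(sanitize_image_name("quay.io/example/image:tag"))  # quay_io_example_image_tag
```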


58-66: Code is correct; original concern is not applicable.

The stat and fact-setting tasks have properly matched when conditions that are mutually exclusive and exhaustive. On Unix systems, tarfile_stat_unix is registered and used; on Windows, tarfile_stat_windows is registered and used. No undefined variable errors are possible. The | default(false) filters provide additional safety but are not required for correctness.

Likely an incorrect or invalid review comment.

projects/container_bench/toolbox/shared_tasks/map_variables.yml (1)

8-13: Consider using Jinja2's dict.get() for safer variable mapping with fallback defaults.

The current implementation relies on dynamic variable lookup via vars[role_name + '_suffix']. While source variables are defined in defaults/main/config.yml, Ansible's set_fact won't error if a source variable is undefined—the alias will silently become undefined. This could cause failures downstream when the undefined variable is used in templates.

Instead of relying on implicit defaults, use Jinja2's dict.get() method to provide explicit fallbacks:

binary_path: "{{ vars.get(role_name + '_binary_path', '/usr/bin/docker') }}"
additional_args: "{{ vars.get(role_name + '_additional_args', '') }}"
exec_time_path: "{{ vars.get(role_name + '_exec_time_path', '/tmp/exec_time') }}"
rootfull: "{{ vars.get(role_name + '_rootfull', false) }}"

Note: The validation task at lines 3–9 (with when: false) acknowledges this concern but is intentionally disabled; enabling it or using dict.get() would provide explicit error handling.

projects/container_bench/toolbox/container_bench_image_build_large_build_context_benchmark/tasks/main.yaml (4)

42-53: Verify Jinja2 loop formatting in prepare commands.

A past review comment flagged a critical newline issue in the Jinja2 loop (lines 50–52). The current code shows the loop properly terminated with {% endfor %} on a separate line. Confirm that when rendered, each dd command appears on its own line with proper shell syntax.

To verify: run the playbook in debug mode or add a debug task that registers and prints prepare_commands to confirm the rendered output has each command on a separate line.


71-77: All shared benchmark body tasks exist and are properly configured.

Both unix_default_benchmark_body.yml and windows_default_benchmark_body.yml exist and contain explicit validation tasks that confirm the presence of prepare_commands, benchmark_command, and cleanup_commands variables before execution. Both files properly consume these variables in their respective shell execution tasks. The code at lines 71–77 is correctly structured and compatible.


23-25: Variables are correctly defined before use—no action needed.

The include_tasks statement on line 12 executes before line 23, and binary_path and additional_args are properly set by map_variables.yml using role_name-based variable lookup. The assert task on lines 3–9 confirms the underlying variables (container_bench_image_build_large_build_context_benchmark_binary_path and container_bench_image_build_large_build_context_benchmark_additional_args) are defined before the include runs.


2-13: Variable mapping is correct and produces expected aliases.

The validation asserts (lines 5–8) verify the four long-form variables are defined. The map_variables.yml include (lines 11–13) dynamically constructs short aliases by concatenating role_name + '_binary_path', role_name + '_additional_args', role_name + '_exec_time_path', and role_name + '_rootfull'. With role_name: "container_bench_image_build_large_build_context_benchmark", this produces exactly the aliases used on lines 25, 30, and 37. All four validated variables are correctly mapped to their shorter forms.

@Honny1 force-pushed the cont-bench-pull-images-time-repr branch from f2da82e to dcabc6a on November 3, 2025 13:34
@coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (3)
projects/container_bench/testing/prepare.py (2)

156-156: Add error handling for image preparation.

Same issue as with Docker (line 112): the store_container_images_as_tar() call lacks error handling, which could allow failures to go unnoticed or leave the system inconsistent.

     podman = ContainerEngine("podman")
-    podman.store_container_images_as_tar()
+    try:
+        podman.store_container_images_as_tar()
+    except Exception as e:
+        logging.error(f"Error storing Podman container images: {e}")
+        raise
     podman.cleanup()

112-112: Add error handling for image preparation.

The store_container_images_as_tar() call lacks error handling. If image pulling, saving, or removal fails, the cleanup on line 113 will still proceed, potentially leaving the system in an inconsistent state. Wrapping this in a try/except would allow you to log the failure and decide whether to continue or abort.

     docker = ContainerEngine("docker")
-    docker.store_container_images_as_tar()
+    try:
+        docker.store_container_images_as_tar()
+    except Exception as e:
+        logging.error(f"Error storing Docker container images: {e}")
+        raise
     docker.cleanup()
projects/container_bench/testing/utils.py (1)

99-108: Validate that images_dir is a non-empty string.

The code checks "dir" not in container_images_config but doesn't validate that images_dir is a non-empty string. If images_dir is None, an empty string, or whitespace-only, constructing Path(images_dir) could lead to unexpected behavior or errors.

     container_images_config = ConfigManager.get_container_images_config()
     if "dir" not in container_images_config:
         return
     images_dir = container_images_config['dir']
+    if not images_dir or (isinstance(images_dir, str) and not images_dir.strip()):
+        logging.info("Skipping container images cleanup: directory path is empty.")
+        return
     dest = Path(images_dir)
🧹 Nitpick comments (2)
projects/container_bench/toolbox/shared_tasks/cleanup_container_resources.yml (1)

12-96: Consider refactoring to eliminate task duplication via loops or parameterization.

The file repeats the same cleanup operations (stop, remove containers/images/volumes, system prune) across Unix/Linux and Windows variants, totaling 12 nearly identical tasks. Each operation also duplicates variable expansion multiple times within a single command.

This pattern:

  • Increases maintenance burden: a command change requires updates in 12+ places
  • Obscures the core cleanup logic behind repetition
  • Makes it harder to test and document the actual cleanup strategy

Consider consolidating using Ansible loops (loop) or dynamic task inclusion to apply a single operation definition across variants and OS families.

Example refactor direction:

- name: "{{ item.description }} ({{ ansible_os_family }})"
  shell: "..." # or win_shell for Windows
  loop:
    - { cmd: "stop", description: "Stop all running containers" }
    - { cmd: "rm", description: "Remove all containers" }
    - { cmd: "rmi", description: "Remove all images" }
  # ... when guard based on OS
projects/container_bench/testing/platform_builders.py (1)

57-57: Optional: Escaping environment variable keys may be unnecessary.

Environment variable names are typically alphanumeric with underscores. Applying escape_powershell_single_quote(k) to the key is likely unnecessary since keys are controlled by the code rather than user input. You may simplify to just use k directly.

-            env_commands.append(f"$env:{escape_powershell_single_quote(k)}='{escape_powershell_single_quote(v)}'")
+            env_commands.append(f"$env:{k}='{escape_powershell_single_quote(v)}'")
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f2da82e and dcabc6a.

📒 Files selected for processing (9)
  • projects/container_bench/testing/config.yaml (2 hunks)
  • projects/container_bench/testing/platform_builders.py (4 hunks)
  • projects/container_bench/testing/prepare.py (4 hunks)
  • projects/container_bench/testing/remote_access.py (4 hunks)
  • projects/container_bench/testing/test_container_bench.py (1 hunks)
  • projects/container_bench/testing/utils.py (1 hunks)
  • projects/container_bench/toolbox/shared_tasks/cleanup_container_resources.yml (1 hunks)
  • projects/container_bench/toolbox/shared_tasks/pull_image.yml (1 hunks)
  • projects/container_bench/visualizations/benchmark/plotting/utils/config.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • projects/container_bench/toolbox/shared_tasks/pull_image.yml
🧰 Additional context used
🧬 Code graph analysis (4)
projects/container_bench/testing/utils.py (2)
projects/container_bench/testing/config_manager.py (1)
  • get_container_images_config (201-209)
projects/container_bench/testing/remote_access.py (3)
  • exists (146-160)
  • prepare (14-83)
  • remove_remote_file (178-190)
projects/container_bench/testing/test_container_bench.py (1)
projects/container_bench/testing/config_manager.py (1)
  • get_test_config (128-143)
projects/container_bench/testing/prepare.py (3)
projects/container_bench/testing/remote_access.py (3)
  • remove_remote_file (178-190)
  • prepare (14-83)
  • exists (146-160)
projects/container_bench/testing/utils.py (1)
  • cleanup_container_images (99-108)
projects/container_bench/testing/container_engine.py (1)
  • store_container_images_as_tar (79-103)
projects/container_bench/testing/remote_access.py (4)
projects/container_bench/testing/config_manager.py (2)
  • ConfigManager (80-253)
  • is_windows (90-91)
projects/container_bench/testing/platform_builders.py (27)
  • PlatformFactory (188-194)
  • create_command_builder (190-194)
  • build_env_command (8-9)
  • build_env_command (49-58)
  • build_env_command (139-144)
  • build_env_command (197-199)
  • build_chdir_command (16-17)
  • build_chdir_command (101-104)
  • build_chdir_command (149-152)
  • build_entrypoint_script (36-37)
  • build_entrypoint_script (121-132)
  • build_entrypoint_script (167-182)
  • get_shell_command (32-33)
  • get_shell_command (118-119)
  • get_shell_command (164-165)
  • build_exists_command (28-29)
  • build_exists_command (115-116)
  • build_exists_command (161-162)
  • check_exists_result (40-41)
  • check_exists_result (134-135)
  • check_exists_result (184-185)
  • build_mkdir_command (24-25)
  • build_mkdir_command (112-113)
  • build_mkdir_command (158-159)
  • build_rm_command (20-21)
  • build_rm_command (106-110)
  • build_rm_command (154-156)
projects/container_bench/testing/utils.py (1)
  • get_tmp_fd (16-28)
projects/remote/lib/remote_access.py (1)
  • mkdir (211-220)
🪛 Ruff (0.14.2)
projects/container_bench/testing/platform_builders.py

146-146: Unused method argument: service_name

(ARG002)


146-146: Unused method argument: binary_path

(ARG002)

🔇 Additional comments (10)
projects/container_bench/visualizations/benchmark/plotting/utils/config.py (2)

14-14: LGTM: Configuration exclusion added.

Adding "test.podman.machine_provider" to the excluded keys is consistent with the existing pattern of excluding test-specific configuration keys from labels.


173-174: Metrics migration is complete and safe.

The verification confirms that execution_time_95th_percentile and execution_time_jitter are always assigned in parsers.py before being accessed in config.py. All downstream visualization code defensively uses .get() with default values, and no stale field references remain. The 95th percentile is a statistically sound choice for benchmarking metrics.
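The two metrics can be sketched in plain Python; the nearest-rank percentile and the use of sample standard deviation for jitter are assumptions here, since parsers.py may compute them differently:

```python
import statistics

def summarize_execution_times(samples):
    """Return (95th-percentile, jitter) for a list of timings in seconds."""
    ordered = sorted(samples)
    # Nearest-rank 95th percentile: smallest value >= 95% of the samples.
    rank = max(1, round(0.95 * len(ordered)))
    p95 = ordered[rank - 1]
    # Jitter modeled as the sample standard deviation (an assumption;
    # the PR's execution_time_jitter may be defined differently).
    jitter = statistics.stdev(ordered) if len(ordered) > 1 else 0.0
    return p95, jitter
```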

projects/container_bench/toolbox/shared_tasks/cleanup_container_resources.yml (1)

7-10: Variable validation is well-structured.

The fail task correctly validates the required binary_path variable upfront, preventing silent failures downstream.

projects/container_bench/testing/config.yaml (1)

103-108: LGTM!

The container images configuration is well-structured with pull control, directory path, and image list. This provides flexibility for pre-pulling and caching images.

projects/container_bench/testing/test_container_bench.py (1)

88-93: LGTM!

The platform skip logic is correctly implemented. It checks for key existence before removal and logs only when a platform is actually skipped.

projects/container_bench/testing/prepare.py (1)

38-40: LGTM!

The container images cleanup integration is correctly implemented, checking the configuration flag before invoking the cleanup utility.

projects/container_bench/testing/remote_access.py (4)

86-143: LGTM!

The unified run_with_ansible_ssh_conf function effectively consolidates Windows and Unix remote execution using the builder pattern. The local execution path correctly branches based on platform, and the builder-based command construction provides clean cross-platform abstractions.


146-160: LGTM!

The exists function cleanly abstracts local vs. remote existence checks using the platform-aware builder pattern. The use of check_exists_result to interpret platform-specific return values is elegant.


163-175: LGTM!

The create_remote_directory function properly handles both local and remote directory creation with appropriate platform-specific command building.


178-190: LGTM!

The remove_remote_file function correctly implements both local (using shutil/pathlib) and remote (using builder-generated commands) file/directory removal with proper recursive handling.

def build_chdir_command(self, chdir):
    if chdir is None:
        return "cd $HOME"
    return f"cd '{shlex.quote(str(chdir))}'"
⚠️ Potential issue | 🟠 Major

Remove redundant single quotes around shlex.quote.

shlex.quote() already adds quotes when necessary. Wrapping the result in additional single quotes creates confusing double-quoting that, while it may work in simple cases, produces hard-to-read commands and can behave unexpectedly with complex paths. Other Unix commands in this file (lines 156, 159, 162) correctly use shlex.quote() without extra quotes.

-        return f"cd '{shlex.quote(str(chdir))}'"
+        return f"cd {shlex.quote(str(chdir))}"
🤖 Prompt for AI Agents
In projects/container_bench/testing/platform_builders.py around line 152, the
return value wraps shlex.quote(str(chdir)) in extra single quotes causing
redundant/double quoting; remove the outer single quotes so the function returns
cd followed by the shlex.quote(...) result (matching the style used on lines
156, 159, 162) to produce correct, consistent shell-safe paths.
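A quick standalone check of the double-quoting behavior:

```python
import shlex

chdir = "my dir"
# shlex.quote adds quotes only when needed:
safe = f"cd {shlex.quote(chdir)}"        # cd 'my dir'  -> one argument
# Wrapping the result in extra quotes yields adjacent empty strings
# that the shell re-splits into two words:
broken = f"cd '{shlex.quote(chdir)}'"    # cd ''my dir''
```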

@Honny1 force-pushed the cont-bench-pull-images-time-repr branch from dcabc6a to b3d1b97 on November 3, 2025 14:53
@coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (2)
projects/container_bench/testing/platform_builders.py (1)

149-152: Remove redundant single quotes around shlex.quote().

shlex.quote() already adds quotes when necessary. Wrapping the result in additional single quotes creates double-quoting that is inconsistent with other Unix commands in this file (lines 156, 159, 162).

Apply this diff:

-        return f"cd '{shlex.quote(str(chdir))}'"
+        return f"cd {shlex.quote(str(chdir))}"
projects/container_bench/testing/utils.py (1)

99-108: Guard against None or empty directory path.

The function checks if the "dir" key exists but doesn't validate the value. If images_dir is None, Path(None) raises TypeError. If it's an empty string, Path("") resolves to the current directory, and recursive=True removal would delete the working directory.

Apply this diff to validate the directory value:

 def cleanup_container_images(base_work_dir):
     container_images_config = ConfigManager.get_container_images_config()
     if "dir" not in container_images_config:
         return
     images_dir = container_images_config['dir']
+    if not images_dir or not str(images_dir).strip():
+        logging.info("Skipping container images cleanup: no directory configured.")
+        return
+    
     dest = Path(images_dir)
 
     if remote_access.exists(dest):
         logging.info(f"Removing container images directory {dest} ...")
         remote_access.remove_remote_file(base_work_dir, dest, recursive=True)
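The hazard is easy to reproduce standalone:

```python
from pathlib import Path

# An empty configured dir silently becomes the current directory,
# so a recursive removal of it would delete the working directory.
empty_dir = Path("")
resolves_to_cwd = str(empty_dir) == "."

# A None value does not get that far: Path(None) raises TypeError.
try:
    Path(None)
    none_is_accepted = True
except TypeError:
    none_is_accepted = False
```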
🧹 Nitpick comments (3)
projects/container_bench/toolbox/shared_tasks/pull_image.yml (3)

1-8: Documentation looks clear, but add variable structure expectations.

The usage comments could explicitly document the structure of required variables, especially environment which is accessed as vars['environment'][0]['HOME'] on line 21. Clarify that environment should be a list of dicts with a 'HOME' key.

# Usage:
# - include_tasks: "../../shared_tasks/pull_image.yml"
#   vars:
#     image_name: "quay.io/example/image:tag"
#     binary_path: "/path/to/container/binary"
#     additional_args: ""  # optional
#     environment:         # expected as list: [{"HOME": "/path"}]
#       - "{{ ansible_env }}"
# Requires: binary_path, image_name, environment variables
# Optional: additional_args, rootfull

36-44: Image name sanitization is correct; consider documenting the mapping.

The sanitization logic (replacing non-alphanumeric/dash/underscore characters with underscores) is sound, but consider documenting that the sanitized name is used only for the tarfile path, not the image registry reference. This prevents confusion about why image_name and image_tarfile differ.

Add a comment above the regex_replace:

- name: Set image tarfile path (Unix/Linux/macOS)
  set_fact:
    # Note: Sanitize image_name for filesystem compatibility (tarfile name only)
    # The actual image_name remains unchanged for registry pulls
    image_tarfile: "{{ images_path }}/{{ image_name | regex_replace('[^a-zA-Z0-9_-]', '_') }}.tar"

68-85: Load/pull logic is well-structured; ensure error handling upstream.

The four variants (Unix load, Unix pull, Windows load, Windows pull) are clearly separated and logically sound. However, Ansible's shell tasks swallow command errors by default. If either load or pull fails, the task will mark as failed but won't capture stderr details. Consider adding error handling downstream (in callers) or add ignore_errors with explicit error checking if needed.

For better observability, optionally add register and changed_when to distinguish between success, no-op, and failure states:

- name: Load image from local tarfile (Unix/Linux/macOS)
  become: "{{ rootfull | default(omit) }}"
  shell: "{{ binary_path }} {{ additional_args | default('') }} load -i {{ image_tarfile }}"
  register: load_result
  when: ansible_os_family != 'Windows' and tarfile_exists
  # Adjust based on expected output (e.g., 'Loaded image ...' indicates success)
  changed_when: '"Loaded" in load_result.stdout'
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dcabc6a and b3d1b97.

📒 Files selected for processing (9)
  • projects/container_bench/testing/config.yaml (2 hunks)
  • projects/container_bench/testing/platform_builders.py (4 hunks)
  • projects/container_bench/testing/prepare.py (4 hunks)
  • projects/container_bench/testing/remote_access.py (4 hunks)
  • projects/container_bench/testing/test_container_bench.py (1 hunks)
  • projects/container_bench/testing/utils.py (1 hunks)
  • projects/container_bench/toolbox/shared_tasks/cleanup_container_resources.yml (1 hunks)
  • projects/container_bench/toolbox/shared_tasks/pull_image.yml (1 hunks)
  • projects/container_bench/visualizations/benchmark/plotting/utils/config.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • projects/container_bench/testing/test_container_bench.py
  • projects/container_bench/toolbox/shared_tasks/cleanup_container_resources.yml
  • projects/container_bench/testing/prepare.py
🧰 Additional context used
🧬 Code graph analysis (2)
projects/container_bench/testing/utils.py (2)
projects/container_bench/testing/remote_access.py (2)
  • remove_remote_file (178-190)
  • exists (146-160)
projects/container_bench/testing/config_manager.py (1)
  • get_container_images_config (201-209)
projects/container_bench/testing/remote_access.py (4)
projects/container_bench/testing/config_manager.py (2)
  • ConfigManager (80-253)
  • is_windows (90-91)
projects/container_bench/testing/platform_builders.py (27)
  • PlatformFactory (188-194)
  • create_command_builder (190-194)
  • build_env_command (8-9)
  • build_env_command (49-58)
  • build_env_command (139-144)
  • build_env_command (197-199)
  • build_chdir_command (16-17)
  • build_chdir_command (101-104)
  • build_chdir_command (149-152)
  • build_entrypoint_script (36-37)
  • build_entrypoint_script (121-132)
  • build_entrypoint_script (167-182)
  • get_shell_command (32-33)
  • get_shell_command (118-119)
  • get_shell_command (164-165)
  • build_exists_command (28-29)
  • build_exists_command (115-116)
  • build_exists_command (161-162)
  • check_exists_result (40-41)
  • check_exists_result (134-135)
  • check_exists_result (184-185)
  • build_mkdir_command (24-25)
  • build_mkdir_command (112-113)
  • build_mkdir_command (158-159)
  • build_rm_command (20-21)
  • build_rm_command (106-110)
  • build_rm_command (154-156)
projects/container_bench/testing/utils.py (1)
  • get_tmp_fd (16-28)
projects/remote/lib/remote_access.py (1)
  • mkdir (211-220)
🪛 Ruff (0.14.3)
projects/container_bench/testing/platform_builders.py

146-146: Unused method argument: service_name

(ARG002)


146-146: Unused method argument: binary_path

(ARG002)

🔇 Additional comments (18)
projects/container_bench/visualizations/benchmark/plotting/utils/config.py (2)

14-14: LGTM!

The addition of "test.podman.machine_provider" to the excluded keys is consistent with the existing pattern and aligns with the PR's objective to refine configuration reporting.


173-174: Resolved — execution_time_95th_percentile and jitter are implemented and used.

parsers.py computes execution_time_95th_percentile and execution_time_jitter, GetInfo returns both fields, and downstream visualization code (report, comparison_report, utils/shared) consistently uses the new names; no stale exec_time references found.

projects/container_bench/testing/platform_builders.py (7)

3-3: LGTM: shlex import for safe shell quoting.

The import is necessary for the Unix command builder's use of shlex.quote() to safely escape shell arguments.


15-41: LGTM: Cross-platform abstraction extended appropriately.

The new abstract methods provide a clean interface for platform-specific file operations, shell selection, and entrypoint scripting. All methods are implemented by both concrete builders.


44-46: LGTM: PowerShell single-quote escaping is correct.

PowerShell requires single quotes to be doubled ('') for escaping. The implementation is correct and used consistently throughout the Windows builder.
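A minimal standalone sketch of the escaping rule (mirroring the helper's name; the real implementation lives in platform_builders.py):

```python
def escape_powershell_single_quote(value: str) -> str:
    # Inside a PowerShell single-quoted string, a literal ' is written
    # as '' - no other characters are special in that context.
    return value.replace("'", "''")

payload = "it's"
escaped = escape_powershell_single_quote(payload)   # it''s
env_cmd = f"$env:MSG='{escaped}'"                   # $env:MSG='it''s'
```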


49-58: LGTM: Environment variables are properly escaped for PowerShell.

Both keys and values are escaped using escape_powershell_single_quote(), preventing command injection through environment variable values containing single quotes or special characters.


101-135: LGTM: Windows platform commands are properly implemented.

All methods correctly escape paths and values for PowerShell execution. The entrypoint script appropriately sets $ErrorActionPreference = "Stop" for fail-fast behavior.


139-144: LGTM: Unix environment variables are properly escaped.

Values are correctly escaped using shlex.quote() to prevent shell injection, and the export statement is properly formatted.


154-185: LGTM: Unix platform commands are well-implemented.

The file operations correctly use shlex.quote() for safe path escaping. The entrypoint script appropriately sets bash strict mode (set -o pipefail, errexit, nounset, errtrace) for robust error handling.

projects/container_bench/testing/remote_access.py (5)

6-6: LGTM: Imports support unified remote access pattern.

The shutil import enables local recursive removal, and PlatformFactory provides platform-agnostic command construction.

Also applies to: 11-11


86-143: LGTM: Unified remote execution using PlatformFactory.

The function now uses platform-agnostic command building through PlatformFactory, which improves maintainability and consistency. The local execution messaging correctly differentiates between Windows and Unix platforms.


146-160: LGTM: Platform-agnostic exists check.

The function correctly uses PlatformFactory to build platform-specific existence checks and interpret results appropriately for each platform.


163-175: LGTM: Directory creation handles both local and remote cases.

The function correctly uses path.mkdir(parents=True, exist_ok=True) for local execution and platform-agnostic command building for remote execution.


178-190: LGTM: File removal supports both local and remote execution.

The function correctly handles local execution with shutil.rmtree() for recursive directory removal and path.unlink(missing_ok=True) for files. Remote execution uses platform-agnostic command building. This addresses the past review concern about missing local execution support.
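A minimal sketch of that local-removal branch (names are illustrative, not the actual remote_access API):

```python
import shutil
import tempfile
from pathlib import Path

def remove_local(path: Path, recursive: bool = False) -> None:
    # Directories need shutil.rmtree; plain files use unlink with
    # missing_ok=True so removing an absent path is a no-op.
    if recursive and path.is_dir():
        shutil.rmtree(path)
    else:
        path.unlink(missing_ok=True)

base = Path(tempfile.mkdtemp())
(base / "sub").mkdir()
(base / "sub" / "f.txt").write_text("x")
remove_local(base / "sub", recursive=True)
remove_local(base / "sub" / "f.txt")  # already gone: no error
```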

projects/container_bench/testing/config.yaml (3)

103-108: LGTM: Container images configuration is well-structured.

The configuration enables pre-pulling images with a clear directory structure and references to publicly available container images.


159-159: LGTM: Container images cleanup defaults to disabled.

Defaulting to false allows images to be preserved between runs, which is appropriate for development and testing workflows.


211-212: LGTM: Runtime list simplified to standard runtimes.

Focusing on crun and runc simplifies the test matrix while covering the primary OCI runtimes.

projects/container_bench/testing/utils.py (1)

90-90: LGTM: Updated to use unified remote access API.

The calls correctly reference remote_access.remove_remote_file(), aligning with the new unified remote access pattern.

Also applies to: 96-96

Comment on lines +10 to +17
- name: Validate required variables for image pulling
  fail:
    msg: "Required variable '{{ item }}' is not defined"
  when: vars[item] is not defined
  loop:
    - binary_path
    - image_name
    - environment
⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Unresolved runtime risk: callers do not provide required variables.

The validation requires environment and binary_path (lines 10–17), but the task's callers (unix_default_benchmark_body.yml and windows_default_benchmark_body.yml) do not pass these variables when including this task. Additionally, line 21 indexes vars['environment'][0]['HOME'] without checking if the list has elements. This will fail at runtime.

Update the callers to pass the required variables, or refactor this task to provide sensible defaults:

- name: Validate required variables for image pulling
  assert:
    that:
      - binary_path is defined
      - image_name is defined
      - environment is defined
      - (environment | length) > 0
      - environment[0].HOME is defined
    fail_msg: |
      Required variables missing:
      - binary_path: path to container binary (e.g., podman, docker)
      - image_name: container image reference
      - environment: list with HOME key (e.g., [{"HOME": "/root"}])

Alternatively, provide defaults:

- name: Set defaults for optional variables
  set_fact:
    binary_path: "{{ binary_path | default('podman') }}"
    environment: "{{ environment | default([ansible_env]) }}"
    additional_args: "{{ additional_args | default('') }}"

Then remove the environment and binary_path from the validation loop if they have safe defaults.


I'll verify the claims in this review comment by examining the file and its callers.


---

🏁 Script executed:

```shell
#!/bin/bash
# Find and examine the pull_image.yml file
fd -t f "pull_image.yml" --exec cat {}
```

Length of output: 3105


🏁 Script executed:

```shell
#!/bin/bash
# Find the caller files
fd -t f "unix_default_benchmark_body.yml" -o -t f "windows_default_benchmark_body.yml" --exec cat {}
```

Length of output: 116


🏁 Script executed:

```shell
#!/bin/bash
# Search for any includes/imports of pull_image in the codebase
rg -n "pull_image" --type yaml
```

Length of output: 587


Now let me examine the caller files directly:


---

🏁 Script executed:

```shell
#!/bin/bash
# Find and examine unix_default_benchmark_body.yml
find . -name "unix_default_benchmark_body.yml" -type f -exec cat {} \;
```

Length of output: 1386


🏁 Script executed:

```shell
#!/bin/bash
# Find and examine windows_default_benchmark_body.yml
find . -name "windows_default_benchmark_body.yml" -type f -exec cat {} \;
```

Length of output: 1327


🏁 Script executed:

```shell
#!/bin/bash
# Search for environment variable definition in the codebase
rg -n "environment:" --type yaml -A 3 -B 1 | head -100
```

Length of output: 9275


Let me search for where these required variables are defined:


---

🏁 Script executed:

```shell
#!/bin/bash
# Search for binary_path definition
rg -n "binary_path" --type yaml -B 2 -A 2
```

Length of output: 23372


🏁 Script executed:

```shell
#!/bin/bash
# Search for where unix_default_benchmark_body or windows_default_benchmark_body are included and what vars they receive
rg -n "unix_default_benchmark_body\|windows_default_benchmark_body" --type yaml -B 3 -A 3
```

Length of output: 0


🏁 Script executed:

```shell
#!/bin/bash
# Check cleanup_container_resources.yml to see if it defines these variables
find . -name "cleanup_container_resources.yml" -type f -exec cat {} \;
```

Length of output: 4169


🏁 Script executed:

```shell
#!/bin/bash
# Check create_artifacts_dir.yml
find . -name "create_artifacts_dir.yml" -type f -exec cat {} \;
```

Length of output: 521


Let me search for where the environment variable might be defined:


---

🏁 Script executed:

```shell
#!/bin/bash
# Search for where environment variable is set as a fact in container_bench
rg -n "set_fact" --type yaml -A 5 | rg -A 5 "environment"
```

Length of output: 3552


🏁 Script executed:

```shell
#!/bin/bash
# Look for environment definition in defaults or vars files
find . -path "*/container_bench/*" -name "*.yml" -o -name "*.yaml" | xargs grep -l "environment:" | head -20
```

Length of output: 0


🏁 Script executed:

```shell
#!/bin/bash
# Search in map_variables.yml context - check if environment is supposed to be mapped
find . -name "map_variables.yml" -type f -exec cat {} \;
```

Length of output: 521


Fix missing environment variable and unsafe indexing in pull_image.yml

The validation requires `environment`, which is never defined or passed by the callers. Additionally, line 21 unsafely indexes `vars['environment'][0]['HOME']` without checking whether the list has elements. This task will fail at runtime when image pulling is needed.
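The unsafe-indexing concern can be illustrated outside Ansible: reading `HOME` from the first element of a possibly-empty list needs a guard before indexing. A minimal Python sketch (hypothetical helper name, mirroring the guard the review asks for):

```python
def resolve_home(environment):
    """Return HOME from the first environment entry, or None when the
    list is missing or empty -- the guarded version of environment[0]['HOME']."""
    if not environment:            # handles both None and []
        return None
    first = environment[0]
    return first.get("HOME")       # avoids KeyError when HOME is absent

# An unguarded environment[0]["HOME"] raises IndexError on [] and
# KeyError when HOME is missing; the guarded version returns None instead.
print(resolve_home([]))                    # None
print(resolve_home([{"HOME": "/root"}]))   # /root
```

The Ansible `assert` task suggested below performs the same three checks (defined, non-empty, `HOME` present) declaratively before any indexing happens.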

Either pass the required variables from the callers (unix_default_benchmark_body.yml and windows_default_benchmark_body.yml):

```yaml
- name: Pull required container image (with local tarfile optimization)
  include_tasks: "pull_image.yml"
  vars:
    image_name: "{{ image_to_pull }}"
    environment: "[{{ ansible_env }}]"
  when: image_to_pull is defined and image_to_pull | length > 0
```

Or remove environment from the validation loop and provide a safe default in pull_image.yml:

```yaml
- name: Set default environment if not provided
  set_fact:
    environment: "[{{ ansible_env }}]"
  when: environment is not defined

- name: Validate required variables for image pulling
  fail:
    msg: "Required variable '{{ item }}' is not defined"
  when: vars[item] is not defined
  loop:
    - binary_path
    - image_name

- name: Validate environment structure
  assert:
    that:
      - environment is defined
      - environment | length > 0
      - environment[0].HOME is defined
    fail_msg: "environment must be a list with at least one element containing HOME key"
```
🤖 Prompt for AI Agents
```
In projects/container_bench/toolbox/shared_tasks/pull_image.yml around lines 10
to 17, the validation currently requires an environment variable that callers
never pass and later indexes environment[0].HOME unsafely; remove environment
from the required-vars loop, add a task to set a safe default when environment
is not provided (e.g. set_fact environment to a single-element list containing
ansible_env), and add an assert that verifies environment is a list with length
> 0 and that environment[0].HOME is defined before any indexing is performed.
```

@kpouget kpouget merged commit eb2500d into openshift-psap:main Nov 3, 2025
6 of 8 checks passed