fix: recurse into NumPy object pickle payloads by mldangelo · Pull Request #699 · promptfoo/modelaudit

mldangelo · 2026-03-13T22:32:33Z

## Summary
- reuse the pickle scanner on object-dtype .npy payloads by scanning the remaining file stream instead of copying the payload into memory
- preserve CVE-2019-6446 warnings while surfacing nested pickle findings for malicious object arrays and object members inside .npz archives
- add NumPy and ZIP regression coverage for malicious, benign, truncated, and corrupted samples

## Validation
- uv run pytest tests/scanners/test_numpy_scanner.py -q
- uv run pytest tests/scanners/test_zip_scanner.py -q -k "npz or pickle"
- uv run ruff format modelaudit/ tests/
- uv run ruff check --fix modelaudit/ tests/
- uv run ruff check modelaudit/ tests/
- uv run ruff format --check modelaudit/ tests/
- uv run mypy modelaudit/
- uv run pytest -n auto -m "not slow and not integration" --maxfail=1

Summary by CodeRabbit

Bug Fixes
- Scanner now recursively detects pickle payloads inside NumPy object-dtype arrays and within NPZ/ZIP members, improving CVE-2019-6446 detection while preserving archive-member location details.
- Consolidation now keeps findings distinct per archive member so issues report correct member locations.
Documentation
- Changelog updated with the new security entry.
Tests
- Added extensive tests covering object-dtype/NPZ scenarios, CVE details, corrupted files, and member-specific reporting.

coderabbitai · 2026-03-13T22:33:02Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 99e851ef-d196-4da7-bce5-2e806539147b

📥 Commits

Reviewing files that changed from the base of the PR and between f42f854 and 432c71e.

📒 Files selected for processing (5)

CHANGELOG.md
modelaudit/scanners/zip_scanner.py
tests/scanners/test_numpy_scanner.py
tests/scanners/test_zip_scanner.py
tests/test_core_asset_extraction.py

Walkthrough

Adds recursive scanning of embedded pickle payloads in object-dtype .npy and .npz members, rewrites nested-archive result contexts to preserve member locations, and groups checks by archive member when present; includes tests and a changelog entry describing preservation of CVE-2019-6446 warnings.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Changelog `CHANGELOG.md` | Documents new security entry noting recursion into object-dtype `.npy` payloads and `.npz` object members while preserving CVE-2019-6446 warnings and archive-member context. |
| NumPy scanner `modelaudit/scanners/numpy_scanner.py` | Adds `_scan_embedded_pickle_payload()` and integrates PickleScanner invocation for object-dtype payloads; seeks to data offsets and merges embedded ScanResult issues/checks into the NumPy ScanResult. |
| ZIP scanner internals `modelaudit/scanners/zip_scanner.py` | Adds `_rewrite_nested_result_context()` to normalize locations/details for nested scans and apply `zip_entry` and `archive_location` annotations; replaces inline rewriting at nested scan call sites. |
| Core grouping logic `modelaudit/core.py` | Updates `_group_checks_by_asset_` to include `zip_entry` (when present) in the asset grouping key so checks from different archive members are kept distinct. |
| NumPy scanner tests `tests/scanners/test_numpy_scanner.py` | Adds extensive tests for object-dtype pickle payload recursion, CVE-2019-6446 detection, dtype safety, truncated/corrupt handling, NPZ member contexts, and benign/non-pickle cases. |
| ZIP scanner tests `tests/scanners/test_zip_scanner.py` | Adds tests asserting NPZ member recursion into PickleScanner and that findings preserve nested archive member locations; updates `__reduce__` return type annotations in test payloads. |
| Core asset consolidation tests `tests/test_core_asset_extraction.py` | Adds tests ensuring consolidation keeps distinct NPZ member findings and nested-member findings distinct with correct `<archive>:member` location and `zip_entry` detail. |
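The grouping change described for `modelaudit/core.py` can be illustrated with a small sketch. The dictionary shape, the `path` field, and the function names below are assumptions made for illustration only; the actual check objects and grouping function in the repository may look different.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Any

Check = dict[str, Any]
AssetKey = tuple[str, "str | None"]


def asset_key(check: Check) -> AssetKey:
    """Build a grouping key that includes the archive member when present."""
    details = check.get("details") or {}
    return (check["path"], details.get("zip_entry"))


def group_checks(checks: list[Check]) -> dict[AssetKey, list[Check]]:
    """Group checks so that different archive members never share a bucket."""
    grouped: dict[AssetKey, list[Check]] = defaultdict(list)
    for check in checks:
        grouped[asset_key(check)].append(check)
    return dict(grouped)


checks = [
    {"name": "pickle exec", "path": "model.npz", "details": {"zip_entry": "payload.npy"}},
    {"name": "pickle exec", "path": "model.npz", "details": {"zip_entry": "other.npy"}},
    {"name": "zip structure", "path": "model.npz", "details": {}},
]
groups = group_checks(checks)
for key, members in groups.items():
    print(key, [c["name"] for c in members])
```

With a key of only the path, the two identically named `pickle exec` checks would land in one bucket and could be consolidated into a single finding; adding `zip_entry` keeps them apart.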

Sequence Diagram

sequenceDiagram
    participant Client
    participant ZipScanner
    participant NumPyScanner
    participant PickleScanner
    participant FileIO as File I/O
    participant Core
    Client->>ZipScanner: scan(archive)
    ZipScanner->>ZipScanner: extract member (e.g., payload.npy)
    ZipScanner->>NumPyScanner: scan(extracted_file, context=archive/member)
    NumPyScanner->>FileIO: read headers / dtype info
    alt object-dtype found
        NumPyScanner->>FileIO: seek(data_offset)
        NumPyScanner->>PickleScanner: scan(payload_stream, context_path)
        PickleScanner->>NumPyScanner: return ScanResult (issues/checks)
        NumPyScanner->>NumPyScanner: merge embedded results
    end
    NumPyScanner->>ZipScanner: return member ScanResult
    ZipScanner->>ZipScanner: _rewrite_nested_result_context(nested_result, tmp, archive, entry)
    ZipScanner->>Core: submit(scan_result)
    Core->>Client: consolidated ScanResult (grouped by archive/member)
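The context-rewriting step in the diagram can be sketched as below. This is a hypothetical illustration, not the body of `_rewrite_nested_result_context()`: the `Finding` dataclass, its fields, and the exact meaning given to `archive_location` are assumptions. It shows only the general idea of replacing a temporary extraction path with an `<archive>:member` location and recording the member name in the details.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class Finding:
    """Hypothetical stand-in for a scanner issue or check."""

    message: str
    location: str
    details: dict[str, Any] = field(default_factory=dict)


def rewrite_nested_context(
    findings: list[Finding], tmp_path: str, archive_path: str, entry_name: str
) -> None:
    """Point findings from an extracted temp file back at the archive member."""
    archive_location = f"{archive_path}:{entry_name}"
    for finding in findings:
        if finding.location.startswith(tmp_path):
            # Keep any suffix such as a byte position after the temp path.
            suffix = finding.location[len(tmp_path):]
            finding.location = archive_location + suffix
        else:
            finding.location = archive_location
        finding.details["zip_entry"] = entry_name
        # Assumption: archive_location mirrors the rewritten member location.
        finding.details["archive_location"] = archive_location


nested = [Finding("exec call in pickle", "/tmp/extract123/payload.npy (pos 40)")]
rewrite_nested_context(nested, "/tmp/extract123/payload.npy", "model.npz", "payload.npy")
print(nested[0].location, nested[0].details)
```

Centralizing the rewrite in one helper means every nested scan call site produces the same location format, which is what lets later grouping rely on `zip_entry`.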

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 I peer into arrays where odd bits hide,
I follow object burrows byte by byte,
Through .npy and .npz I hop and trace,
Marking parcels with careful grace,
A little rabbit keeping your code polite.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 29.41% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The pull request title 'fix: recurse into NumPy object pickle payloads' directly and accurately describes the main change: adding pickle payload scanning recursion into NumPy object-dtype arrays and NPZ members. |


coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/scanners/test_numpy_scanner.py`:
- Around line 107-120: Add Python type hints to the new helpers: annotate
_ExecPayload.__reduce__ and _SSLPayload.__reduce__ to return a
Tuple[Callable[..., Any], Tuple[Any, ...]] and annotate their self parameter as
usual; import the needed typing names (e.g., Callable, Tuple, Any). Also
annotate _failed_checks to accept result: Any (or the specific result type if
available) and return List[Any] (or List[CheckType] if you have a Check type);
import List/Any as needed. Ensure all new function/method signatures use these
type annotations to satisfy the repository typing rule.
- Around line 123-174: Add a regression test that ensures embedding a single
comment token does not bypass detection: create a new test (e.g.,
test_object_with_comment_token_still_detected) following the pattern of
test_object_dtype_numpy_recurses_into_pickle_exec and
test_object_npz_member_recurses_into_pickle_exec_with_member_context, but
construct the malicious payload (using _ExecPayload or _SSLPayload) with a
single comment token embedded inside its payload representation; save it to a
.npy and to a .npz member as appropriate, run NumPyScanner and ZipScanner
(NumPyScanner.scan and ZipScanner().scan), then assert the same detections as
the other tests (presence of "CVE-2019-6446" and "exec" or
"ssl.get_server_certificate" in check messages, and for zip case check the
zip_entry/member context like result.issues[*].details.get("zip_entry") ==
"payload.npy"). Ensure the new test mirrors existing assertion styles and uses
_failed_checks/result.issues checks so the single-comment-token case is covered.
- Around line 149-159: Update test_numeric_npz_has_no_pickle_recursion_findings
to also assert that result.issues contains no pickle/exec findings: mirror the
existing negative assertions on result.checks by adding the same two checks
against result.issues (use the same string checks for "CVE-2019-6446" in
(issue.name + issue.message) and "exec" in issue.message.lower()). This ensures
the test fails if the scanner reports findings as issues rather than checks.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: b3dbca4b-8cc3-4ee5-8d0f-800d35dbd8c2

📥 Commits

Reviewing files that changed from the base of the PR and between b7cb604 and 407652a.

📒 Files selected for processing (1)

tests/scanners/test_numpy_scanner.py

coderabbitai · 2026-03-13T22:53:54Z

tests/scanners/test_numpy_scanner.py

+class _ExecPayload:
+    def __reduce__(self):
+        return (exec, ("print('owned')",))
+
+
+class _SSLPayload:
+    def __reduce__(self):
+        import ssl
+
+        return (ssl.get_server_certificate, (("example.com", 443),))
+
+
+def _failed_checks(result):
+    return [c for c in result.checks if c.status.value == "failed"]


⚠️ Potential issue | 🟠 Major

Add type hints to newly introduced helpers.

The new helper methods/functions are untyped, which violates the repository Python typing rule.

Proposed fix

+from typing import Any, Callable
+
 class _ExecPayload:
-    def __reduce__(self):
+    def __reduce__(self) -> tuple[Callable[..., object], tuple[str]]:
         return (exec, ("print('owned')",))
@@
 class _SSLPayload:
-    def __reduce__(self):
+    def __reduce__(self) -> tuple[Callable[..., object], tuple[tuple[str, int]]]:
         import ssl

         return (ssl.get_server_certificate, (("example.com", 443),))
@@
-def _failed_checks(result):
+def _failed_checks(result: Any) -> list[Any]:
     return [c for c in result.checks if c.status.value == "failed"]

As per coding guidelines: "Always include type hints in Python code".

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/scanners/test_numpy_scanner.py` around lines 107 - 120, Add Python type hints to the new helpers: annotate _ExecPayload.__reduce__ and _SSLPayload.__reduce__ to return a Tuple[Callable[..., Any], Tuple[Any, ...]] and annotate their self parameter as usual; import the needed typing names (e.g., Callable, Tuple, Any). Also annotate _failed_checks to accept result: Any (or the specific result type if available) and return List[Any] (or List[CheckType] if you have a Check type); import List/Any as needed. Ensure all new function/method signatures use these type annotations to satisfy the repository typing rule.


coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (3)

tests/scanners/test_numpy_scanner.py (3)
157-159: ⚠️ Potential issue | 🟡 Minor

Strengthen the negative NPZ test to assert against result.issues too.

Right now the regression can pass even if findings move from checks to issues. Add parallel negative assertions on result.issues for CVE/exec indicators.
Proposed fix
     assert not any("CVE-2019-6446" in (c.name + c.message) for c in result.checks)
     assert not any("exec" in c.message.lower() for c in result.checks)
+    assert not any(i.details.get("cve_id") == "CVE-2019-6446" for i in result.issues)
+    assert not any("exec" in i.message.lower() for i in result.issues)
Based on learnings: "Preserve or strengthen security detections; test both benign and malicious samples when adding scanner/feature changes".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/scanners/test_numpy_scanner.py` around lines 157 - 159, The test
currently only asserts negatives against result.checks; extend it to also assert
the same negatives against result.issues to prevent regressions that move
findings from checks to issues. Update the assertions after the scanner run so
you have parallel negative checks: assert not any("CVE-2019-6446" in (i.name +
i.message) for i in result.issues) and assert not any("exec" in
i.message.lower() for i in result.issues), using the existing result variable to
mirror the checks assertions (result.checks) with result.issues.
123-174: ⚠️ Potential issue | 🟠 Major

Add an object-dtype comment-token bypass regression case.

The new recursion tests cover malicious payloads, but there is still no explicit single-comment-token bypass case for this object-dtype path.

As per coding guidelines: "Test bypass prevention by verifying that embedding a single comment token in a malicious payload does NOT suppress detection".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/scanners/test_numpy_scanner.py` around lines 123 - 174, Add a
regression test that ensures a single comment-token inside an object-dtype
pickle payload does not bypass detection: create a new test (e.g.,
test_object_dtype_numpy_comment_token_bypass) that mirrors
test_object_dtype_numpy_recurses_into_pickle_exec but uses an _ExecPayload
variant containing a single comment token embedded in the payload, save with
np.save(..., allow_pickle=True), run NumPyScanner().scan on the file, and assert
via _failed_checks(result) that the CVE-2019-6446 finding and the "exec"
indication are still present (same style of assertions as in
test_object_dtype_numpy_recurses_into_pickle_exec and
test_object_npz_member_recurses_into_pickle_exec_with_member_context); reference
_ExecPayload, NumPyScanner, _failed_checks and ensure the test explicitly checks
that the comment token does not suppress detection.
108-116: 🛠️ Refactor suggestion | 🟠 Major

Type hints are still missing on new payload helpers.

Line 108 and Line 113 add untyped __reduce__ methods; these should be annotated.

As per coding guidelines: "Always include type hints in Python code".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/scanners/test_numpy_scanner.py` around lines 108 - 116, Add explicit
return type annotations to the new __reduce__ methods: import typing names
(e.g., from typing import Any, Callable, Tuple) and annotate both helpers'
signatures to return Tuple[Callable[..., Any], Tuple[Any, ...]] (e.g., def
__reduce__(self) -> Tuple[Callable[..., Any], Tuple[Any, ...]]:). Keep existing
bodies (including the inline ssl import in _SSLPayload.__reduce__) and ensure
the typing imports are added at top of the test module if not already present.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/scanners/test_zip_scanner.py`:
- Around line 320-322: The test class _ExecPayload defines __reduce__ without a
type annotation; add a return type hint to __reduce__ (and annotate self if
desired) to satisfy typing rules—e.g., import typing and annotate __reduce__ to
return a Tuple[typing.Callable[..., typing.Any], tuple] or typing.Any so the
signature becomes def __reduce__(self) -> Tuple[Callable[..., Any], tuple]: (or
an appropriate typing.Any alternative) while keeping the existing return value
unchanged.

In `@tests/test_core_asset_extraction.py`:
- Around line 79-81: The test helper ExecPayload.__reduce__ is missing a return
type annotation; update the method signature for ExecPayload.__reduce__ to
include a precise typing return such as -> Tuple[Callable[..., Any], Tuple[Any,
...]] and add the necessary typing imports (Tuple, Callable, Any) at the top of
the file so the repository typing rule is satisfied while preserving the
existing returned value (exec, ("print('owned')",)).

---

Duplicate comments:
In `@tests/scanners/test_numpy_scanner.py`:
- Around line 157-159: The test currently only asserts negatives against
result.checks; extend it to also assert the same negatives against result.issues
to prevent regressions that move findings from checks to issues. Update the
assertions after the scanner run so you have parallel negative checks: assert
not any("CVE-2019-6446" in (i.name + i.message) for i in result.issues) and
assert not any("exec" in i.message.lower() for i in result.issues), using the
existing result variable to mirror the checks assertions (result.checks) with
result.issues.
- Around line 123-174: Add a regression test that ensures a single comment-token
inside an object-dtype pickle payload does not bypass detection: create a new
test (e.g., test_object_dtype_numpy_comment_token_bypass) that mirrors
test_object_dtype_numpy_recurses_into_pickle_exec but uses an _ExecPayload
variant containing a single comment token embedded in the payload, save with
np.save(..., allow_pickle=True), run NumPyScanner().scan on the file, and assert
via _failed_checks(result) that the CVE-2019-6446 finding and the "exec"
indication are still present (same style of assertions as in
test_object_dtype_numpy_recurses_into_pickle_exec and
test_object_npz_member_recurses_into_pickle_exec_with_member_context); reference
_ExecPayload, NumPyScanner, _failed_checks and ensure the test explicitly checks
that the comment token does not suppress detection.
- Around line 108-116: Add explicit return type annotations to the new
__reduce__ methods: import typing names (e.g., from typing import Any, Callable,
Tuple) and annotate both helpers' signatures to return Tuple[Callable[..., Any],
Tuple[Any, ...]] (e.g., def __reduce__(self) -> Tuple[Callable[..., Any],
Tuple[Any, ...]]:). Keep existing bodies (including the inline ssl import in
_SSLPayload.__reduce__) and ensure the typing imports are added at top of the
test module if not already present.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 3f2cd2db-d2f7-44ca-998d-ea2bdebde626

📥 Commits

Reviewing files that changed from the base of the PR and between 407652a and f42f854.

📒 Files selected for processing (5)

modelaudit/core.py
modelaudit/scanners/zip_scanner.py
tests/scanners/test_numpy_scanner.py
tests/scanners/test_zip_scanner.py
tests/test_core_asset_extraction.py


fix: recurse into numpy object pickle payloads

b7cb604

mldangelo added 2 commits March 13, 2026 18:49

test: type annotate numpy recursion regressions

407652a

fix(numpy): preserve npz member check context

56782ee

coderabbitai bot reviewed Mar 13, 2026


mldangelo added 2 commits March 13, 2026 18:54

Merge remote-tracking branch 'refs/remotes/origin/feat/numpy-object-pickle-recursion' into review-pr-699-head

fa0abda

test: format asset extraction regressions

f42f854

coderabbitai bot reviewed Mar 13, 2026


mldangelo added 2 commits March 14, 2026 03:07

Merge origin/main into feat/numpy-object-pickle-recursion

d45df12

Merge remote-tracking branch 'origin/feat/numpy-object-pickle-recursion' into review-pr-699-head

432c71e

mldangelo wants to merge 7 commits into main from feat/numpy-object-pickle-recursion

1 participant
