fix: fail closed on malformed STACK_GLOBAL operands#704

Open
mldangelo wants to merge 3 commits into main from feat/pickle-stack-global-fail-closed

Conversation

@mldangelo
Member

@mldangelo mldangelo commented Mar 13, 2026

Summary

  • fail closed on malformed STACK_GLOBAL operands when the symbolic stack resolves non-string values or missing memo references
  • keep simple stack-underflow/truncation cases on the existing informational path instead of promoting them into malformed findings
  • add regressions for integer operands, dangerous module hints, missing memo lookups, safe and dangerous memoized cases, and multi-stream interactions
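The fail-closed classification described above can be sketched as follows. This is a minimal illustration, not the exact implementation in modelaudit/scanners/pickle_scanner.py: the sentinel objects, function name, and the 256-character preview bound are assumptions for demonstration.

```python
from typing import Any

# Illustrative sentinels for symbolic-stack values that could not be resolved.
MISSING_MEMO = object()  # memo reference with no recorded value
UNKNOWN = object()       # stack slot the simulation could not track
PREVIEW_MAX = 256        # assumed bound on operand previews

def classify_stack_global_operand(value: Any) -> tuple[str, str]:
    """Return (kind, bounded preview) for a STACK_GLOBAL operand."""
    if isinstance(value, str):
        return "string", value
    if value is MISSING_MEMO:
        return "missing_memo", "unknown"
    if value is UNKNOWN:
        return "unknown", "unknown"
    # Non-string operands are malformed; keep previews bounded so a large
    # payload cannot bloat the finding detail.
    if isinstance(value, (bytes, bytearray, memoryview)):
        return "non_string", f"{type(value).__name__}:<{len(value)} bytes>"
    preview = repr(value)
    if len(preview) > PREVIEW_MAX:
        preview = preview[:PREVIEW_MAX] + "...<truncated>"
    return "non_string", f"{type(value).__name__}:{preview}"
```

Any `non_string` or `missing_memo` result then promotes the STACK_GLOBAL to a malformed finding instead of silently passing it through.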

Validation

  • uv run pytest tests/scanners/test_pickle_scanner.py -q -k "stack_global or malformed_stack_global or truncated_stack_global"
  • uv run ruff format modelaudit/ tests/
  • uv run ruff check --fix modelaudit/ tests/
  • uv run ruff check modelaudit/ tests/
  • uv run ruff format --check modelaudit/ tests/
  • uv run mypy modelaudit/
  • uv run pytest -n auto -m "not slow and not integration" --maxfail=1

Summary by CodeRabbit

  • New Features

    • STACK_GLOBAL findings now include reasons (insufficient context, missing memo, non-string operand) and concise operand previews for clearer context.
  • Bug Fixes

    • Better handling of malformed STACK_GLOBAL operands to avoid unsafe parsing and correctly separate informational vs. critical findings.
    • Malformed data in one stream no longer taints subsequent streams; REDUCE and other dangerous-target detections remain effective.
  • Tests

    • Expanded coverage for STACK_GLOBAL variants, multi-stream parsing, and risky-reference detection.

@coderabbitai
Contributor

coderabbitai bot commented Mar 13, 2026

Walkthrough

Adds detection and reporting for malformed STACK_GLOBAL operands: a new classifier produces bounded previews for non-string/missing-memo/insufficient-context operands, _build_symbolic_reference_maps now returns a third malformed_stack_globals map, and callers/tests are updated to propagate and assert this data.

Changes

Cohort / File(s) Summary
Changelog
CHANGELOG.md
Added security note documenting fail-closed behavior for malformed STACK_GLOBAL operands (missing memo or non-string); truncation-only cases remain informational.
Core Scanner
modelaudit/scanners/pickle_scanner.py
Added _classify_stack_global_operand and bounded operand preview formatting. _build_symbolic_reference_maps return type extended to include malformed_stack_globals (opcode index → detail dict). Updated STACK_GLOBAL, memo handling, and downstream reporting to consume malformed info.
Tests
tests/scanners/test_pickle_scanner.py
Added extensive tests covering malformed STACK_GLOBAL cases, multi-stream parsing/resync, missing vs memoized operands, REDUCE interactions, truncation/underflow behavior, and regression coverage for various dangerous patterns and modules.

Sequence Diagram

sequenceDiagram
    participant Parser as Pickle Parser
    participant SG as STACK_GLOBAL Processor
    participant Classifier as Operand Classifier
    participant Tracker as Malformed Tracker
    participant Analysis as Threat Analysis

    Parser->>SG: Encounter STACK_GLOBAL opcode (module, name)
    SG->>Classifier: Provide operands and memo context
    Classifier->>Classifier: Determine kind (string, missing_memo, non-string, insufficient_context) and bounded preview
    Classifier->>Tracker: Record malformed_stack_globals entry (index → {kind, preview, reason})
    SG->>Tracker: Add valid stack_global_refs / callable_refs
    Tracker->>Analysis: Supply valid refs + malformed_stack_globals
    Analysis->>Analysis: Adjust findings and severities based on classifications
    Analysis->>Parser: Emit security findings

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 In jars of pickle bytes I peep,
I spot the globals malformed or asleep.
Missing memos, odd types I show—
I hop, I flag, I let you know.
A rabbit guard in code's soft glow.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Description Check (✅ Passed): Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check (✅ Passed): The title accurately reflects the main objective of the changeset: implementing fail-closed behavior for malformed STACK_GLOBAL operands, which is the core security fix across all modified files.
  • Docstring Coverage (✅ Passed): Docstring coverage is 90.91%, which is sufficient. The required threshold is 80.00%.



Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
CHANGELOG.md (1)

82-148: ⚠️ Potential issue | 🟡 Minor

Consolidate duplicate ### Fixed sections.

The [Unreleased] section contains two ### Fixed subsections (lines 82 and 123). Per Keep a Changelog format, each section type should appear only once per release version.

Please consolidate these into a single ### Fixed section. As per coding guidelines, CHANGELOG entries should follow Keep a Changelog format.

📝 Suggested consolidation approach

Merge all entries from both ### Fixed sections into one section. The consolidated section should appear after ### Added and before ### Security to follow conventional ordering (Added, Changed, Deprecated, Removed, Fixed, Security).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@CHANGELOG.md` around lines 82 - 148, The changelog has two duplicate "###
Fixed" subsections under the [Unreleased] release; consolidate them by merging
all bullet entries from both "### Fixed" blocks into a single "### Fixed"
section, remove the duplicate header, and place that consolidated "### Fixed"
section after the "### Added" section and before the "### Security" section so
the release follows the Keep a Changelog ordering; ensure no bullets are lost or
duplicated and preserve their original wording and bullet order.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@modelaudit/scanners/pickle_scanner.py`:
- Around line 1948-1955: _classify_stack_global_operand currently builds
"non_string" previews using the full repr(value), which can blow up memory for
large operands; change it to produce a bounded preview: compute a short preview
string (e.g., limit repr(value) to a safe max length and append "..." when
truncated) and for binary types (bytes/bytearray/memoryview) produce a concise
size/hex-prefix like "bytes(len=NN, hex=0xDEAD...)" instead of full content;
apply the same truncation strategy to the other analogous classifier in the file
to avoid large allocations across the scanner.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: f93a213a-55a5-4ee0-8962-12bfce964b24

📥 Commits

Reviewing files that changed from the base of the PR and between 698859d and 19feb72.

📒 Files selected for processing (3)
  • CHANGELOG.md
  • modelaudit/scanners/pickle_scanner.py
  • tests/scanners/test_pickle_scanner.py

Contributor

@coderabbitai coderabbitai bot left a comment

♻️ Duplicate comments (1)
modelaudit/scanners/pickle_scanner.py (1)

1949-1956: ⚠️ Potential issue | 🟠 Major

Bound operand previews in malformed STACK_GLOBAL classification

Line 1956 still builds non_string payloads from full repr(value). Large operands can bloat memory and finding payload size.

💡 Proposed fix
 def _classify_stack_global_operand(value: Any) -> tuple[str, str]:
     if isinstance(value, str):
         return "string", value
     if value is missing_memo:
         return "missing_memo", "unknown"
     if value is unknown:
         return "unknown", "unknown"
-    return "non_string", f"{type(value).__name__}:{value!r}"
+    if isinstance(value, (bytes, bytearray, memoryview)):
+        return "non_string", f"{type(value).__name__}:<{len(value)} bytes>"
+    preview = repr(value)
+    if len(preview) > 256:
+        preview = preview[:256] + "...<truncated>"
+    return "non_string", f"{type(value).__name__}:{preview}"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modelaudit/scanners/pickle_scanner.py` around lines 1949 - 1956,
_classify_stack_global_operand currently returns a full repr for non-string
values which can create huge payloads; change the non_string branch to truncate
the preview (introduce a constant like STACK_GLOBAL_PREVIEW_MAX) and return a
bounded preview plus the original value size/length instead of the full repr
(e.g. compute preview = repr(value)[:STACK_GLOBAL_PREVIEW_MAX], compute size =
len(repr(value)) or an appropriate length metric, then return "non_string",
f"{type(value).__name__}:{preview} (len={size})"). Ensure the new logic is used
in _classify_stack_global_operand so large operands no longer blow up payloads.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: a42eb735-7984-4f8f-bca9-ff143a67bd89

📥 Commits

Reviewing files that changed from the base of the PR and between 19feb72 and 90e4cc8.

📒 Files selected for processing (3)
  • CHANGELOG.md
  • modelaudit/scanners/pickle_scanner.py
  • tests/scanners/test_pickle_scanner.py

@mldangelo mldangelo force-pushed the feat/pickle-stack-global-fail-closed branch from 90e4cc8 to a92e66f on March 14, 2026 at 07:01
@mldangelo
Member Author

Rebased onto current main, addressed the malformed operand preview issue, and reran validation locally.

What changed in this update:

  • bound malformed STACK_GLOBAL operand previews so non-string diagnostics no longer serialize full repr(value)
  • emit concise binary previews like bytes(len=..., hex=0x...) for bytes/bytearray/memoryview operands
  • add a regression with a 200 KB BINBYTES operand to prove the emitted context-check detail stays compact

Why this mattered:

  • before this fix, a malformed payload with a large binary operand produced a STACK_GLOBAL Context Check whose detail string was roughly 200 KB long
  • after the fix, the same payload emits a bounded preview (bytes(len=200000, hex=0x7878787878787878...)), with a 151-character message and a 44-character operand detail

Validation passed:

  • uv run pytest tests/scanners/test_pickle_scanner.py -q -k "stack_global and (non_string or missing_memo or large_bytes_operand_preview or truncated_stack_global_context or malformed_stack_global)"
  • uv run ruff format modelaudit/ tests/
  • uv run ruff check --fix modelaudit/ tests/
  • uv run ruff check modelaudit/ tests/
  • uv run ruff format --check modelaudit/ tests/
  • uv run mypy modelaudit/
  • uv run pytest -n auto -m "not slow and not integration" --maxfail=1 -> 2182 passed, 57 skipped

Contributor

@coderabbitai coderabbitai bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
modelaudit/scanners/pickle_scanner.py (1)

1997-2001: 🧹 Nitpick | 🔵 Trivial

Consolidate duplicated STOP handling blocks in symbolic simulation.

After Line 1997’s STOP branch continue, the later STOP branches are unreachable. Keeping three reset paths in one loop increases drift risk if only one branch is edited later.

♻️ Suggested cleanup
@@
-        if name == "STOP":
-            # Reset stack and memo at pickle stream boundaries so that
-            # references from a previous stream cannot leak into the next
-            # one (multi-stream / appended-pickle scenarios).
-            stack.clear()
-            memo.clear()
-            next_memo_index = 0
-            continue
@@
-        if name == "STOP":
-            # Reset memo and stack at pickle stream boundaries so that
-            # references from one stream don't leak into the next in
-            # multi-pickle files (e.g. PyTorch .pt containers).
-            memo.clear()
-            stack.clear()
-            next_memo_index = 0

Also applies to: 2147-2154, 2187-2194
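The consolidation suggested above amounts to one canonical reset path at the top of the simulation loop. A minimal sketch, with illustrative opcode tuples and state names rather than the scanner's real loop:

```python
def simulate(opcodes):
    """Toy symbolic simulation with a single shared STOP handler."""
    stack, memo = [], {}
    next_memo_index = 0
    streams_seen = 0
    for name, arg in opcodes:
        if name == "STOP":
            # Single canonical reset at pickle stream boundaries so that
            # stack/memo state from one stream cannot leak into the next
            # (multi-stream / appended-pickle scenarios).
            stack.clear()
            memo.clear()
            next_memo_index = 0
            streams_seen += 1
            continue
        if name == "MEMOIZE":
            memo[next_memo_index] = stack[-1] if stack else None
            next_memo_index += 1
        elif name == "SHORT_BINUNICODE":
            stack.append(arg)
    return streams_seen, stack, memo
```

With one shared branch, a future change to the reset logic cannot drift across three copies.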

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modelaudit/scanners/pickle_scanner.py` around lines 1997 - 2001, Multiple
identical 'if name == "STOP":' branches reset stack, memo, and next_memo_index
and then continue, causing duplicate/unreachable code; consolidate these into a
single canonical STOP handler inside the main symbolic-simulation loop. Locate
all occurrences of the STOP check (the conditional testing name == "STOP" that
clears stack, memo and resets next_memo_index) and replace duplicate blocks with
one shared handler (keep the clear() calls on stack and memo and set
next_memo_index = 0, then continue), removing the later redundant branches so
edits won't drift out of sync; ensure any local control flow dependent on the
original positions still behaves correctly after centralizing.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: dd505563-fa0b-4873-8713-e47b6eb9d739

📥 Commits

Reviewing files that changed from the base of the PR and between 90e4cc8 and a92e66f.

📒 Files selected for processing (3)
  • CHANGELOG.md
  • modelaudit/scanners/pickle_scanner.py
  • tests/scanners/test_pickle_scanner.py
