Skip to content

fix agent-audit scan crash#4

Open
007gzs wants to merge 7 commits into
HeadyZhang:masterfrom
007gzs:patch-1
Open

fix agent-audit scan crash#4
007gzs wants to merge 7 commits into
HeadyZhang:masterfrom
007gzs:patch-1

Conversation

@007gzs

@007gzs 007gzs commented Apr 1, 2026

Copy link
Copy Markdown

agent-audit scan will crash with SKILL.md:

---
name:
description:
metadata:
---

Summary by CodeRabbit

  • Bug Fixes

    • Improved metadata handling during audits: the system now validates metadata types and preserves original metadata when structures are unexpected, preventing erroneous or malformed extraction of OpenClaw-specific details.
  • Tests

    • Added edge-case tests to ensure non-dictionary or malformed frontmatter (empty, string-encoded, or list formats) do not produce false security findings.

agent-audit scan will crash with SKILL.md:
```
---
name:
description:
metadata:
---
```
@coderabbitai

coderabbitai Bot commented Apr 1, 2026

Copy link
Copy Markdown

Warning

Rate limit exceeded

@007gzs has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 41 minutes and 57 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 41 minutes and 57 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 327f9274-e06f-4faa-9b45-a02f4787a310

📥 Commits

Reviewing files that changed from the base of the PR and between 058e033 and 3bb97e1.

📒 Files selected for processing (1)
  • tests/test_scanners/test_skill_meta_scanner.py
📝 Walkthrough

Walkthrough

The _get_openclaw_meta function now validates frontmatter types and traverses keys ("metadata","openclaw") stepwise: non-dict frontmatter returns {}, missing or non-dict intermediate values return the original frontmatter, and when both intermediate and final values are dicts it returns the openclaw dict.

Changes

Cohort / File(s) Summary
Skill meta scanner
packages/audit/agent_audit/scanners/skill_meta_scanner.py
Replaced nested .get(...).get(...) with stepwise traversal over keys ("metadata","openclaw"), validating each step is a dict. Returns {} for non-dict frontmatter; returns original frontmatter if traversal fails to yield a dict.
Scanner tests
tests/test_scanners/test_skill_meta_scanner.py
Added three edge-case tests for scanning SKILL.md with non-dict metadata values (empty value, JSON-encoded string, YAML list) asserting no scan findings are produced.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇 I hopped through keys with cautious delight,
Checking each box, avoiding a fright.
When maps are gone or not what they seem,
I return what I found and tidy the scheme. 🥕

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix agent-audit scan crash' directly addresses the main objective—fixing a crash when processing SKILL.md files with empty YAML frontmatter values. It is concise, clear, and accurately reflects the primary change.
Docstring Coverage ✅ Passed Docstring coverage is 83.33% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/audit/agent_audit/scanners/skill_meta_scanner.py`:
- Around line 170-177: The current traversal over frontmatter (using nested =
frontmatter and iterating keys ("metadata","openclaw")) correctly descends into
nested dicts but then returns the original frontmatter, causing later calls
(e.g., meta.get("persistence")) to miss nested values; change the final return
to return the resolved nested dict when traversal succeeds (i.e., return nested)
while keeping the existing type checks and early returns intact so non-dict or
missing paths still return frontmatter or {} as intended.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1d1e5a1a-fc72-4880-ab94-2fac565a69a8

📥 Commits

Reviewing files that changed from the base of the PR and between a9e176f and bcd6456.

📒 Files selected for processing (1)
  • packages/audit/agent_audit/scanners/skill_meta_scanner.py

Comment thread packages/audit/agent_audit/scanners/skill_meta_scanner.py Outdated
@HeadyZhang

Copy link
Copy Markdown
Owner

Thanks for catching this @007gzs — this is a legitimate crash path. Empty
YAML values parsing to None instead of dict is a realistic input, especially
for SKILL.md files generated from templates where fields get left blank.

The fix looks good after the latest commit (returning nested instead of
frontmatter when traversal succeeds). One thing before merge:

Could you add a test case? The repo's convention is that any crash fix
gets a regression test so we don't reintroduce it. Something like this in
tests/scanners/test_skill_meta_scanner.py (or wherever the existing tests
for this module live):

​```python
def test_get_openclaw_meta_none_metadata():
"""Empty YAML frontmatter parses metadata as None — should not crash."""
frontmatter = {"name": None, "description": None, "metadata": None}
result = _get_openclaw_meta(frontmatter)
assert isinstance(result, dict)

def test_get_openclaw_meta_empty_dict():
assert _get_openclaw_meta({}) == {}

def test_get_openclaw_meta_non_dict_input():
assert _get_openclaw_meta(None) == {}

def test_get_openclaw_meta_valid_path():
frontmatter = {"metadata": {"openclaw": {"persistence": "session"}}}
result = _get_openclaw_meta(frontmatter)
assert result.get("persistence") == "session"
​```

One design question worth confirming: when the metadata.openclaw path
doesn't resolve (e.g. metadata exists but openclaw is missing), the
current code returns the full frontmatter dict. Is that intentional, or
should it return {} for consistency with the non-dict case? Callers doing
meta.get("persistence") would get different behavior depending on which
failure path they hit. I'd lean toward always returning {} on any
resolution failure so the contract is "returns openclaw metadata dict, or
empty dict" — but happy to hear your thinking.

Once the test is added, this is good to merge. Thanks again.

@007gzs

007gzs commented Apr 24, 2026

Copy link
Copy Markdown
Author

One design question worth confirming: when the metadata.openclaw path doesn't resolve (e.g. metadata exists but openclaw is missing), the current code returns the full frontmatter dict. Is that intentional, or should it return {} for consistency with the non-dict case? Callers doing meta.get("persistence") would get different behavior depending on which failure path they hit. I'd lean toward always returning {} on any resolution failure so the contract is "returns openclaw metadata dict, or empty dict" — but happy to hear your thinking.

I implemented this based on the original code below. In the previous code, if metadata did not contain openclaw, then nested would become {}, and since if nested would be False, it would ultimately return frontmatter.

        nested = (
            frontmatter.get("metadata", {}).get("openclaw", {})
        )
        if nested:
            return nested
        return frontmatter

@HeadyZhang

Copy link
Copy Markdown
Owner

Good point, you're right that preserving the original contract is the correct scope for a crash fix. I missed that the pre-existing behavior was already "return frontmatter when nested resolution fails," so your implementation correctly keeps that semantics. The contract question is really a pre-existing design issue, not something this PR should tackle.
I'll open a separate issue for that.

Just the test case then, and this is good to merge. Thanks for the careful read.

Add tests for skill files with and without metadata.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/test_scanners/test_skill_meta_scanner.py`:
- Around line 423-494: The tests incorrectly assert len(results) == 0 and
duplicate a test name, causing failures and one test to be overridden; update
each new test that calls self.scanner.scan(...) (the tests exercising
_scan_skill_file and scan() which return a SkillMetaScanResult) to assert
len(results) == 1 and then assert results[0].security_findings == [] (or
appropriate expectations), and rename the second test_skill_file_metadata_str to
a unique name (e.g., test_skill_file_metadata_list) so both the JSON-string
metadata test and the list-format metadata test run independently.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8caba34f-1c0a-4b67-9c44-6ea823579a5d

📥 Commits

Reviewing files that changed from the base of the PR and between bcd6456 and 058e033.

📒 Files selected for processing (2)
  • packages/audit/agent_audit/scanners/skill_meta_scanner.py
  • tests/test_scanners/test_skill_meta_scanner.py

Comment thread tests/test_scanners/test_skill_meta_scanner.py Outdated
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants