
feat: add structured agent audit prompt pack #1354

Closed
huangrichao2020 wants to merge 3 commits into FoundationAgents:main from huangrichao2020:codex/agent-audit-prompt-pack

Conversation


huangrichao2020 commented Apr 23, 2026

Summary

  • add app/prompt/agent_audit.py, a reusable prompt pack for structured agent-runtime audits
  • include playbooks, advanced playbooks, rubric guidance, report schema, and example report data in code
  • add tests that validate the required audit artifacts, schema shape, playbook coverage, and prompt builder output
  • mention the prompt pack in the README

What this adds

This PR adds a reusable prompt pack for diagnosing agent runtime failures in OpenManus.

The prompt pack is designed for cases such as:

  • wrapper regression
  • tool-discipline failures
  • stale memory contamination
  • hidden repair or retry layers
  • rendering / transport mutation

It asks the agent to build structured audit artifacts before writing a final diagnosis:

  1. agent_check_scope.json
  2. evidence_pack.json
  3. failure_map.json
  4. agent_check_report.json
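A downstream workflow could verify that an agent actually produced all four artifacts, in valid JSON, before accepting its final diagnosis. The sketch below is illustrative only: `REQUIRED_ARTIFACTS` and `validate_artifacts` are hypothetical helpers written for this description, not part of `app/prompt/agent_audit.py`.

```python
import json

# Hypothetical helper: the four artifact names, in the order the prompt
# asks the agent to produce them (mirrors the numbered list above).
REQUIRED_ARTIFACTS = [
    "agent_check_scope.json",
    "evidence_pack.json",
    "failure_map.json",
    "agent_check_report.json",
]


def validate_artifacts(artifacts: dict) -> list:
    """Return a list of problems: required artifacts that are missing
    or whose payload is not valid JSON."""
    problems = []
    for name in REQUIRED_ARTIFACTS:
        payload = artifacts.get(name)
        if payload is None:
            problems.append(f"missing: {name}")
            continue
        try:
            json.loads(payload)
        except json.JSONDecodeError:
            problems.append(f"invalid JSON: {name}")
    return problems
```

A workflow would call `validate_artifacts` on the agent's emitted files and refuse to read the diagnosis while the problem list is non-empty.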

The module also includes:

  • standard playbooks
  • advanced playbooks
  • an audit rubric
  • a report schema
  • an example structured report
  • build_agent_audit_prompt() for composing the prompt with a selected playbook

How to use it

This PR does not add a new CLI command. Instead, it adds a reusable prompt pack that developers can attach to an agent, flow, evaluation script, or debugging workflow.

Example usage in code:

from app.prompt.agent_audit import build_agent_audit_prompt

audit_prompt = build_agent_audit_prompt("tool-discipline")

The returned prompt can be used as a system prompt or supplemental instruction for an OpenManus agent:

from app.prompt.agent_audit import build_agent_audit_prompt

agent.system_prompt = build_agent_audit_prompt("wrapper-regression")

Supported playbooks include:

  • wrapper-regression
  • memory-contamination
  • tool-discipline
  • rendering-transport
  • hidden-agent-layers
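For context, a playbook-keyed builder like `build_agent_audit_prompt` can be sketched as a lookup over the playbook names above. This is a hypothetical stand-in (deliberately named `build_audit_prompt`, with made-up playbook text) and is not the module's actual implementation:

```python
# Base instruction shared by every playbook (illustrative wording).
_BASE_PROMPT = (
    "You are auditing an agent runtime. Build the structured audit "
    "artifacts before writing a final diagnosis."
)

# Playbook keys mirror the supported list above; descriptions are made up.
PLAYBOOKS = {
    "wrapper-regression": "Check for regressions introduced by wrapper layers.",
    "memory-contamination": "Trace stale or cross-session memory into the context.",
    "tool-discipline": "Verify tool calls match declared schemas and intent.",
    "rendering-transport": "Look for mutation during rendering or transport.",
    "hidden-agent-layers": "Surface undeclared repair or retry layers.",
}


def build_audit_prompt(playbook: str) -> str:
    """Compose the audit prompt for one playbook; reject unknown keys."""
    if playbook not in PLAYBOOKS:
        raise ValueError(
            f"unknown playbook: {playbook!r}; choose from {sorted(PLAYBOOKS)}"
        )
    return f"{_BASE_PROMPT}\n\nFocus area ({playbook}): {PLAYBOOKS[playbook]}"
```

Failing fast on an unknown playbook key keeps a typo from silently producing a generic, unfocused audit prompt.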

A user-facing prompt could look like:

Use the agent-audit prompt pack to audit this agent runtime.
Focus on tool-discipline failures and stale evidence reuse.
Build agent_check_scope.json, evidence_pack.json, failure_map.json, and agent_check_report.json before giving the final diagnosis.

Expected final output includes:

  • executive verdict
  • severity-ranked findings
  • conflict map
  • contamination paths
  • ordered fix plan
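The expected output above maps naturally onto a structured report. The dict below is an illustrative shape only; the field names and contents are assumptions, not the PR's actual report schema:

```python
# Hypothetical report shape; keys echo the expected-output list above,
# but the real schema in app/prompt/agent_audit.py may differ.
example_report = {
    "executive_verdict": "Tool-discipline failure caused by stale evidence reuse.",
    "findings": [
        {"severity": "medium", "summary": "Retry layer silently rewrote a failed tool call."},
        {"severity": "high", "summary": "Agent reused evidence_pack entries from a prior run."},
    ],
    "conflict_map": [
        {"claim_a": "tool output X", "claim_b": "memory entry Y"},
    ],
    "contamination_paths": [
        "session memory -> planner context -> tool arguments",
    ],
    "fix_plan": [
        "Invalidate cached evidence at session start.",
        "Log and surface retry-layer rewrites.",
    ],
}

# "Severity-ranked findings" means sorting findings by severity.
severity_rank = {"high": 0, "medium": 1, "low": 2}
ranked = sorted(example_report["findings"], key=lambda f: severity_rank[f["severity"]])
```

Keeping the verdict, findings, conflicts, contamination paths, and fix plan as separate keys lets tooling validate each part independently.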

Why this fits OpenManus

OpenManus is a general agent framework, and runtime reliability issues often come from the orchestration around the model rather than the model itself. This prompt pack gives users and developers a reusable, evidence-first way to inspect those layers without changing the core agent loop.

I kept this as a prompt module and tests instead of adding a new runtime dependency or tool surface.

Verification

I ran the following locally:

cd /tmp/openmanus-agent-audit
uv venv .venv --python 3.12
uv pip install pytest black==23.1.0 isort==5.12.0 autoflake==2.0.1 pre-commit==4.4.0
.venv/bin/python -m isort app/prompt/agent_audit.py tests/test_agent_audit_prompt.py
.venv/bin/python -m black app/prompt/agent_audit.py tests/test_agent_audit_prompt.py
.venv/bin/python -m pytest tests/test_agent_audit_prompt.py -q
.venv/bin/python -m py_compile app/prompt/agent_audit.py tests/test_agent_audit_prompt.py

Results:

  • 4 passed
  • py_compile passed

huangrichao2020 closed this by deleting the head repository on Apr 25, 2026.