Summary
AUTO mode currently has classifier-driven approval and denial tracking, but it still needs stronger policy boundaries around agent self-modification and attempts to bypass an AUTO denial.
The main risk is that a model can modify files that affect its own later behavior, permissions, tools, commands, hooks, MCP configuration, or startup instructions while running in AUTO mode. A related risk is that, after an AUTO denial, the model may try to complete the same denied action through another tool or an indirect path.
Current Behavior
AUTO mode has an accept-edits fast path for in-workspace Edit and Write calls when the path does not match the current persistence path patterns. That protects some persistence-related paths, but the protected set does not yet cover several Qwen-specific self-modification surfaces.
Examples of paths that should not be silently allowed through the accept-edits fast path:
- .qwen/settings*.json
- user-level ~/.qwen/settings*.json
- QWEN.md
- AGENTS.md
- .qwen/commands/
- .qwen/agents/
- .qwen/skills/
- .qwen/hooks/
- .mcp.json
The classifier prompt also does not currently have explicit categories for self-modification or denial bypass. When an AUTO block is returned to the model as a tool error, the model may continue by trying a different tool or indirection to achieve the same denied target.
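To make the protected set concrete, here is a minimal sketch of a path matcher for the surfaces listed above. The function name isSelfModificationPath and the constants are hypothetical, not existing Qwen Code APIs. It assumes the input path is relative to the workspace root (or to the home directory for the user-level settings check) and that QWEN.md, AGENTS.md, and .mcp.json are only matched at that root.

```typescript
import * as path from "node:path";

// Hypothetical names for illustration; not Qwen Code's actual API.

// Directories whose contents can change later agent behavior.
const PROTECTED_DIRS = [".qwen/commands", ".qwen/agents", ".qwen/skills", ".qwen/hooks"];

// Single files protected at the root (assumption: root-level only).
const PROTECTED_FILES = ["QWEN.md", "AGENTS.md", ".mcp.json"];

// Matches .qwen/settings.json, .qwen/settings.local.json, and similar.
const SETTINGS_PATTERN = /^\.qwen\/settings[^/]*\.json$/;

/** Returns true when a root-relative path targets a self-modification surface. */
export function isSelfModificationPath(relPath: string): boolean {
  // Use forward slashes and collapse "./" and "dir/../" segments before matching.
  const p = path.posix.normalize(relPath.replace(/\\/g, "/"));
  if (SETTINGS_PATTERN.test(p)) return true;
  if (PROTECTED_FILES.includes(p)) return true;
  return PROTECTED_DIRS.some((dir) => p === dir || p.startsWith(dir + "/"));
}
```

Normalizing first means a request such as src/../.qwen/hooks/pre.sh is treated the same as .qwen/hooks/pre.sh. This string check alone does not catch symlinks, which need a filesystem lookup.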
Proposed Behavior
Strengthen AUTO mode with two explicit safety boundaries:
- Self-modification boundary
  AUTO mode should not fast-path modifications to files or directories that can affect Qwen Code behavior, permissions, startup context, commands, hooks, agents, skills, MCP servers, or project/user instructions.
- Denial-bypass boundary
  After AUTO mode denies an action, the model should not retry the same action or use another tool, shell indirection, generated script, symlink, alias, config change, hook, command file, MCP configuration, encoded payload, or equivalent path to complete the same denied action.
If the denied action is required to satisfy the user request, the model should stop and ask the user for explicit approval instead of attempting a workaround.
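The denial-bypass boundary could be reinforced in the tool result itself. The sketch below shows one possible shape for such a message. The AutoDenial type, the buildAutoDenialMessage function, and the exact wording are all assumptions for illustration, not the current AUTO mode output.

```typescript
// Hypothetical shape of an AUTO denial; field names are illustrative.
export interface AutoDenial {
  toolName: string; // tool that was blocked, e.g. "Write"
  target: string;   // path or command the tool was aimed at
  reason: string;   // short classifier explanation
}

/** Builds the tool-error text returned to the model after an AUTO denial. */
export function buildAutoDenialMessage(d: AutoDenial): string {
  return [
    `AUTO mode denied ${d.toolName} on "${d.target}": ${d.reason}`,
    "Do not retry this action, and do not attempt it through another tool, " +
      "shell indirection, generated script, symlink, or configuration change.",
    "If the action is required to satisfy the user's request, stop and ask the user for explicit approval.",
  ].join("\n");
}
```

Keeping the instruction in the tool result places it next to the failure the model is reacting to, which complements a general rule in the main prompt.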
Suggested Implementation
- Add a centralized self-modification path safety check for AUTO edit/write fast paths.
- Check both the original requested path and resolved symlink target where possible.
- Exclude self-modification paths from the accept-edits fast path and route them through classifier/manual approval instead.
- Add classifier prompt categories for:
  - Self-Modification
  - Auto-Mode Bypass
- Improve the AUTO denial tool-result message so the model is explicitly told not to retry or work around the denied action.
- Add a short main prompt rule: if a tool call is denied, do not retry the same action or attempt an equivalent workaround; ask the user if the action is required.
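The first three items above can be sketched as a single routing function that checks both the requested path and its resolved symlink target. Everything here is an assumption made for illustration: the names routeAutoEdit, resolveReal, and isProtected are hypothetical, isProtected is a compact stand-in for the centralized check, and the real fast path may be structured differently.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

type Route = "fast-path" | "needs-approval";

// Compact stand-in for the centralized self-modification check (hypothetical).
function isProtected(relPath: string): boolean {
  const p = relPath.split(path.sep).join("/");
  if (/^\.qwen\/settings[^/]*\.json$/.test(p)) return true;
  if (/^\.qwen\/(commands|agents|skills|hooks)(\/|$)/.test(p)) return true;
  return ["QWEN.md", "AGENTS.md", ".mcp.json"].includes(p);
}

// Resolve symlinks. For a file that does not exist yet (a new Write),
// resolve the nearest existing ancestor and re-append the remaining segments.
function resolveReal(absPath: string): string {
  try {
    return fs.realpathSync(absPath);
  } catch {
    const parent = path.dirname(absPath);
    if (parent === absPath) return absPath;
    return path.join(resolveReal(parent), path.basename(absPath));
  }
}

/** Decide whether an Edit/Write may use the accept-edits fast path. */
export function routeAutoEdit(workspaceRoot: string, requestedPath: string): Route {
  const root = path.resolve(workspaceRoot);
  const requested = path.resolve(root, requestedPath);
  // Check the path as requested and the path after symlink resolution.
  const candidates = [
    path.relative(root, requested),
    path.relative(resolveReal(root), resolveReal(requested)),
  ];
  for (const rel of candidates) {
    const outside = rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
    if (outside || isProtected(rel)) return "needs-approval";
  }
  return "fast-path";
}
```

In this sketch "needs-approval" means the call leaves the fast path and goes to the classifier or manual approval. Paths that resolve outside the workspace are also routed there, since the fast path only applies to in-workspace edits.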
Acceptance Criteria
- Edit / Write to ordinary files inside the workspace can still use the AUTO accept-edits fast path.
- Edit / Write to .qwen/settings*.json, QWEN.md, AGENTS.md, .qwen/commands/, .qwen/agents/, .qwen/skills/, .qwen/hooks/, or .mcp.json does not use the accept-edits fast path.
- Symlinks targeting protected self-modification paths do not use the accept-edits fast path.
- Classifier prompts explicitly define and block unsafe self-modification and AUTO denial bypass attempts.
- AUTO denial messages instruct the model not to retry the same denied action or work around it through another tool.
- Tests cover protected Qwen paths, symlink targets, ordinary workspace writes, and denial-bypass prompt/message behavior.
Notes
This is related to existing AUTO mode denial tracking and observability work, but it should be handled as a separate policy-hardening change rather than telemetry-only work.