feat(hooks): add V4 verification gate for DeepSeek V4 subagent results #5437

Open
EvangelosMoschou wants to merge 1 commit into code-yeongyu:dev from EvangelosMoschou:feat/v4-verification-gate

EvangelosMoschou commented on Jun 19, 2026

Contributor

Problem

DeepSeek V4 has a 94% hallucination rate (AA-Omniscience). When Sisyphus runs on V4 and delegates via task()/call_omo_agent, the subagent results need explicit verification: V4 may hallucinate that verification passed and accept fabricated results as fact.

Root Cause

The existing hooks don't have any model-specific verification logic. When a subagent returns, the output is passed directly to Sisyphus without any reminder to verify. On V4, this means Sisyphus might accept hallucinated results without checking.

Fix

New v4-verification-gate hook (ToolGuard tier) that:

  1. Tracks the session model from message.updated events
  2. On task/call_omo_agent completion, appends a verification reminder to the tool output if the session's model is a DeepSeek V4 model
  3. Leaves non-V4 sessions completely unaffected (zero overhead)

The reminder text:

--- V4 VERIFICATION REQUIRED ---
DeepSeek V4 has a 94% hallucination rate. Inspect touched files and rerun checks before accepting these results.
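
The three steps and the reminder text above could be sketched roughly as follows. This is an illustrative sketch, not the code in hook.ts: the event and tool-result shapes, the `createV4VerificationGate` factory, the `isDeepSeekV4` matching rule, and the `deepseek-v4-pro` model ID are all assumptions made for the example.

```typescript
// Hypothetical shapes: the plugin's real event and tool-output types are not shown in this PR.
type ModelEvent = { type: "message.updated"; sessionID: string; modelID: string };
type ToolResult = { output: string };

const REMINDER =
  "\n\n--- V4 VERIFICATION REQUIRED ---\n" +
  "DeepSeek V4 has a 94% hallucination rate. " +
  "Inspect touched files and rerun checks before accepting these results.";

const DELEGATION_TOOLS = new Set(["task", "call_omo_agent"]);

// Assumed matching rule: a model ID containing "deepseek" followed by "v4".
const isDeepSeekV4 = (modelID: string): boolean => /deepseek.*v4/i.test(modelID);

function createV4VerificationGate() {
  const modelBySession = new Map<string, string>();
  return {
    // Step 1: remember the latest model seen for each session.
    onEvent(event: ModelEvent): void {
      modelBySession.set(event.sessionID, event.modelID);
    },
    // Steps 2 and 3: append the reminder only for delegation tools in V4 sessions.
    afterTool(tool: string, sessionID: string, result: ToolResult): void {
      if (!DELEGATION_TOOLS.has(tool)) return;
      const modelID = modelBySession.get(sessionID);
      if (modelID === undefined || !isDeepSeekV4(modelID)) return;
      result.output += REMINDER;
    },
  };
}

// Usage with a hypothetical session and model ID.
const gate = createV4VerificationGate();
gate.onEvent({ type: "message.updated", sessionID: "s1", modelID: "deepseek-v4-pro" });
const result: ToolResult = { output: "subagent finished" };
gate.afterTool("task", "s1", result);
console.log(result.output);
```

Keeping the model cache in a per-hook Map means an unknown session falls through to the no-reminder path, which matches the last test case listed under TDD Evidence.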

TDD Evidence

5 tests covering all paths:

  • V4 model + task tool -> reminder appended
  • V4 model + call_omo_agent tool -> reminder appended
  • Non-V4 model + task tool -> no reminder
  • V4 model + non-delegation tool (read) -> no reminder
  • Unknown session (no cached model) -> no reminder
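
The five cases above can be read as a small decision table. The sketch below expresses them as table-driven checks against a stand-in predicate; `shouldRemind`, its regex, and the model IDs are hypothetical and are not taken from hook.test.ts.

```typescript
// Hypothetical predicate standing in for the hook's decision; not the PR's actual code.
function shouldRemind(modelID: string | undefined, tool: string): boolean {
  const isV4 = modelID !== undefined && /deepseek.*v4/i.test(modelID);
  return isV4 && (tool === "task" || tool === "call_omo_agent");
}

// One row per test case: [cached model ID, tool name, reminder expected].
const cases: Array<[string | undefined, string, boolean]> = [
  ["deepseek-v4-pro", "task", true],            // V4 + task
  ["deepseek-v4-pro", "call_omo_agent", true],  // V4 + call_omo_agent
  ["some-other-model", "task", false],          // non-V4 + task
  ["deepseek-v4-pro", "read", false],           // V4 + non-delegation tool
  [undefined, "task", false],                   // unknown session, no cached model
];

const failures = cases.filter(
  ([modelID, tool, expected]) => shouldRemind(modelID, tool) !== expected,
);
console.log(failures.length === 0 ? "all cases pass" : `${failures.length} case(s) failing`);
```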

All existing tool-guard composition tests (1) and tool-execute-after handler tests (6) still pass.

Research backing

From the librarian research on external MiMo/DeepSeek/MiniMax harnesses:

  • OpenSymphony experiment: MiniMax M3 "converges under review" — review pressure improves output quality
  • Maestro project: DeepSeek V4 Pro is "surprisingly competent when given a clear spec" but requires auto-review
  • MiMo Code: Goal verifier mechanism prevents premature "done" declarations

This hook implements a lightweight version of the "auto-review gate" pattern: instead of running automated checks, it reminds the orchestrator (Sisyphus on V4) to verify subagent results before accepting them.

Files

  • packages/omo-opencode/src/hooks/v4-verification-gate/hook.ts (new) — hook implementation
  • packages/omo-opencode/src/hooks/v4-verification-gate/index.ts (new) — barrel export
  • packages/omo-opencode/src/hooks/v4-verification-gate/hook.test.ts (new) — 5 tests
  • packages/omo-opencode/src/hooks/index.ts — barrel export added
  • packages/omo-opencode/src/plugin/hooks/create-tool-guard-hooks.ts — hook registered
  • packages/omo-opencode/src/config/schema/hooks.ts — hook name added to HookNameSchema

Usage

The hook is enabled by default and can be disabled via disabled_hooks:

{
  "disabled_hooks": ["v4-verification-gate"]
}
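
As a rough illustration of how a disabled_hooks list like the one above might gate registration, the sketch below filters hook names against the config. The `PluginConfig` type, the `enabledHooks` helper, and the `some-other-hook` name are assumptions for the example; the actual logic in create-tool-guard-hooks.ts is not shown in this PR.

```typescript
// Hypothetical config shape; only the disabled_hooks key comes from the PR description.
type PluginConfig = { disabled_hooks?: string[] };

// Return the hook names that remain active after applying disabled_hooks.
function enabledHooks(allHooks: string[], config: PluginConfig): string[] {
  const disabled = new Set(config.disabled_hooks ?? []);
  return allHooks.filter((name) => !disabled.has(name));
}

const hookNames = ["v4-verification-gate", "some-other-hook"];
console.log(enabledHooks(hookNames, { disabled_hooks: ["v4-verification-gate"] }));
console.log(enabledHooks(hookNames, {})); // nothing disabled: both hooks stay active
```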

Complements PR #5403 (DeepSeek V4 Sisyphus prompt + earlier compaction threshold).

Conflict note

This PR and #5438 (V4 checkpoint writer) both modify the same 3 files: hooks.ts, hooks/index.ts, and create-tool-guard-hooks.ts. Whichever merges first, the other will need a rebase to resolve merge conflicts in those files. If you'd like me to batch-rebase one onto the other before merging, let me know.


Summary by cubic

Adds a v4-verification-gate ToolGuard hook that appends a verification reminder to task and call_omo_agent outputs when the session uses a DeepSeek V4 model. This reduces the risk of accepting hallucinated subagent results; non‑V4 sessions are unaffected.

  • New Features

    • Added v4-verification-gate hook in packages/omo-opencode.
    • Tracks the session model from message.updated events.
    • Appends a verification note on task and call_omo_agent completion for DeepSeek V4 models only.
    • Registered in create-tool-guard-hooks and added to HookNameSchema.
  • Migration

    • No action required; enabled by default.
    • To disable: add "v4-verification-gate" to disabled_hooks.

Written for commit 8eb4c25. Summary will update on new commits.

DeepSeek V4 has a 94% hallucination rate (AA-Omniscience). When Sisyphus
runs on V4 and delegates via task()/call_omo_agent, the subagent results
need explicit verification — V4 may hallucinate that verification passed.

This hook:
- Tracks session model from message.updated events
- On task/call_omo_agent completion, if session model is V4, appends
  a verification reminder to the tool output
- Non-V4 sessions are unaffected (no overhead)
- Gated behind disabled_hooks as 'v4-verification-gate'

TDD: 5 tests (V4+task, V4+call_omo_agent, non-V4, V4+non-delegation, unknown session).
All existing tool-guard and tool-execute-after tests still pass.

github-actions bot added the opencode label (OpenCode edition: packages/omo-opencode) on Jun 19, 2026