feat(hooks): add V4 verification gate for DeepSeek V4 subagent results#5437
Open
EvangelosMoschou wants to merge 1 commit into
DeepSeek V4 has a 94% hallucination rate (AA-Omniscience). When Sisyphus runs on V4 and delegates via task()/call_omo_agent, the subagent results need explicit verification, because V4 may hallucinate that verification passed. This hook:
- Tracks the session model from message.updated events
- On task/call_omo_agent completion, if the session model is V4, appends a verification reminder to the tool output
- Leaves non-V4 sessions unaffected (no overhead)
- Can be turned off via disabled_hooks as 'v4-verification-gate'
TDD: 5 tests (V4+task, V4+call_omo_agent, non-V4, V4+non-delegation, unknown session). All existing tool-guard and tool-execute-after tests still pass.
Problem
DeepSeek V4 has a 94% hallucination rate (AA-Omniscience). When Sisyphus runs on V4 and delegates via `task()`/`call_omo_agent`, the subagent results need explicit verification: V4 may hallucinate that verification passed and accept fabricated results as fact.
Root Cause
The existing hooks don't have any model-specific verification logic. When a subagent returns, the output is passed directly to Sisyphus without any reminder to verify. On V4, this means Sisyphus might accept hallucinated results without checking.
Fix
A new `v4-verification-gate` hook (ToolGuard tier) that:
- Tracks the session model from `message.updated` events
- On `task`/`call_omo_agent` completion, if the session's model is a DeepSeek V4 model, appends a verification reminder to the tool output
The reminder text:
TDD Evidence
5 tests covering all paths:
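- V4 session + `task` tool -> reminder appended
- V4 session + `call_omo_agent` tool -> reminder appended
- Non-V4 session + `task` tool -> no reminder
- V4 session + non-delegation tool (`read`) -> no reminder
- Unknown session
All existing tool-guard composition tests (1) and tool-execute-after handler tests (6) still pass.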
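The five paths above can be read as a small decision table. The sketch below is only an illustration of that table, not the PR's test file: `isDeepSeekV4`, `shouldAppendReminder`, the model ID strings, and the regex used to recognise V4 are all hypothetical, and it assumes an unknown session is treated like a non-V4 one.

```typescript
// Hypothetical names throughout; the PR's hook.ts and hook.test.ts are not shown here.
const DELEGATION_TOOLS = new Set(["task", "call_omo_agent"]);

// Assumption: a V4 model can be recognised from its model ID string.
function isDeepSeekV4(modelID: string | undefined): boolean {
  return modelID !== undefined && /deepseek[-_ ]?v4/i.test(modelID);
}

// A reminder is warranted only for delegation tools in a V4 session.
function shouldAppendReminder(modelID: string | undefined, tool: string): boolean {
  return isDeepSeekV4(modelID) && DELEGATION_TOOLS.has(tool);
}

// One row per test path: [session model, tool name, expect reminder?]
const cases: Array<[string | undefined, string, boolean]> = [
  ["deepseek-v4", "task", true],
  ["deepseek-v4", "call_omo_agent", true],
  ["some-other-model", "task", false],
  ["deepseek-v4", "read", false],
  [undefined, "task", false], // unknown session: assumed to behave like non-V4
];

for (const [model, tool, expected] of cases) {
  const actual = shouldAppendReminder(model, tool);
  console.log(`${model ?? "unknown"} + ${tool}: ${actual === expected ? "ok" : "MISMATCH"}`);
}
```

Keeping the decision in a pure function like this makes each path testable without constructing events or tool outputs.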
Research backing
From the librarian research on external MiMo/DeepSeek/MiniMax harnesses:
This hook implements a lightweight version of the "auto-review gate" pattern: instead of running automated checks, it reminds the orchestrator (Sisyphus on V4) to verify subagent results before accepting them.
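As a rough illustration of this reminder-based gate, the sketch below tracks the model per session and appends a reminder to delegation results. Everything in it is an assumption made for the example: the event and result shapes, the `createVerificationGate` factory, the V4-detection regex, and the `REMINDER` string, which is a placeholder rather than the PR's actual wording.

```typescript
// All types and names below are hypothetical stand-ins, not the plugin's real API.
type MessageUpdatedEvent = { type: "message.updated"; sessionID: string; modelID: string };
type ToolResult = { sessionID: string; tool: string; output: string };

// Placeholder only: the PR's actual reminder wording is not reproduced here.
const REMINDER = "\n\n[placeholder verification reminder]";

function createVerificationGate() {
  // Most recent model seen for each session.
  const modelBySession = new Map<string, string>();

  return {
    // Record the session's model whenever a message.updated event arrives.
    onEvent(event: MessageUpdatedEvent): void {
      modelBySession.set(event.sessionID, event.modelID);
    },
    // Append the reminder to delegation results in V4 sessions; return everything else unchanged.
    afterTool(result: ToolResult): ToolResult {
      const model = modelBySession.get(result.sessionID);
      const isV4 = model !== undefined && /deepseek[-_ ]?v4/i.test(model);
      const isDelegation = result.tool === "task" || result.tool === "call_omo_agent";
      return isV4 && isDelegation ? { ...result, output: result.output + REMINDER } : result;
    },
  };
}

const gate = createVerificationGate();
gate.onEvent({ type: "message.updated", sessionID: "s1", modelID: "deepseek-v4" });
const gated = gate.afterTool({ sessionID: "s1", tool: "task", output: "subagent says: done" });
console.log(gated.output.endsWith(REMINDER));
```

Because the gate only adds text to the tool output, it performs no checks itself; whether the orchestrator acts on the reminder is outside the hook's control.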
Files
- `packages/omo-opencode/src/hooks/v4-verification-gate/hook.ts` (new): hook implementation
- `packages/omo-opencode/src/hooks/v4-verification-gate/index.ts` (new): barrel export
- `packages/omo-opencode/src/hooks/v4-verification-gate/hook.test.ts` (new): 5 tests
- `packages/omo-opencode/src/hooks/index.ts`: barrel export added
- `packages/omo-opencode/src/plugin/hooks/create-tool-guard-hooks.ts`: hook registered
- `packages/omo-opencode/src/config/schema/hooks.ts`: hook name added to `HookNameSchema`
Usage
The hook is on by default and can be disabled via `disabled_hooks`:
`{ "disabled_hooks": ["v4-verification-gate"] }`
Complements PR #5403 (DeepSeek V4 Sisyphus prompt + earlier compaction threshold).
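For readers unfamiliar with `disabled_hooks`, one plausible way such a list gates registration is sketched below. This is an assumption about the mechanism, not the code in `create-tool-guard-hooks.ts`: the `Config` and `Hook` shapes, the `enabledHooks` helper, and the `another-hook` name are all hypothetical.

```typescript
// Hypothetical shapes; the real registration lives in create-tool-guard-hooks.ts.
type Config = { disabled_hooks?: string[] };
type Hook = { name: string };

// Keep every hook whose name is not listed in disabled_hooks.
function enabledHooks(all: Hook[], config: Config): Hook[] {
  const disabled = new Set(config.disabled_hooks ?? []);
  return all.filter((hook) => !disabled.has(hook.name));
}

const hooks: Hook[] = [{ name: "v4-verification-gate" }, { name: "another-hook" }];
const active = enabledHooks(hooks, { disabled_hooks: ["v4-verification-gate"] });
console.log(active.map((h) => h.name)); // only "another-hook" remains
```

With an empty or missing `disabled_hooks`, every hook stays registered, which matches the on-by-default behaviour described above.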
Conflict note
This PR and #5438 (V4 checkpoint writer) both modify the same 3 files: `hooks.ts`, `hooks/index.ts`, and `create-tool-guard-hooks.ts`. Whichever merges first, the other will need a rebase to resolve merge conflicts in those files. If you'd like me to batch-rebase one onto the other before merging, let me know.
Summary by cubic
Adds a `v4-verification-gate` ToolGuard hook that appends a verification reminder to `task` and `call_omo_agent` outputs when the session uses a DeepSeek V4 model. This reduces the risk of accepting hallucinated subagent results; non-V4 sessions are unaffected.
New Features
- Added the `v4-verification-gate` hook in `packages/omo-opencode`.
- Tracks the session model from `message.updated` events.
- Appends the reminder on `task` and `call_omo_agent` completion for DeepSeek V4 models only.
- Registered in `create-tool-guard-hooks` and added to `HookNameSchema`.
Migration
- To disable the hook, add `"v4-verification-gate"` to `disabled_hooks`.
Written for commit 8eb4c25. Summary will update on new commits.