feat(openaicompat): reasoning_content + provider quirks for Kimi K2.5 & GLM 5.0#1
Open
danshapiro wants to merge 13 commits into
Open
feat(openaicompat): reasoning_content + provider quirks for Kimi K2.5 & GLM 5.0#1danshapiro wants to merge 13 commits into
danshapiro wants to merge 13 commits into
Conversation
Extends the openaicompat adapter with: - reasoning_content parsing (non-streaming and streaming) - reasoning_content round-tripping in outgoing messages - Usage.ReasoningTokens extraction and estimation - ProviderQuirks system for per-provider behavioral overrides - Named presets for kimi-k2.5 and glm-5 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fix two bugs in the openaicompat-reasoning plan: 1. FinishReasonMap implementation: the original approach passed the mapped value through NormalizeFinishReason, which would overwrite .Raw with the mapped value instead of preserving the original provider value (e.g., "sensitive"). The fix constructs FinishReason explicitly when a quirk mapping is applied, preserving the original raw value for diagnostics. 2. Streaming reasoning-to-tool-call transition: added logic to close REASONING_END before the first tool call starts, matching Anthropic's event ordering. Without this, reasoning would remain "open" until [DONE] when tool calls follow reasoning with no text in between (common for Kimi K2.5). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…provider quirks Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ntent Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…n outgoing messages Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ter support Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…schema, and finish reason quirks Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…v var config Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…d verify reasoning in final response Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…narios Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
obra
added a commit
that referenced
this pull request
May 11, 2026
The past-viewer transcript pane was empty because serveReplay decoded
top-level {kind, data} fields, but transcript entries are
{kind:"entry", seq, turn:{kind, message, ...}} — the data field was
always missing. Even with the right schema, transcript turns are
message-shaped objects, not the runtime SSE event stream the renderer
listens for.
Translate each turn into the equivalent renderer events:
header → SESSION_START; USER_INPUT → USER_INPUT;
ASSISTANT text part → ASSISTANT_TEXT_START + ASSISTANT_TEXT_END;
ASSISTANT tool_call part → TOOL_CALL_START;
TOOL_RESULTS → TOOL_CALL_END (using a call_id → tool_name map carried
across turns).
Closes kata #1.
obra
added a commit
that referenced
this pull request
May 11, 2026
…ests
The previous rendering produced three redundant elements per task
transition: a steering "current task" divider, a tool-call card
("tasks update · 1 updated"), and a body row ("✓ #1 → done"). A
reader couldn't follow the narrative. This rewrite is a synthesis of
six parallel design treatments (ambient-strip, timeline-lane,
status-line-min, collapsible-group, action-narrative, kanban-beam),
landing on the action-narrative direction.
Conversation transcript changes:
- "current task" steering is suppressed entirely (it fired before
nearly every turn and duplicated information available elsewhere).
Its task title still seeds the description cache so subsequent
updates can name the task.
- task_list tool calls render as a single line of system prose:
"marked \"Understand task\" done"
"added 3 tasks: \"Phase 1\" · \"Phase 2\" · \"Phase 3\""
"marked \"X\" done — took longer than expected, started \"Y\""
Multiple updates collapse into one comma-joined sentence (cap 4).
action=view is suppressed entirely.
- full-list steering (post-compaction) collapses to one
"task list reloaded · N items" pointer that opens the side panel.
- loop / read-only / all-done / unknown steering still render as
the existing collapsible divider (these are exceptional events).
Tasks button changes:
- Shows a live progress badge next to the icon (e.g. "3/7"),
updated by a 5s background poller and immediately after each
task_list call. Visible without opening the panel.
Tasks side panel changes:
- EVERY task is expandable now (not just notes-having ones). Click
a row to reveal a metadata <dl> with type, status, depends_on,
reasoning_effort, full prompt, and notes log. Chevron rotates on
open. Description stays in the row summary.
Description cache:
- Client-side Map seeded from steering full-list parses,
task_list-append args (best-effort), and the /tasks poll. Used
by the system-line renderer to print "marked \"X\" done" instead
of "marked #N done". Falls back to "#N" when uncached.
JSDOM tests under cmd/serf-hub/jstest:
- test-renderer.js: replays the captured event sequence from
Jesse's screenshot scenario; asserts 0 steering, 0 tool-call
cards, 1 system-line ("marked \"Understand task\" done").
- test-renderer-advanced.js: append (3 tasks named), multi-update
with notes, full-list pointer, loop steering kept, view action
suppressed, missing-description #N fallback.
- test-sidebar.js: every task expandable; metadata dl populated
with prompt/type/depends_on/reasoning_effort/notes; badge shows
1/3 for the test fixture.
obra
added a commit
that referenced
this pull request
Jun 4, 2026
…te (PRI-1975) Done-bar #1 is "go test -race ./... green as a PERMANENT gate." The module is race-clean (Phase 0 + the two hub-race fixes), but CI ran `go test -short` without -race, so a future race could land green. Switch the CI test step to `go test -race -short -count=1 ./...` so the race detector runs on every build. -short is kept (CI skips resource-dependent tests); the full `go test -race ./...` remains the local dev gate. Verified `go test -race -short ./...` green locally. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
obra
added a commit
that referenced
this pull request
Jun 7, 2026
Final review caught three salvage artifacts in MODEL-FACING rendered text (the design doc was right; the render strings were stale): the elided-turns marker pointed at find_session_transcripts(transcript_ref) for an outline (find takes no ref — that's read format=outline); the header said format=transcript_jsonl (now jsonl); the hard-cap note said full_result_for (now expand_turn). Tests asserting the old strings updated. #1 fired on the default read of any long session. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
obra
added a commit
that referenced
this pull request
Jun 8, 2026
A practical, example-driven guide for authoring hooks in a serf plugin: discovery (--plugin-dir, manifest path / inline / default hooks.json), the nine events serf fires and their tiers, matcher semantics with worked examples, the command handler (shell vs exec-form args, shell selection, env vars), the input JSON fields, exit-code semantics per event, the Go RE2 caveat, the loud misconfiguration warnings, and a copy-pasteable example hooks.json. The shell->Bash tool-name caveat is called out prominently as 'the #1 mistake'. Linked from 07 (the contract) and from architecture.md's hooks-package bullet.
obra
added a commit
that referenced
this pull request
Jun 8, 2026
…broken anchor Adversarial review found, and I verified against the code, three real defects: - hooks.md listed `stopReason` in the "fields serf honors" block, but the output parser never parses it — `parsedHookOutput` has no such field (the only StopReason in the tree is the goal subsystem). It was wrongly lifted from old 07's *reserved* universal-JSON block into the *shipped* list. Removed it. `continue`/`suppressOutput` are assigned to `parsedHookOutput` but absent from `RunResult`/`PreToolUseResult`, so they never reach a caller — reframed both docs to say they are parsed but not acted on, not honored. - The "Output is capped at 10,000 characters" claim has no basis: no length cap exists anywhere in the hook output path. Removed from hooks.md and 07. - The shell->Bash gotcha anchor `#the-1-mistake-shellbash` is wrong; the GitHub slug for "## The #1 mistake: `shell` -> `Bash`" is `#the-1-mistake-shell--bash` (the arrow between spaces yields a double hyphen, matching the Phase-B/C convention already used). Fixed all four links (3 in hooks.md, 1 new cross-doc link in 07).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
reasoning_contentin non-streaming and streaming Chat Completions responses, emitting unifiedContentThinking/StreamEventReasoning*typesreasoning_contentin outgoing assistant messages for multi-turn reasoning preservationreasoning_tokensfromcompletion_tokens_details; fall back to chars/4 estimationProviderQuirkssystem with named presets (kimi-k2.5,glm-5) configurable viaOPENAI_COMPATIBLE_PROVIDER_QUIRKSenv varTest plan
go vet ./llm/providers/openaicompat/clean🤖 Generated with Claude Code