Commit 57e8e5b

feat(cli)!: add agent-first chat output contract
1 parent 6ff0f7b commit 57e8e5b

15 files changed

Lines changed: 817 additions & 177 deletions

cli/AGENTS.md

Lines changed: 7 additions & 5 deletions
@@ -77,11 +77,13 @@ token (emitting as a JSON array is planned for v0.8).
 `detail` carries structured per-error context (e.g. `unknown_subcommand`'s
 `available[]` list).

-### NDJSON event stream (chat / session ask)
+### Buffered JSON and NDJSON streams (chat / session ask)

-`--format json` and `--format ndjson` both produce one JSON event per line —
-no envelope wrapping. The CLI injects exactly one event (`init`) at the head;
-all subsequent events pass through verbatim from the SDK:
+`--format json` (the default) buffers the SSE stream and emits one normal
+success envelope containing the final answer, slim references, session
+pointers, and optional thinking (`--verbose`). `--format ndjson` is the raw
+event/debug surface: the CLI injects exactly one `init` event at the head and
+passes all subsequent SDK events through verbatim:

 ```
 {"type":"init","session_id":"...","kb_id":"...","profile":"...","agent_id":"..."}
@@ -170,7 +172,7 @@ is or isn't aligned with.

 | | |
 |---|---|
-| **WeKnora** | streaming commands (`chat`, `session ask`) emit bare `{type:...}` per line; no envelope |
+| **WeKnora** | `chat` / `session ask --format ndjson` emit bare `{type:...}` per line; default JSON buffers the stream into one envelope |
 | **Rationale** | This matches established practice across NDJSON-emitting CLIs and webhook protocols. A streaming envelope requires unwrap before dispatch — net burden with no benefit. |

 ### 5. No `schema_version` field in payload

cli/CHANGELOG.md

Lines changed: 22 additions & 0 deletions
@@ -12,6 +12,28 @@ CLI history before v0.3 is recorded in the project root

 ## [Unreleased]

+### Breaking
+- `chat` and `session ask` now distinguish JSON from NDJSON: the default
+  `--format json` buffers the stream into one `{ok:true,data:{...}}` envelope;
+  use `--format ndjson` for the previous raw event stream.
+- Buffered JSON and MCP chat/session references clear full chunk `content`;
+  fetch the passage on demand with `chunk view <parent_chunk_id>` (or the
+  reference `id` when no parent is present).
+- `chat` / `session ask --format text` hide thinking and reflection by default;
+  pass `--verbose` to include the reasoning trace.
+
+### Added
+- `chat` / `session ask --verbose` includes thinking and reflection in buffered
+  JSON and rendered text output.
+- Buffered chat/session errors include the auto-created `session_id` in
+  `error.detail` so interrupted sessions remain recoverable.
+
+### Changed
+- Buffered JSON/MCP references omit full chunk content while preserving chunk
+  identifiers and citation metadata; fetch full passages with `chunk view`.
+- NDJSON remains an unmodified SDK event trace, including reasoning and full
+  reference payloads.
+
 ### Fixed
 - Streaming SDK calls are no longer cut off by the client's default 30-second
   timeout (explicit `WithTimeout` values remain honored), and SSE data lines
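
The changelog's rule for slim references (use `parent_chunk_id` with `chunk view`, falling back to the reference `id` when no parent is present) could be sketched as follows. The `reference` struct, its JSON tags, and the `chunkViewID` helper are hypothetical names for illustration; only the two identifiers and the `chunk view` command come from the changelog text.

```go
package main

import "fmt"

// reference mirrors only the two identifiers the changelog mentions.
// The JSON field names are assumptions, not the SDK's confirmed schema.
type reference struct {
	ID            string `json:"id"`
	ParentChunkID string `json:"parent_chunk_id"`
}

// chunkViewID picks the identifier to pass to `chunk view`:
// the parent chunk when present, otherwise the reference's own id.
func chunkViewID(ref reference) string {
	if ref.ParentChunkID != "" {
		return ref.ParentChunkID
	}
	return ref.ID
}

func main() {
	refs := []reference{
		{ID: "chunk_child", ParentChunkID: "chunk_parent"}, // invented IDs
		{ID: "chunk_solo"},
	}
	for _, r := range refs {
		fmt.Println("weknora chunk view", chunkViewID(r))
	}
}
```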

cli/cmd/chat/chat.go

Lines changed: 142 additions & 41 deletions
@@ -1,21 +1,24 @@
 // Package chat implements `weknora chat <text>` - the streaming RAG answer
 // entry point.
 //
-// Two output modes share a single SDK call:
+// Three output modes share a single SDK call:
 //
-//   - Stream mode (TTY + --format text): write each StreamResponse.Content
-//     fragment directly to iostreams.IO.Out as it arrives, then print a
-//     footer with knowledge references. This is the "feels alive" UX a
-//     human typing in a terminal expects.
+//   - JSON mode (--format json, the default): accumulate the whole stream,
+//     then emit one success envelope {ok, data:{answer, references,
+//     session_id}}. The agent-recommended shape — parse one object
+//     instead of reassembling an event stream.
 //
-//   - NDJSON mode (--format json / --format ndjson / pipe): inject a CLI
-//     "init" event at stream head, then pass through every SDK event verbatim
-//     as NDJSON lines. Agents and pipes get a live event stream they can
-//     parse incrementally. --format json routes here too — buffered JSON
-//     envelope makes no sense for a streaming command.
+//   - Text mode (--format text): write each StreamResponse.Content fragment
+//     directly to iostreams.IO.Out as it arrives (TTY), then print a
+//     references footer. The "feels alive" UX a human typing in a terminal
+//     expects.
+//
+//   - NDJSON mode (--format ndjson): inject a CLI "init" event at stream
+//     head, then pass through every SDK event verbatim as NDJSON lines —
+//     the raw protocol trace, for debugging / advanced consumers.
 //
 // The SDK's KnowledgeQAStream callback contract is invoked sequentially on
-// one goroutine, so neither mode needs locking. The runChat core takes a
+// one goroutine, so no mode needs locking. The runChat core takes a
 // ChatService interface so tests inject a fake without standing up a real
 // SSE server.
 package chat
@@ -35,19 +38,22 @@ import (
 	sdk "github.com/Tencent/WeKnora/client"
 )

-// chatFields enumerates the NDJSON init-event fields surfaced for
-// `--format json` / `--format ndjson` discovery on `chat`. Reflects the
-// InitEvent head line + the raw SDK event vocabulary.
+// chatFields enumerates the fields surfaced for `--jq` projection discovery
+// on `chat`: the json object's data fields + the raw SDK event vocabulary
+// used by --format ndjson.
 var chatFields = []string{
-	"session_id", "kb_id",
-	// SDK event fields (pass-through): response_type, content, done,
-	// knowledge_references, assistant_message_id, session_id
+	"answer", "references", "thinking", "session_id", "assistant_message_id",
+	// NDJSON init + SDK event fields.
+	"type", "kb_id", "profile", "response_type", "content", "done", "knowledge_references", "data",
 }

 type Options struct {
 	Query     string
 	KBID      string
 	SessionID string
+	// Verbose surfaces the model's thinking/retrieval process in JSON/text.
+	// NDJSON is always raw and is unaffected by presentation flags.
+	Verbose bool
 }

 // ChatService is the narrow SDK surface this command depends on. *sdk.Client
@@ -69,12 +75,15 @@ answer back. By default a fresh session is created on first invocation; pass
 --session to continue an existing conversation.

 Modes:
+  --format json (default): one JSON object {ok, data:{answer, references,
+                   session_id}} once the stream completes —
+                   parse one object, no event reassembly. The
+                   agent-recommended shape.
   --format text: live token streaming + reference footer
-  --format json / --format ndjson / pipe (default): NDJSON event stream —
-                   one init line at head (session_id, kb_id),
-                   then raw SDK events verbatim. Both json
-                   and ndjson flags produce the same NDJSON
-                   stream.`,
+  --format ndjson: raw NDJSON event stream — init line (session_id,
+                   kb_id) then SDK events verbatim. Debug / advanced.
+
+Pass --verbose to include the model's thinking/reflection in JSON/text output.`,
 		Example: `  weknora chat "What is RRF?" --kb a32a63ff-fb36-4874-bcaa-30f48570a694
   weknora chat "Summarise this design doc" --kb my-kb --format json
   weknora chat "Continue?" --session sess_abc`,
@@ -103,12 +112,16 @@ Modes:
 	}
 	cmdutil.AddKBFlag(cmd)
 	cmd.Flags().StringVar(&opts.SessionID, "session", "", "Continue an existing chat session (skip auto-create)")
+	cmd.Flags().BoolVar(&opts.Verbose, "verbose", false, "Include the model's thinking/retrieval process in JSON/text output")
 	cmdutil.AddFormatFlag(cmd, chatFields...)
 	cmdutil.SetAgentHelp(cmd, cmdutil.AgentHelp{
-		UsedFor:       "Ask a streaming RAG question against a knowledge base. Produces an NDJSON event stream: init line (session_id, kb_id) then raw SDK events. Use --format json or --format ndjson.",
+		UsedFor:       "Ask a RAG question against a knowledge base. Default (--format json) returns ONE JSON object {ok, data:{answer, references, session_id}} after the stream completes — parse one object, no event reassembly. --format ndjson streams raw SDK events; --format text streams live tokens. --verbose adds the model's thinking to JSON/text.",
 		RequiredFlags: []string{"--kb"},
-		Examples:      []string{`weknora chat "What is RRF?" --kb kb_abc --format json`},
-		Output:        "NDJSON stream: {type:init, session_id, kb_id} then SDK events (response_type, content, done, knowledge_references, ...)",
+		Examples: []string{
+			`weknora chat "What is RRF?" --kb kb_abc`,
+			`weknora chat "What is RRF?" --kb kb_abc --jq '.data.answer'`,
+		},
+		Output: "Default --format json: {ok, data:{answer, references[], session_id, thinking?}}. --format ndjson: {type:init, session_id, kb_id} then SDK events (response_type, content, done, knowledge_references, ...).",
 	})
 	return cmd
 }
@@ -128,11 +141,8 @@ func runChat(ctx context.Context, opts *Options, fopts *cmdutil.FormatOptions, s
 		return cmdutil.NewError(cmdutil.CodeServerError, "chat: no SDK client available")
 	}

-	// Streaming commands route --format json AND --format ndjson to the
-	// NDJSON event-stream path. A buffered envelope makes no sense for a
-	// streaming command. Only --format text uses the live renderer.
-	ndjsonMode := fopts != nil && (fopts.Mode == cmdutil.FormatJSON || fopts.Mode == cmdutil.FormatNDJSON)
-
+	// --format selects the output shape: json (default) accumulates and
+	// emits one object, ndjson streams raw SDK events, text renders live.
 	sessionID := opts.SessionID
 	autoCreated := false
 	if sessionID == "" {
@@ -156,24 +166,26 @@ func runChat(ctx context.Context, opts *Options, fopts *cmdutil.FormatOptions, s
 		autoCreated = true
 	}

-	if ndjsonMode {
+	if fopts != nil && fopts.Mode == cmdutil.FormatNDJSON {
 		return runChatNDJSON(ctx, opts, sessionID, svc)
 	}
+	if fopts != nil && fopts.Mode == cmdutil.FormatJSON {
+		return runChatJSON(ctx, opts, fopts, sessionID, svc)
+	}

 	// Surface the auto-created session ID up-front so a user who hits ^C
 	// mid-stream still has the pointer to resume - no need to scroll back
-	// past tokens. Skipped in NDJSON mode (it appears in the init event).
+	// past tokens. Skipped in json/ndjson mode (session_id is in the output).
 	if autoCreated {
 		fmt.Fprintf(iostreams.IO.Err, "session: %s (use --session to continue)\n", sessionID)
 	}

 	return runChatText(ctx, opts, sessionID, autoCreated, svc)
 }

-// runChatNDJSON handles --format json and --format ndjson paths.
-// Emits a CLI init event at stream head, then passes every SDK event through
-// verbatim as NDJSON lines. No buffering — callers parse the stream
-// incrementally.
+// runChatNDJSON handles the --format ndjson path: emits a CLI init event at
+// stream head, then passes every SDK event through verbatim as NDJSON lines.
+// No buffering — callers parse the stream incrementally.
 func runChatNDJSON(ctx context.Context, opts *Options, sessionID string, svc ChatService) error {
 	w := iostreams.IO.Out

@@ -198,6 +210,8 @@ func runChatNDJSON(ctx context.Context, opts *Options, sessionID string, svc Cha
 		Channel:          "api",
 	}
 	cb := func(r *sdk.StreamResponse) error {
+		// NDJSON is the raw protocol/debug surface: do not filter events or
+		// mutate their payloads. JSON/text modes own presentation filtering.
 		return output.EmitSDKEvent(w, r)
 	}
 	if err := svc.KnowledgeQAStream(ctx, sessionID, req, cb); err != nil {
@@ -227,9 +241,14 @@ func runChatText(ctx context.Context, opts *Options, sessionID string, autoCreat

 	cb := func(r *sdk.StreamResponse) error {
 		if streamMode && r != nil && r.Content != "" {
-			// Best-effort write; if stdout dies the SDK will surface the
-			// error on the next iteration. No need to bail early.
-			_, _ = iostreams.IO.Out.Write([]byte(r.Content))
+			// Default hides the reasoning pass (thinking/reflection); --verbose
+			// streams it inline with the answer.
+			isReasoning := r.ResponseType == sdk.ResponseTypeThinking || r.ResponseType == sdk.ResponseTypeReflection
+			if !isReasoning || opts.Verbose {
+				// Best-effort write; if stdout dies the SDK will surface the
+				// error on the next iteration. No need to bail early.
+				_, _ = iostreams.IO.Out.Write([]byte(r.Content))
+			}
 		}
 		acc.Append(r)
 		return nil
@@ -281,14 +300,96 @@ func runChatText(ctx context.Context, opts *Options, sessionID string, autoCreat
 			fmt.Fprintln(out)
 		}
 	} else {
-		fmt.Fprint(out, answer)
-		if !strings.HasSuffix(answer, "\n") {
+		rendered := answer
+		if opts.Verbose {
+			rendered = acc.Thinking() + rendered
+		}
+		fmt.Fprint(out, rendered)
+		if !strings.HasSuffix(rendered, "\n") {
 			fmt.Fprintln(out)
 		}
 	}
 	format.WriteReferences(out, references)
 	return nil
 }

+// runChatJSON handles the --format json path (the default): accumulate the
+// full stream with no live output, then emit a single success envelope —
+// one object an agent parses instead of an NDJSON event stream to reassemble.
+// The envelope carries the answer, slim references (Content stripped — fetch
+// full passages via `chunk view <parent_chunk_id>`), optional thinking
+// (--verbose), and session pointers.
+func runChatJSON(ctx context.Context, opts *Options, fopts *cmdutil.FormatOptions, sessionID string, svc ChatService) error {
+	req := &sdk.KnowledgeQARequest{
+		Query:            opts.Query,
+		KnowledgeBaseIDs: []string{opts.KBID},
+		AgentEnabled:     false,
+		WebSearchEnabled: false,
+		Channel:          "api",
+	}
+
+	acc := &sse.Accumulator{}
+	cb := func(r *sdk.StreamResponse) error {
+		acc.Append(r)
+		return nil
+	}
+
+	if err := svc.KnowledgeQAStream(ctx, sessionID, req, cb); err != nil {
+		if cmdutil.IsCancelled(ctx, err) {
+			return chatStreamError(cmdutil.Wrapf(cmdutil.CodeOperationCancelled, err, "chat cancelled"), sessionID, acc.AssistantMessageID)
+		}
+		return chatStreamError(cmdutil.WrapHTTP(err, "knowledge qa stream"), sessionID, acc.AssistantMessageID)
+	}
+	if !acc.Done() {
+		return chatStreamError(
+			cmdutil.NewError(cmdutil.CodeSSEStreamAborted, "stream ended without a terminal event"),
+			sessionID,
+			acc.AssistantMessageID,
+		)
+	}
+
+	sid := acc.SessionID
+	if sid == "" {
+		sid = sessionID
+	}
+	data := chatResult{
+		Answer:             acc.Result(),
+		References:         acc.References,
+		SessionID:          sid,
+		AssistantMessageID: acc.AssistantMessageID,
+	}
+	if opts.Verbose {
+		data.Thinking = acc.Thinking()
+	}
+
+	// Reaching here means fopts.Mode is FormatJSON (the only caller). Route
+	// through FormatOptions.Emit so --jq projection and the success-envelope
+	// contract apply, matching every other non-streaming command. A nil fopts
+	// (direct-test entry) defaults to JSON so the object still emits.
+	if fopts == nil {
+		fopts = &cmdutil.FormatOptions{Mode: cmdutil.FormatJSON}
+	}
+	return fopts.Emit(iostreams.IO.Out, data, nil)
+}
+
+func chatStreamError(err *cmdutil.Error, sessionID, assistantMessageID string) *cmdutil.Error {
+	detail := map[string]any{"session_id": sessionID}
+	if assistantMessageID != "" {
+		detail["assistant_message_id"] = assistantMessageID
+	}
+	return err.WithDetail(detail)
+}
+
+// chatResult is the --format json data payload: one object an agent parses
+// instead of an NDJSON stream. References are slim (Content stripped by the
+// accumulator). Wrapped by Emit as {ok:true, data:<chatResult>, meta}.
+type chatResult struct {
+	Answer             string              `json:"answer"`
+	References         []*sdk.SearchResult `json:"references,omitempty"`
+	Thinking           string              `json:"thinking,omitempty"`
+	SessionID          string              `json:"session_id"`
+	AssistantMessageID string              `json:"assistant_message_id,omitempty"`
+}
+
 // compile-time check: the production SDK client implements ChatService.
 var _ ChatService = (*sdk.Client)(nil)
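
From the caller's side, the buffered contract introduced by `runChatJSON` and `chatStreamError` might be consumed roughly as in the sketch below. The `data` field names match the JSON tags on `chatResult`, and `error.detail.session_id` follows the changelog. The `ok:false` failure envelope with an `error.message` field, the `envelope` type, the `parseChat` helper, and both sample payloads are assumptions for illustration only.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// envelope models the buffered --format json output. The success half follows
// chatResult's JSON tags; the error half is an assumed shape.
type envelope struct {
	OK   bool `json:"ok"`
	Data *struct {
		Answer             string            `json:"answer"`
		References         []json.RawMessage `json:"references"`
		Thinking           string            `json:"thinking"`
		SessionID          string            `json:"session_id"`
		AssistantMessageID string            `json:"assistant_message_id"`
	} `json:"data"`
	Error *struct {
		Message string         `json:"message"` // assumed field name
		Detail  map[string]any `json:"detail"`
	} `json:"error"`
}

// parseChat returns the answer and session ID from one buffered envelope.
// On a failure envelope it still returns the session ID found in
// error.detail, so an interrupted session can be resumed with --session.
func parseChat(raw []byte) (answer, sessionID string, err error) {
	var env envelope
	if uerr := json.Unmarshal(raw, &env); uerr != nil {
		return "", "", uerr
	}
	if env.OK && env.Data != nil {
		return env.Data.Answer, env.Data.SessionID, nil
	}
	if env.Error != nil {
		sid, _ := env.Error.Detail["session_id"].(string)
		return "", sid, errors.New(env.Error.Message)
	}
	return "", "", errors.New("unrecognised envelope")
}

func main() {
	// Both payloads are invented samples, not captured CLI output.
	okJSON := []byte(`{"ok":true,"data":{"answer":"RRF is rank fusion.","session_id":"sess_abc"}}`)
	failJSON := []byte(`{"ok":false,"error":{"message":"chat cancelled","detail":{"session_id":"sess_abc"}}}`)

	for _, raw := range [][]byte{okJSON, failJSON} {
		answer, sid, err := parseChat(raw)
		if err != nil {
			fmt.Printf("failed (%v); resume with --session %s\n", err, sid)
			continue
		}
		fmt.Printf("answer=%q session=%s\n", answer, sid)
	}
}
```

Parsing one object this way replaces the line-by-line reassembly that the raw NDJSON stream requires, which is the trade the commit makes for the default mode.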
