[Feature]: agent-first output modes for chat / session ask #1739

Description

@coconut-yc

Affected Component

Other — weknora CLI and its MCP tools

Problem Description

On main at ae9038732ad2 (CLI v0.9.0), the output surfaces have three
different behaviors, but none provides a bounded event-shaped result for an
agent:

  • CLI --format json and --format ndjson are aliases: both emit a CLI init
    line followed by raw SDK events as NDJSON. --format json therefore does not
    produce one JSON document.
  • CLI --format text uses a separate accumulator/renderer. TTY output streams
    content, non-TTY output buffers it, and references/tool traces are appended as
    trailing sections.
  • MCP chat and session_ask already buffer the run, but return categorized
    {answer, thinking, tool_events, references} results and include full
    reference objects.

These surfaces do not serve their consumers cleanly:

  • An AI agent receives internal reasoning, tool activity, and full chunk
    contents even when it only needs the answer. Large reference events can
    consume a substantial part of its context window.
  • A human wants readable text as it is generated, not protocol objects.
  • A debugger or stream processor does need the complete SDK event trace.

The CLI should make those use cases explicit. JSON should be a bounded
agent-facing projection, text should be its human-readable streaming form, and
NDJSON should remain the full-fidelity protocol surface. JSON should continue
to expose events rather than inventing a categorized final-result schema.

Proposed Solution

--format json (default)

Buffer the response into one normal success envelope:

{
  "ok": true,
  "data": {
    "events": [
      {"id":"answer-1","response_type":"answer","content":"...","done":false}
    ],
    "session_id": "..."
  }
}

By default, include answer events and required session/request metadata only.
Do not include thinking, reflection, tool calls/results, or reference events.
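
A minimal sketch of the default projection, assuming each SDK event is a dict with a response_type field as in the envelope above. The function name build_envelope, the "thinking" type string, and the sample data are hypothetical and only illustrate the shape; they are not the CLI's actual implementation.

```python
import json


def build_envelope(events, session_id):
    """Buffer a finished run into one success envelope with answer events only."""
    # Default selection: drop everything that is not an answer event.
    answer_events = [e for e in events if e.get("response_type") == "answer"]
    return {"ok": True, "data": {"events": answer_events, "session_id": session_id}}


# Hypothetical raw events; "thinking" is an assumed response_type value.
raw_events = [
    {"id": "think-1", "response_type": "thinking", "content": "...", "done": False},
    {"id": "answer-1", "response_type": "answer", "content": "Retries use ", "done": False},
    {"id": "answer-1", "response_type": "answer", "content": "backoff.", "done": True},
]

envelope = build_envelope(raw_events, "sess-123")
print(json.dumps(envelope, indent=2))
```

The result is a single JSON document, so a consumer can parse it once instead of reading line by line.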

--reference

Add reference events to the default projection without exposing reasoning or
tool activity. Each reference is a bounded lookup index:

{"kb_id":"...","chunk_id":"...","parent_chunk_id":"..."}

Full passages are fetched on demand through chunk view <chunk_id> or
chunk view <parent_chunk_id>. Answer events from session ask carry inline chunk
indexes, so a consumer can also follow citations directly from the default
answer.
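
The reduction from a full reference object to the bounded index could look like the sketch below. Only the three index keys come from this proposal; the extra fields in the sample (content, score), the helper name to_reference_index, and the choice to omit absent keys rather than emit null are assumptions.

```python
INDEX_KEYS = ("kb_id", "chunk_id", "parent_chunk_id")


def to_reference_index(reference):
    """Keep only the lookup keys of a reference and drop the passage payload."""
    # Assumption: keys missing from the source object are omitted, not set to null.
    return {key: reference[key] for key in INDEX_KEYS if key in reference}


# Hypothetical full reference object; fields other than the index keys are invented.
full_reference = {
    "kb_id": "kb-1",
    "chunk_id": "c-9",
    "parent_chunk_id": "c-3",
    "content": "A long passage that would otherwise fill the context window...",
    "score": 0.87,
}

index = to_reference_index(full_reference)
print(index)
```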

--verbose

For JSON and text, opt into thinking, reflection, tool calls/results, and
lifecycle/metadata events. It does not implicitly add references; combine
--verbose --reference when both execution detail and provenance are needed.

--format text

Apply the same default/reference/verbose filtering as JSON, but render selected
events as live human-readable output. TTY detection may affect styling, not
content.

--format ndjson

Emit the CLI init event followed by every SDK frame verbatim, including full
reference payloads. --reference and --verbose do not modify this raw mode.
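
A stream processor consuming this raw mode might read it roughly as follows. This is a sketch only: the shape of the init line is not specified in this issue, so the "type": "init" field and the sample frames are hypothetical.

```python
import io
import json


def read_ndjson(stream):
    """Yield one parsed JSON object per non-empty line of the stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# Hypothetical output: a CLI init line followed by raw SDK frames.
sample = io.StringIO(
    '{"type": "init", "session_id": "sess-123"}\n'
    '{"id": "answer-1", "response_type": "answer", "content": "Hi", "done": true}\n'
)

frames = list(read_ndjson(sample))
init_frame, sdk_frames = frames[0], frames[1:]
print(init_frame["session_id"], len(sdk_frames))
```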

MCP

Return the same JSON events representation from chat and session_ask.
Add reference and verbose booleans with the same independent behavior.

The selection policy is therefore explicit and shared by JSON, text, and MCP:

Selection            Included events
default              answer
reference            answer + indexed references
verbose              answer + reasoning/tool/lifecycle detail
reference + verbose  all projected detail, with references still index-only

NDJSON is outside this projection and always keeps the raw SDK frames.
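
The shared policy can be sketched as one filter driven by two independent booleans. Apart from "answer", the response_type strings below are assumed names for the event categories listed in this proposal, and project is a hypothetical helper. Reducing reference payloads to index-only form is a separate step not shown here.

```python
ANSWER_TYPES = {"answer"}
# Assumed type names; the real SDK values may differ.
REFERENCE_TYPES = {"references"}
VERBOSE_TYPES = {"thinking", "reflection", "tool_call", "tool_result", "lifecycle"}


def selected_types(reference=False, verbose=False):
    """Return the set of event types included for a flag combination."""
    types = set(ANSWER_TYPES)
    if reference:
        types |= REFERENCE_TYPES
    if verbose:
        # verbose does not imply reference; the two flags are independent.
        types |= VERBOSE_TYPES
    return types


def project(events, reference=False, verbose=False):
    """Filter events down to the selection, preserving their original order."""
    allowed = selected_types(reference, verbose)
    return [e for e in events if e.get("response_type") in allowed]


events = [
    {"response_type": "thinking", "content": "..."},
    {"response_type": "references", "content": "..."},
    {"response_type": "answer", "content": "..."},
]
print([e["response_type"] for e in project(events, verbose=True)])
```

Because JSON, text, and MCP would all call the same filter, the three surfaces cannot drift apart in what they select; only the rendering differs.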

Buffered CLI JSON errors retain session_id and, when available,
assistant_message_id so interrupted sessions can be resumed.

Alternatives

  • Keep JSON as an alias for NDJSON. This preserves the existing protocol
    output but provides no bounded default for agent consumers.
  • Keep the MCP-style categorized result for CLI JSON. That makes the CLI
    synthesize a second semantic model instead of exposing a filtered event
    projection.
  • Trim references on the server. This changes the protocol for every
    consumer, including the web UI. The CLI/MCP boundary is the appropriate place
    for an agent-facing projection.

Impact

  • Agent and MCP calls use substantially less context by default.
  • Reference provenance can be requested without also exposing reasoning/tools.
  • Humans get readable streaming text with the same selection controls.
  • Raw/debug consumers retain the complete NDJSON contract.
  • Existing consumers of default JSON must migrate from raw lines to
    data.events; existing MCP consumers must migrate from categorized fields to
    events.

Use Case

# Default bounded answer projection
weknora chat "How do retries work?" --kb engineering \
  --jq '[.data.events[].content] | join("")'

# Add reference indexes without execution detail
weknora chat "How do retries work?" --kb engineering --reference

# Add both execution detail and reference indexes
weknora session ask "Investigate the failure" --agent ag_x \
  --verbose --reference

# Full raw protocol
weknora session ask "Investigate the failure" --agent ag_x --format ndjson
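
For an agent that captures the proposed JSON output instead of using --jq, the first command's join can be reproduced in a few lines. This sketch assumes the success envelope shape proposed above; the function name answer_text and the sample string are hypothetical, and the error path is simplified because this issue does not define the full error envelope.

```python
import json


def answer_text(envelope_json):
    """Concatenate the content of all events in a buffered success envelope."""
    envelope = json.loads(envelope_json)
    if not envelope.get("ok"):
        # Assumption: a failed run carries ok=false; details are left to the caller.
        raise RuntimeError("run did not succeed")
    return "".join(e.get("content", "") for e in envelope["data"]["events"])


# Hypothetical captured stdout from the default projection.
captured = json.dumps({
    "ok": True,
    "data": {
        "events": [
            {"id": "answer-1", "response_type": "answer", "content": "Retries use ", "done": False},
            {"id": "answer-1", "response_type": "answer", "content": "backoff.", "done": True},
        ],
        "session_id": "sess-123",
    },
})

print(answer_text(captured))
```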

Additional Information

The SSE timeout, Scanner limit, terminal-error handling, completion sentinel,
and missing reference identity fields are covered by #1738. They are
prerequisites for reliably producing any buffered projection but are separate
from this output-contract change.

Confirmation

  • I have searched existing issues and confirmed this is a new request
  • I understand this request may need discussion and evaluation
