
feat: support for gpt-oss in interactive chat#79

Merged
madclaws merged 1 commit into main from harmony-support on Jan 29, 2026

Conversation

@madclaws
Member

fix: refactor

@coderabbitai

coderabbitai bot commented Jan 29, 2026

📝 Walkthrough

Walkthrough

The pull request extends the chat workflow in the Rust runtime to support memory-mode-aware behavior throughout the response handling pipeline. Changes update function signatures to accept RunArgs, implement conditional response parsing and output logic based on memory mode settings, add streaming delta tracking for selective rendering, and modify default modelfile resolution. The Python change is purely formatting with no behavioral impact.
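As a hedged sketch of the signature change described above (field names, the output strings, and the `chat` shape here are assumptions for illustration, not the repo's actual code), threading `RunArgs` through the pipeline might look like:

```rust
// Hypothetical sketch: chat() accepts RunArgs so downstream rendering and
// parsing can branch on memory mode. Names are assumptions, not mlx.rs code.
#[derive(Debug, Clone)]
struct RunArgs {
    memory: bool,      // memory mode toggles reply parsing and rendering
    modelfile: String, // default resolution now points at a gpt-oss modelfile
}

fn chat(input: &str, run_args: &RunArgs) -> String {
    // Conditional output logic: in memory mode the full response is emitted;
    // otherwise only the final answer portion is surfaced.
    if run_args.memory {
        format!("[full] {input}")
    } else {
        format!("[answer-only] {input}")
    }
}

fn main() {
    let args = RunArgs { memory: false, modelfile: "gpt-oss".into() };
    println!("{}", chat("hello", &args)); // prints "[answer-only] hello"
}
```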

Sequence Diagram

```mermaid
sequenceDiagram
    participant User
    participant ChatFn as chat()
    participant Streaming as Streaming Handler
    participant Conversion as convert_to_chat_response()
    participant Extraction as extract_reply()

    User->>ChatFn: Input + RunArgs
    activate ChatFn
    ChatFn->>Streaming: Stream response with memory_mode
    activate Streaming
    Streaming->>Streaming: Track is_answer_start flag
    alt memory_mode enabled
        Streaming->>User: Print full response
    else memory_mode disabled
        Streaming->>User: Print dimmed deltas until marker
        Streaming->>User: Print normal deltas after marker
    end
    Streaming-->>ChatFn: Final content
    deactivate Streaming
    ChatFn->>Conversion: Content + memory_mode
    activate Conversion
    Conversion->>Extraction: Content + memory_mode
    activate Extraction
    alt memory_mode enabled
        Extraction->>Extraction: Extract reply tag content
    else memory_mode disabled
        Extraction->>Extraction: Extract final answer (marker-based)
    end
    Extraction-->>Conversion: Parsed reply
    deactivate Extraction
    Conversion-->>ChatFn: ChatResponse
    deactivate Conversion
    ChatFn-->>User: ChatResponse
    deactivate ChatFn
```
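The two extraction branches in the diagram could be sketched as follows. This is a hedged illustration only: the function name mirrors the diagram's `extract_reply()`, but its signature, the `<reply>` tag, and the `**[Answer]**` marker string are assumptions, not the repo's actual implementation.

```rust
// Sketch of memory-mode-aware reply extraction (hypothetical strings/signature).
fn extract_reply(content: &str, memory_mode: bool) -> String {
    if memory_mode {
        // Memory mode: pull the text between assumed <reply>...</reply> tags.
        let (start_tag, end_tag) = ("<reply>", "</reply>");
        if let Some(start) = content.find(start_tag) {
            let rest = &content[start + start_tag.len()..];
            if let Some(end) = rest.find(end_tag) {
                return rest[..end].trim().to_string();
            }
        }
        content.trim().to_string()
    } else {
        // Non-memory mode: take everything after the answer marker,
        // falling back to the whole content if the marker never appears.
        match content.find("**[Answer]**") {
            Some(pos) => content[pos + "**[Answer]**".len()..].trim().to_string(),
            None => content.trim().to_string(),
        }
    }
}

fn main() {
    println!("{}", extract_reply("<reply>hello</reply>", true));      // prints "hello"
    println!("{}", extract_reply("thinking **[Answer]** 42", false)); // prints "42"
}
```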

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes


🚥 Pre-merge checks | ✅ 1 passed | ❌ 2 failed

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Description check | ❓ Inconclusive | The description 'fix: refactor' is vague and generic, using non-descriptive terms that don't convey meaningful information about the specific changes in the changeset. | Expand the description to explain the refactoring intent, such as: 'Refactor chat workflow to support gpt-oss model by adding memory_mode parameter handling and updating default modelfile resolution.' |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'feat: support for gpt-oss in interactive chat' directly aligns with the main change in the changeset, which involves updating default modelfile resolution to use gpt-oss instead of mem-agent. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

@madclaws madclaws linked an issue Jan 29, 2026 that may be closed by this pull request

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tiles/src/runtime/mlx.rs (1)

487-533: Answer‑marker detection can miss when the marker is split across stream chunks.

SSE deltas can split **[Answer]** across chunks, so checking only the current delta can leave is_answer_start false and keep all output dimmed. Consider detecting on the accumulated buffer (or a small rolling window) to handle chunk boundaries.

🛠️ Suggested fix

```diff
-                if !run_args.memory && delta.contains("**[Answer]**") {
-                    is_answer_start = true;
-                }
+                if !run_args.memory && !is_answer_start && accumulated.contains("**[Answer]**") {
+                    is_answer_start = true;
+                }
```
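The boundary problem the reviewer describes can be demonstrated in isolation. Below is a minimal sketch (the helper name `marker_seen` and the chunk contents are assumptions for illustration): checking only the current delta never fires when `**[Answer]**` straddles two SSE chunks, whereas checking the accumulated buffer does.

```rust
/// Returns true once `marker` has appeared anywhere in the concatenated
/// stream so far, even when it is split across chunk boundaries.
/// (Hypothetical helper; the real code flips an `is_answer_start` flag.)
fn marker_seen(chunks: &[&str], marker: &str) -> bool {
    let mut accumulated = String::new();
    for chunk in chunks {
        accumulated.push_str(chunk);
        // Checking the accumulated text catches boundary-spanning markers
        // that `chunk.contains(marker)` alone would miss.
        if accumulated.contains(marker) {
            return true;
        }
    }
    false
}

fn main() {
    // "**[Answer]**" is split across the second and third deltas here:
    let chunks = ["reasoning...", "**[Ans", "wer]** final reply"];
    println!("{}", marker_seen(&chunks, "**[Answer]**")); // prints "true"
}
```

A per-delta check over the same chunks would return false for every chunk, which is exactly the dimmed-forever failure mode the comment warns about.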

@madclaws madclaws merged commit 0a23e73 into main Jan 29, 2026
2 checks passed
@madclaws madclaws deleted the harmony-support branch January 29, 2026 10:46


Development

Successfully merging this pull request may close these issues.

Integrate gpt-oss as the default model in Alpha
