Merged
81 changes: 78 additions & 3 deletions crates/goose-cli/src/session/streaming_buffer.rs
@@ -21,8 +21,81 @@
//! ```

use regex::Regex;
use std::io::Write;
use std::sync::LazyLock;

const MAX_CODE_BLOCK_LINES: usize = 50;
const TRUNCATED_SHOW_LINES: usize = 20;

fn truncate_code_blocks(content: &str) -> String {
    let (open_pos, fence) = match (content.find("```"), content.find("~~~")) {
        (Some(a), Some(b)) if a <= b => (a, "```"),
        (Some(a), None) => (a, "```"),
        (None, Some(b)) => (b, "~~~"),
        (Some(_), Some(b)) => (b, "~~~"),
        (None, None) => return content.to_string(),
    };

    let Some(after_open) = content.get(open_pos + 3..) else {
        return content.to_string();
    };
    let Some(newline_pos) = after_open.find('\n') else {
        return content.to_string();
    };
    let code_start = open_pos + 3 + newline_pos + 1;

    let Some(code_region) = content.get(code_start..) else {
        return content.to_string();
    };
    let close_pattern = format!("\n{}", fence);

P2: Match closing fence against opener length

truncate_code_blocks searches for the end fence using "\n```"/"\n~~~" regardless of how long the opening fence actually is, so content opened with longer fences (for example ````md blocks that embed inner ``` snippets) is treated as closed at the first inner triple fence. In that scenario, lines.len() is computed on only a prefix of the real block and oversized code blocks are left untruncated even though `check_code_fence` correctly supports longer fences.
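One way to make the closing match length-aware, as a minimal sketch rather than the PR's actual fix: a line closes the block only if its fence run is at least as long as the opener and nothing but whitespace follows it (the CommonMark rule). The helper names `closes_fence` and `find_close_line` are hypothetical, and indentation is ignored here.

```rust
/// True if `line` can close a block opened with `open_len` copies of
/// `fence_char`: the run must be at least as long as the opener and be
/// followed only by whitespace.
fn closes_fence(line: &str, fence_char: char, open_len: usize) -> bool {
    let rest = line.trim_start_matches(fence_char);
    // Byte count of the run; equals the character count for ASCII '`' and '~'.
    let run = line.len() - rest.len();
    run >= open_len && rest.trim().is_empty()
}

/// Index of the first line in `body` that closes the fence, if any.
/// `body` holds the lines that follow the opening fence line.
fn find_close_line(body: &[&str], fence_char: char, open_len: usize) -> Option<usize> {
    body.iter()
        .position(|line| closes_fence(line, fence_char, open_len))
}
```

With a four-backtick opener, an inner triple fence is skipped because its run is too short, so the line count covers the whole outer block.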


    let Some(close_offset) = code_region.find(&close_pattern) else {
Comment on lines +50 to +51

P2: Recognize indented closing fences when truncating blocks

The truncation scan only looks for "\n```"/"\n~~~", so it misses valid fenced blocks whose closing fence is indented (for example list-item code blocks ending with "\n ```"). In that case truncate_code_blocks returns the full content unmodified, which bypasses the new line cap for a common Markdown pattern even though check_code_fence already treats leading spaces as valid fence syntax.
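A line-based scan could tolerate indentation, sketched below under stated assumptions. `find_indented_close` is a hypothetical helper, not part of the PR. It ignores any leading spaces or tabs, which is more permissive than CommonMark (three spaces relative to the enclosing container), and it returns the byte offset of the start of the closing line rather than of the preceding newline.

```rust
/// Byte offset of the first line in `code_region` that starts with `fence`
/// once leading spaces and tabs are ignored. Returns None if no line does.
fn find_indented_close(code_region: &str, fence: &str) -> Option<usize> {
    let mut offset = 0;
    // split_inclusive keeps each '\n', so summing line lengths tracks offsets.
    for line in code_region.split_inclusive('\n') {
        let unindented = line.trim_start_matches(|c: char| c == ' ' || c == '\t');
        if unindented.starts_with(fence) {
            return Some(offset);
        }
        offset += line.len();
    }
    None
}
```

Everything before the returned offset is code content, so the existing line count can be taken from that slice.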


        return content.to_string();
    };

    let Some(code_content) = code_region.get(..close_offset) else {
        return content.to_string();
    };
    let lines: Vec<&str> = code_content.lines().collect();

    if lines.len() <= MAX_CODE_BLOCK_LINES {
        return content.to_string();
Comment on lines +60 to +61

P1: Continue scanning after short first code block

truncate_code_blocks returns the original content as soon as the first fenced block has <= MAX_CODE_BLOCK_LINES, so any later oversized block is never truncated. In mixed responses (for example, a short snippet followed by a long generated file), this defeats the new 50-line cap and can still flood the CLI output, which is exactly the behavior this change intends to prevent.
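A single pass over the lines would apply the cap to every block instead of only the first. The sketch below is illustrative only: `truncate_all_blocks` is a hypothetical name, the limits are passed as parameters, fences are matched with the same simple three-character prefix test as the PR, and the temp-file step is left out.

```rust
/// Copies `content`, shortening every closed fenced block that has more than
/// `max_lines` lines to its first `show_lines` lines plus a summary line.
fn truncate_all_blocks(content: &str, max_lines: usize, show_lines: usize) -> String {
    let mut out = String::with_capacity(content.len());
    let mut block: Vec<&str> = Vec::new(); // lines of the currently open block
    let mut open_fence: Option<&str> = None;

    for line in content.split_inclusive('\n') {
        match open_fence {
            // Outside a block: copy the line and check whether it opens one.
            None => {
                out.push_str(line);
                open_fence = ["```", "~~~"]
                    .iter()
                    .copied()
                    .find(|f| line.starts_with(*f));
            }
            // Closing fence: emit the collected block, shortened if oversized.
            Some(fence) if line.starts_with(fence) => {
                let shown = if block.len() > max_lines {
                    show_lines.min(block.len())
                } else {
                    block.len()
                };
                let hidden = block.len() - shown;
                // Dropping the drain removes the lines that were not taken.
                out.extend(block.drain(..).take(shown));
                if hidden > 0 {
                    out.push_str(&format!("... ({hidden} more lines)\n"));
                }
                out.push_str(line);
                open_fence = None;
            }
            // Inside a block: hold the line until the block closes.
            Some(_) => block.push(line),
        }
    }
    // An unclosed block at end of input is passed through unchanged.
    out.extend(block);
    out
}
```

Because state is reset at each closing fence, a short snippet followed by a long generated file leaves the snippet intact and still truncates the file.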


    }

    let truncated: String = lines
        .iter()
        .take(TRUNCATED_SHOW_LINES)
        .copied()
        .collect::<Vec<_>>()
        .join("\n");
    let remaining = lines.len() - TRUNCATED_SHOW_LINES;

    let file_msg = save_to_temp_file(code_content)
        .map(|p| format!(" → {}", p))
        .unwrap_or_default();

    let close_pos = code_start + close_offset + 1; // +1 to include the \n
    let prefix = content.get(..code_start).unwrap_or("");
    let suffix = content.get(close_pos..).unwrap_or("");
    format!(
        "{}{}\n... ({} more lines{})\n{}",
        prefix, truncated, remaining, file_msg, suffix
    )
}

fn save_to_temp_file(content: &str) -> Option<String> {
    let mut file = tempfile::Builder::new()
        .prefix("goose-")
        .suffix(".txt")
        .tempfile()
        .ok()?;

    file.write_all(content.as_bytes()).ok()?;

    // Keep the file (don't delete on drop) and get the path
    let (_, path) = file.keep().ok()?;
    Some(path.display().to_string())
}

/// Regex that tokenizes markdown inline elements.
/// Order matters: longer/more-specific patterns first.
static INLINE_TOKEN_RE: LazyLock<Regex> = LazyLock::new(|| {
@@ -52,7 +125,8 @@ static INLINE_TOKEN_RE: LazyLock<Regex> = LazyLock::new(|| {
/// A streaming markdown buffer that tracks open constructs.
///
/// Accumulates chunks and returns content that is safe to render,
/// holding back any incomplete markdown constructs.
/// holding back any incomplete markdown constructs. Large code blocks
/// are automatically truncated with full content saved to a temp file.
#[derive(Default)]
pub struct MarkdownBuffer {
    buffer: String,
@@ -106,7 +180,8 @@ impl MarkdownBuffer {
    /// Add a chunk of markdown text to the buffer.
    ///
    /// Returns any content that is safe to render, or None if the buffer
    /// contains only incomplete constructs.
    /// contains only incomplete constructs. Large code blocks are automatically
    /// truncated with full content saved to a temp file.
    pub fn push(&mut self, chunk: &str) -> Option<String> {
        self.buffer.push_str(chunk);
        let safe_end = self.find_safe_end();
@@ -118,7 +193,7 @@ impl MarkdownBuffer {
// - The regex tokenizer operates on &str which guarantees UTF-8
let to_render = self.buffer[..safe_end].to_string();
self.buffer = self.buffer[safe_end..].to_string();
Some(to_render)
Some(truncate_code_blocks(&to_render))

P2: Truncate buffered markdown during flush

The truncation logic is only applied in push, but remaining buffered text is rendered through flush paths (flush_markdown_buffer) without passing through truncate_code_blocks. If a stream ends while still inside a long fenced block (the existing unclosed-code-block flow), push emits nothing and flush prints the full block, bypassing the new 50-line output cap.
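The gap can be illustrated with a toy buffer in which both exit paths share one cap. Everything in this sketch is hypothetical: `SketchBuffer` stands in for `MarkdownBuffer`, `cap_lines` is a crude stand-in for `truncate_code_blocks` that also handles an unclosed block, and the hold-back rule (an odd number of triple backticks means a fence is still open) is far simpler than `find_safe_end`.

```rust
const MAX_LINES: usize = 3; // tiny cap so the example is easy to follow

/// Stand-in for the real truncation: keeps the first MAX_LINES lines
/// and replaces the rest with a summary line.
fn cap_lines(text: &str) -> String {
    let lines: Vec<&str> = text.lines().collect();
    if lines.len() <= MAX_LINES {
        return text.to_string();
    }
    let shown = lines[..MAX_LINES].join("\n");
    format!("{shown}\n... ({} more lines)\n", lines.len() - MAX_LINES)
}

#[derive(Default)]
struct SketchBuffer {
    buffer: String,
}

impl SketchBuffer {
    /// Emits buffered text only when no fence is left open.
    fn push(&mut self, chunk: &str) -> Option<String> {
        self.buffer.push_str(chunk);
        let fence_open = self.buffer.matches("```").count() % 2 == 1;
        if fence_open || self.buffer.is_empty() {
            return None;
        }
        Some(cap_lines(&std::mem::take(&mut self.buffer)))
    }

    /// Emits whatever is left at end of stream, through the same cap as `push`.
    fn flush(&mut self) -> Option<String> {
        if self.buffer.is_empty() {
            return None;
        }
        Some(cap_lines(&std::mem::take(&mut self.buffer)))
    }
}
```

When a stream ends inside a long unclosed block, `push` returns `None` for every chunk and `flush` is the only place the cap can apply, which is why it needs to go through the same function.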


} else {
None
}