feat: add outputBuffering option for step log streaming by four-bytes-robby · Pull Request #2311 · dagucloud/dagu

four-bytes-robby · 2026-06-20T20:47:10Z

Summary

Adds an output_buffering option to both DAG and Step configurations that controls how step output is buffered before being flushed to the log stream.

Three modes:

| Mode | Behavior | Use case |
| --- | --- | --- |
| `buffer` (default) | Accumulates until 32KB threshold (gRPC) / bufio.Writer (local) | High-throughput, batch output |
| `line` | Flushes on every newline character | Interactive CLI tools, PHP, real-time log streaming |
| `none` | Flushes every Write() call immediately | Maximum real-time (high gRPC overhead) |

Backward compatible

Default "buffer" preserves existing behavior for all existing DAGs — no migration needed.

Changes (9 files, +485/-35)

internal/core/output_buffering.go — new enum type + constants
internal/core/step.go — added OutputBuffering field to Step
internal/core/dag.go — added OutputBuffering to DAG + EffectiveOutputBuffering()
internal/runtime/context.go — context helpers for threading buffering mode
internal/runtime/output.go — wire mode into remote writer setup
internal/service/worker/coordreport/log_streamer.go — critical fix: Write() supports all three modes, new sendChunk() helper
internal/cmn/schema/dag.schema.json — added output_buffering property
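
The step → DAG → default resolution behind `EffectiveOutputBuffering()` could look roughly like the sketch below. The constant names, the pared-down `Step` and `DAG` structs, and the function signature are assumptions for illustration; the real declarations in `internal/core` may differ.

```go
package main

import "fmt"

// OutputBuffering mirrors the enum described in this PR.
// The constant names below are assumed, not copied from the diff.
type OutputBuffering string

const (
	OutputBufferingBuffer OutputBuffering = "buffer"
	OutputBufferingLine   OutputBuffering = "line"
	OutputBufferingNone   OutputBuffering = "none"
)

// Step and DAG are hypothetical stand-ins that carry only the new field.
type Step struct{ OutputBuffering OutputBuffering }
type DAG struct{ OutputBuffering OutputBuffering }

// EffectiveOutputBuffering resolves the mode in the order
// step -> DAG -> default ("buffer").
func EffectiveOutputBuffering(dag *DAG, step *Step) OutputBuffering {
	if step != nil && step.OutputBuffering != "" {
		return step.OutputBuffering
	}
	if dag != nil && dag.OutputBuffering != "" {
		return dag.OutputBuffering
	}
	return OutputBufferingBuffer
}

func main() {
	dag := &DAG{OutputBuffering: OutputBufferingLine}
	fmt.Println(EffectiveOutputBuffering(dag, &Step{}))                                     // DAG-level value
	fmt.Println(EffectiveOutputBuffering(dag, &Step{OutputBuffering: OutputBufferingNone})) // step override
	fmt.Println(EffectiveOutputBuffering(&DAG{}, &Step{}))                                  // default
}
```

Treating the empty string as "unset" is what keeps existing DAGs on the `buffer` default without any migration.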

Tests

13 new tests covering all three modes (buffer threshold, line flush, unbuffered immediate, Close behavior, error propagation)
All 42 existing test packages continue to pass

Root cause fix

The stepLogWriter previously only flushed at 32KB, causing:

Output delayed until step completion (success case)
Output silently lost on worker errors (failure case)

With output_buffering: "line", output streams in real-time. With output_buffering: "none", every Write() is immediately sent.
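
To make the difference between the three modes concrete, here is a minimal sketch of a mode-aware `Write()`. It is not the PR's `stepLogWriter`: the `sketchWriter` type and its `send` callback (a stand-in for the gRPC send) are hypothetical, and only the 32KB threshold is taken from the description above.

```go
package main

import (
	"bytes"
	"fmt"
)

const flushThreshold = 32 * 1024 // 32KB, as in the PR description

// sketchWriter is a hypothetical, single-goroutine stand-in for the
// remote step writer. send must not retain the slice it receives.
type sketchWriter struct {
	mode string // "buffer", "line", or "none"
	buf  []byte
	send func(chunk []byte) error
}

func (w *sketchWriter) Write(p []byte) (int, error) {
	switch w.mode {
	case "none":
		// Every Write is forwarded immediately; nothing is buffered.
		if err := w.send(p); err != nil {
			return 0, err
		}
	case "line":
		// Forward each complete line; keep the trailing partial line.
		w.buf = append(w.buf, p...)
		for {
			idx := bytes.IndexByte(w.buf, '\n')
			if idx < 0 {
				break
			}
			if err := w.send(w.buf[:idx+1]); err != nil {
				return len(p), err // p was accepted into the buffer
			}
			w.buf = w.buf[idx+1:]
		}
	default: // "buffer"
		w.buf = append(w.buf, p...)
		if len(w.buf) >= flushThreshold {
			if err := w.send(w.buf); err != nil {
				return len(p), err
			}
			w.buf = w.buf[:0]
		}
	}
	return len(p), nil
}

// Close forwards whatever is still buffered.
func (w *sketchWriter) Close() error {
	if len(w.buf) == 0 {
		return nil
	}
	err := w.send(w.buf)
	if err == nil {
		w.buf = w.buf[:0]
	}
	return err
}

func main() {
	for _, mode := range []string{"buffer", "line", "none"} {
		var chunks []string
		w := &sketchWriter{mode: mode, send: func(c []byte) error {
			chunks = append(chunks, string(c))
			return nil
		}}
		_, _ = w.Write([]byte("one\ntwo\nthr"))
		before := len(chunks)
		_ = w.Close()
		fmt.Printf("%-6s chunks before Close: %d, after Close: %d\n", mode, before, len(chunks))
	}
}
```

With a short write like the one in `main`, `buffer` mode sends nothing until `Close()`, which is the delay described above for steps that never reach the threshold.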

Summary by cubic

Adds output_buffering to DAGs and Steps to control how step logs flush in remote gRPC streams and local file writers, and preserves buffered output if the gRPC stream fails (implements #2310).

New Features
- output_buffering at DAG and Step (step overrides). Modes: buffer (default), line, none.
- Runtime passes the mode via context; remote streamers and local file writers respect it. Schema updated; default remains backward compatible.
Bug Fixes
- Flush and close output writers on all node exit paths (early failure, skip, panic).
- On gRPC stream failure, keep buffered data, log the last 4KB on Close, and surface the error; chunk splitting with sequence increments only after successful sends.

Written for commit a548189. Summary will update on new commits.

Summary by CodeRabbit

New Features
- Added output buffering configuration to control how step stdout/stderr is flushed to logs. Three modes available: buffer (default), line (flush on newlines), and none (immediate flush). Configurable at DAG level with per-step overrides.

coderabbitai · 2026-06-20T20:47:18Z

Walkthrough

Introduces a three-mode outputBuffering setting (buffer, line, none) at DAG and step levels. Adds the OutputBuffering type, struct fields, JSON schema entries, a context propagation helper, new directWriter and lineBufferedWriter implementations, mode-aware local writer selection, and a reworked gRPC stepLogWriter that sends data per-write, per-line, or at the 32 KB threshold depending on the resolved mode.

Changes

Output Buffering Feature

| Layer / File(s) | Summary |
| --- | --- |
| Core type, DAG/Step fields, and JSON schema `internal/core/output_buffering.go`, `internal/core/dag.go`, `internal/core/step.go`, `internal/cmn/schema/dag.schema.json` | Defines the `OutputBuffering` string type with `buffer`/`line`/`none` constants, adds the field to `DAG` and `Step` structs (with mapstructure/JSON tags), adds `EffectiveOutputBuffering` resolution helper (step → DAG → default), and registers DAG-level and step-level `output_buffering` properties in the JSON schema. |
| Context helpers and output setup wiring `internal/runtime/context.go`, `internal/runtime/output.go` | Adds `WithOutputBuffering` and `GetOutputBuffering` context helpers with a fallback default; wires `EffectiveOutputBuffering` into both `setupRemoteWriters` and `setupLocalWriters` so the resolved mode is stored in context and passed to `newWriterForMode` for writer selection. |
| `directWriter` and `lineBufferedWriter` implementations `internal/runtime/writer.go` | Adds `directWriter` (mutex-protected, `Flush` is a no-op) and `lineBufferedWriter` (mutex-protected, flushes complete lines on `'\n'`, remaining bytes on `Flush`). |
| gRPC `stepLogWriter` buffering modes and send/close rework `internal/service/worker/coordreport/log_streamer.go` | Updates `NewStepWriter` to read buffering mode from context and set `lineBuffered`/`unbuffered` flags; reworks `Write` to send immediately (`none`), flush on newlines (`line`), or buffer to threshold (`buffer`); updates `sendChunk` to mark stream dead on error and update sequence number only after success; updates `Close` to call `sendChunk` directly for remaining bytes. |
| Tests for all buffering paths `internal/runtime/output_test.go`, `internal/runtime/writer.go`, `internal/service/worker/coordreport/export_test.go`, `internal/service/worker/coordreport/log_streamer_test.go` | Adds tests for `newWriterForMode` type selection, `lineBufferedWriter` and `directWriter` semantics, `TestOutputCoordinator_SetupLocalWriters` for all modes, snapshot fields for black-box test access, and comprehensive `log_streamer_test.go` cases covering all three modes, error propagation, backward compatibility, and `Close` behavior. |

Sequence Diagram(s)

sequenceDiagram
  participant StepExec as Step Executor
  participant EffOB as EffectiveOutputBuffering
  participant Ctx as context.Context
  participant newWFM as newWriterForMode
  participant StepWriter as stepLogWriter (gRPC)
  participant LocalWriter as local file writer

  StepExec->>EffOB: resolve(dag, step)
  EffOB-->>StepExec: mode (buffer|line|none)
  StepExec->>Ctx: WithOutputBuffering(ctx, mode)

  alt Remote (gRPC) path
    StepExec->>StepWriter: NewStepWriter(ctx, ...)
    StepWriter->>Ctx: GetOutputBuffering(ctx)
    Ctx-->>StepWriter: mode
    StepWriter->>StepWriter: set lineBuffered / unbuffered flags
    loop Write(p)
      StepWriter->>StepWriter: none → sendChunk immediately
      StepWriter->>StepWriter: line → scan newlines, sendChunk per line
      StepWriter->>StepWriter: buffer → accumulate, sendChunk at 32KB
    end
    StepWriter->>StepWriter: Close() → sendChunk remaining bytes
  end

  alt Local file path
    StepExec->>newWFM: newWriterForMode(fileWriter, mode)
    newWFM-->>LocalWriter: directWriter | lineBufferedWriter | safeBufferedWriter
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The PR title 'feat: add outputBuffering option for step log streaming' accurately and concisely describes the main change—adding an outputBuffering configuration option to control step log streaming behavior. |
| Linked Issues check | ✅ Passed | The PR fully implements all requirements from issue `#2310`: adds outputBuffering field to DAG and Step; implements three modes (buffer, line, none); maintains backward compatibility; wires mode through both remote and local writers; includes comprehensive tests. |
| Out of Scope Changes check | ✅ Passed | All changes directly support the outputBuffering feature from issue `#2310`. The modifications to schema, core types, runtime context, writer implementations, and tests are all within scope of implementing the buffering modes. |
| Description check | ✅ Passed | The PR description provides a clear summary, comprehensive change breakdown, implementation details, backward compatibility assurance, and test coverage information. |

four-bytes-robby · 2026-06-20T22:41:52Z

@coderabbitai review

coderabbitai · 2026-06-20T22:42:00Z

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (2)

internal/runtime/output.go (1)
382-383: ⚡ Quick win

Remove the no-op context write in local writer setup.

Line 383 discards the returned context, so this call has no effect and suggests propagation that never happens.
Suggested fix
-	// Store mode in context for consistency with setupRemoteWriters
-	_ = WithOutputBuffering(ctx, mode)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/runtime/output.go` around lines 382 - 383, The call to
WithOutputBuffering(ctx, mode) on line 383 discards the returned context by
assigning it to the blank identifier, making this a no-op that has no actual
effect. Remove this entire line since the context propagation it appears to
perform never actually happens - either the operation is unnecessary for local
writer setup or the returned context should be captured and used in the
function's subsequent logic.
internal/runtime/writer.go (1)
127-135: ⚡ Quick win

Consider using chunked newline detection instead of byte-by-byte iteration.

The byte-by-byte loop with individual appends can trigger many reallocations for large writes. The stepLogWriter.Write implementation (log_streamer.go:232-244) demonstrates a more efficient pattern using bytes.IndexByte to find newlines in chunks, then slicing the buffer in bulk.

For typical line-buffered use cases (CLI output, logs), this may not be a bottleneck, but the chunked approach would scale better for high-throughput scenarios.
⚡ Example refactor using chunked detection
 func (lw *lineBufferedWriter) Write(p []byte) (int, error) {
 	lw.mu.Lock()
 	defer lw.mu.Unlock()
 
-	for _, b := range p {
-		lw.buf = append(lw.buf, b)
-		if b == '\n' {
+	lw.buf = append(lw.buf, p...)
+	for {
+		idx := bytes.IndexByte(lw.buf, '\n')
+		if idx < 0 {
+			break
+		}
+		line := lw.buf[:idx+1]
+		lw.buf = lw.buf[idx+1:]
+		if _, err := lw.w.Write(line); err != nil {
+			return len(p), err
-			if _, err := lw.w.Write(lw.buf); err != nil {
-				return 0, err
-			}
-			lw.buf = lw.buf[:0]
 		}
 	}
 	return len(p), nil
 }
This also fixes the error-return issue mentioned in the previous comment.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/runtime/writer.go` around lines 127 - 135, The byte-by-byte
iteration in the loop over p with individual appends to lw.buf causes many
buffer reallocations for large writes. Refactor the loop to use bytes.IndexByte
to find newlines in chunks rather than checking each byte individually, then
slice and append the buffer content in bulk segments between newlines. Follow
the pattern demonstrated in the stepLogWriter.Write implementation (referenced
at log_streamer.go lines 232-244) which shows how to efficiently handle chunked
newline detection and bulk buffer management instead of per-byte operations.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/cmn/schema/dag.schema.json`:
- Line 1768: The description text on line 1768 references "outputBuffering" in
camelCase, but the actual schema field name uses snake_case as
"output_buffering". Update the description string to use the correct field name
"output_buffering" instead of "outputBuffering" to match the actual schema field
and prevent confusion for config authors reading the schema documentation.

In `@internal/runtime/writer.go`:
- Around line 130-131: The error handling in the Write method violates the
io.Writer contract by returning (0, err) when the underlying write operation
fails. Since all input bytes have already been consumed into the buffer before
the flush attempt, the function should return (len(p), err) instead of (0, err)
when lw.w.Write(lw.buf) fails at line 130. This indicates to the caller that all
input bytes were accepted into the buffer, even though the flush operation
failed, consistent with the pattern used in stepLogWriter.Write from
log_streamer.go.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c398dfa3-425d-4d4f-8ee1-80390a88d8d8

📥 Commits

Reviewing files that changed from the base of the PR and between 9b6a7ff and ce1513b.

📒 Files selected for processing (11)

internal/cmn/schema/dag.schema.json
internal/core/dag.go
internal/core/output_buffering.go
internal/core/step.go
internal/runtime/context.go
internal/runtime/output.go
internal/runtime/output_test.go
internal/runtime/writer.go
internal/service/worker/coordreport/export_test.go
internal/service/worker/coordreport/log_streamer.go
internal/service/worker/coordreport/log_streamer_test.go

four-bytes-robby · 2026-06-20T22:59:51Z

@coderabbitai Fixed in ddca497.

coderabbitai · 2026-06-20T23:00:28Z

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check what commit ddca497 contains
gh pr view 2311 --repo dagucloud/dagu --json commits | jq '.commits[] | select(.oid | startswith("ddca497"))'

Length of output: 653

🏁 Script executed:

#!/bin/bash
# Get recent review comments from CodeRabbit on this PR
gh api repos/dagucloud/dagu/pulls/2311/comments --jq '.[] | select(.user.login == "coderabbitai[bot]") | {id: .id, path: .path, body: .body[:200], line: .line}'

Length of output: 844

@four-bytes-robby: Thanks for the fix! Commit ddca497 looks like it addresses all three issues — the schema description key name (output_buffering), the io.Writer contract violation in writer.go, and the no-op context. Appreciate the quick turnaround! 🎉

four-bytes-robby · 2026-06-20T23:00:45Z

@coderabbitai Fixed in ddca497. (reply to comment on internal/cmn/schema/dag.schema.json:3447615306)

four-bytes-robby · 2026-06-20T23:00:45Z

@coderabbitai Fixed in ddca497. (reply to comment on internal/runtime/writer.go:3447615308)

coderabbitai · 2026-06-20T23:01:01Z

@four-bytes-robby: Thanks for confirming! The io.Writer contract fix in internal/runtime/writer.go from commit ddca497 looks good. 🎉

coderabbitai · 2026-06-20T23:01:02Z

@four-bytes-robby: Thanks for fixing the schema description key name for output_buffering in commit ddca497! 🎉

four-bytes-robby · 2026-06-20T23:10:19Z

@coderabbitai All nitpicks from review #4538924054 fixed:

schema.json outputBuffering→output_buffering description: ddca497
writer.go io.Writer contract (0, err)→(len(p), err): ddca497
output.go no-op context write removed: ddca497
writer.go byte-by-byte→chunked newline detection: 5eb34eb

coderabbitai · 2026-06-20T23:10:33Z

@four-bytes-robby: All four fixes confirmed — great work! The chunked newline detection in 5eb34eb is a nice improvement over byte-by-byte scanning, as it reduces scanning overhead for larger writes. The PR looks solid now. 🎉

four-bytes-robby · 2026-06-21T07:31:40Z

@yohamta0 Let's first run this in production for a while to confirm there are no further regressions. I also added a fix for the finish / teardown of a DAG run (e.g. when an error occurs) so that the buffer is flushed even when the 32KB threshold has not been reached.

feat: add outputBuffering option for step log streaming dagucloud#2310

b3d8af3

fix: wire outputBuffering through local writer path dagucloud#2310

ce1513b

coderabbitai Bot reviewed Jun 20, 2026

fix: address coderabbit nitpicks — schema description, io.Writer cont…

ddca497

…ract, no-op context dagucloud#2310

perf: use chunked newline detection in lineBufferedWriter dagucloud#2310

5eb34eb

yohamta0 marked this pull request as ready for review June 21, 2026 06:56

four-bytes-robby added 2 commits June 21, 2026 09:34

fix: flush output writers on all node exit paths dagucloud#2310

395b4ee

fix: retain buffered output on gRPC stream failure dagucloud#2310

a548189
