feat: add outputBuffering option for step log streaming #2311

Open
four-bytes-robby wants to merge 6 commits into dagucloud:main from four-flames:feat/2310-output-buffering-mode

Conversation

@four-bytes-robby commented Jun 20, 2026

Contributor

Closes #2310

Summary

Adds an output_buffering option to both DAG and Step configurations that controls how step output is buffered before being flushed to the log stream.

Three modes:

| Mode | Behavior | Use case |
| --- | --- | --- |
| buffer (default) | Accumulates until 32KB threshold (gRPC) / bufio.Writer (local) | High-throughput, batch output |
| line | Flushes on every newline character | Interactive CLI tools, PHP, real-time log streaming |
| none | Flushes every Write() call immediately | Maximum real-time (high gRPC overhead) |

Backward compatible

Default "buffer" preserves existing behavior for all existing DAGs — no migration needed.

Changes (9 files, +485/-35)

  • internal/core/output_buffering.go — new enum type + constants
  • internal/core/step.go — added OutputBuffering field to Step
  • internal/core/dag.go — added OutputBuffering to DAG + EffectiveOutputBuffering()
  • internal/runtime/context.go — context helpers for threading buffering mode
  • internal/runtime/output.go — wire mode into remote writer setup
  • internal/service/worker/coordreport/log_streamer.go — critical fix: Write() supports all three modes, new sendChunk() helper
  • internal/cmn/schema/dag.schema.json — added output_buffering property
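
The resolution order behind EffectiveOutputBuffering() (step, then DAG, then the default) can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the constant names, the stripped-down `Step` and `DAG` structs, and the lowercase `effectiveOutputBuffering` function are assumptions made for the example, and an empty string is assumed to mean "not set".

```go
package main

// OutputBuffering models the three modes named in the PR description.
// The constant names below are assumptions for illustration.
type OutputBuffering string

const (
	OutputBufferingBuffer OutputBuffering = "buffer"
	OutputBufferingLine   OutputBuffering = "line"
	OutputBufferingNone   OutputBuffering = "none"
)

// Step and DAG are stripped-down stand-ins for the real structs.
type Step struct {
	OutputBuffering OutputBuffering
}

type DAG struct {
	OutputBuffering OutputBuffering
}

// effectiveOutputBuffering resolves the mode as step -> DAG -> default.
// An empty value is treated as "not set at this level" (an assumption).
func effectiveOutputBuffering(dag *DAG, step *Step) OutputBuffering {
	if step != nil && step.OutputBuffering != "" {
		return step.OutputBuffering
	}
	if dag != nil && dag.OutputBuffering != "" {
		return dag.OutputBuffering
	}
	return OutputBufferingBuffer
}
```

With this shape, a DAG that never mentions the option resolves to "buffer", which is how the default stays backward compatible.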

Tests

  • 13 new tests covering all three modes (buffer threshold, line flush, unbuffered immediate, Close behavior, error propagation)
  • All 42 existing test packages continue to pass

Root cause fix

The stepLogWriter previously only flushed at 32KB, causing:

  • Output delayed until step completion (success case)
  • Output silently lost on worker errors (failure case)

With output_buffering: "line", output streams in real-time. With output_buffering: "none", every Write() is immediately sent.
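
To make the difference between the three modes concrete, here is a hedged sketch of a mode-aware writer. It is not the PR's stepLogWriter: `modeWriter`, `newModeWriter`, `nextChunkLen`, and the `send` callback (standing in for the gRPC send) are hypothetical names, and the sketch does not retain a chunk whose send failed.

```go
package main

import "bytes"

// defaultThreshold mirrors the 32KB figure quoted in the PR description.
const defaultThreshold = 32 * 1024

// modeWriter is a hypothetical stand-in for the PR's stepLogWriter.
type modeWriter struct {
	mode      string // "buffer", "line", or "none"
	threshold int
	buf       []byte
	send      func(chunk []byte) error // stand-in for the gRPC send
}

func newModeWriter(mode string, threshold int, send func([]byte) error) *modeWriter {
	if threshold <= 0 {
		threshold = defaultThreshold
	}
	return &modeWriter{mode: mode, threshold: threshold, send: send}
}

func (w *modeWriter) Write(p []byte) (int, error) {
	if len(p) == 0 {
		return 0, nil
	}
	if w.mode == "none" {
		// Copy p: callers may reuse the slice after Write returns.
		if err := w.send(append([]byte(nil), p...)); err != nil {
			return 0, err
		}
		return len(p), nil
	}
	w.buf = append(w.buf, p...)
	for {
		n := w.nextChunkLen()
		if n == 0 {
			return len(p), nil
		}
		chunk := append([]byte(nil), w.buf[:n]...)
		w.buf = w.buf[n:]
		if err := w.send(chunk); err != nil {
			// p was already accepted into the buffer, so report len(p).
			return len(p), err
		}
	}
}

// nextChunkLen returns how many buffered bytes are ready to send, or 0.
func (w *modeWriter) nextChunkLen() int {
	if w.mode == "line" {
		return bytes.IndexByte(w.buf, '\n') + 1 // 0 when no newline is buffered
	}
	if len(w.buf) >= w.threshold {
		return w.threshold
	}
	return 0
}

// Close sends whatever is still buffered (a partial line or a short tail).
func (w *modeWriter) Close() error {
	if len(w.buf) == 0 {
		return nil
	}
	chunk := w.buf
	w.buf = nil
	return w.send(chunk)
}
```

In "buffer" mode a short-lived step that prints less than the threshold sends nothing until Close, which matches the delayed-output symptom described above; in "line" mode each completed line goes out as soon as it is written.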


Summary by cubic

Adds output_buffering to DAGs and Steps to control how step logs flush in remote gRPC streams and local file writers, and preserves buffered output if the gRPC stream fails (implements #2310).

  • New Features

    • output_buffering at DAG and Step (step overrides). Modes: buffer (default), line, none.
    • Runtime passes the mode via context; remote streamers and local file writers respect it. Schema updated; default remains backward compatible.
  • Bug Fixes

    • Flush and close output writers on all node exit paths (early failure, skip, panic).
    • On gRPC stream failure, keep buffered data, log the last 4KB on Close, and surface the error; chunk splitting with sequence increments only after successful sends.
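
The "sequence increments only after successful sends" behavior can be illustrated with a small sketch. Everything here is hypothetical (`chunkSender`, `flush`, `errStreamDead`, and the `send` callback are not taken from the PR); it only shows the general technique of advancing the sequence number and trimming the buffer after a send succeeds, so unsent bytes survive a stream failure.

```go
package main

import "errors"

// errStreamDead is a hypothetical sentinel for a stream that already failed.
var errStreamDead = errors.New("log stream is dead")

// chunkSender is an illustrative stand-in, not the PR's stepLogWriter.
type chunkSender struct {
	seq      uint64 // sequence number of the last chunk sent successfully
	dead     bool   // set once a send has failed
	maxChunk int    // maximum bytes per chunk; <= 0 means no splitting
	buf      []byte // bytes not yet acknowledged as sent
	send     func(seq uint64, data []byte) error
}

// flush sends buf in pieces of at most maxChunk bytes. On failure the
// unsent bytes stay in buf and seq is left untouched.
func (s *chunkSender) flush() error {
	if s.dead {
		return errStreamDead
	}
	for len(s.buf) > 0 {
		n := len(s.buf)
		if s.maxChunk > 0 && n > s.maxChunk {
			n = s.maxChunk
		}
		if err := s.send(s.seq+1, s.buf[:n]); err != nil {
			s.dead = true
			return err
		}
		// Only now is it safe to advance the sequence and drop the bytes.
		s.seq++
		s.buf = s.buf[n:]
	}
	return nil
}
```

Because the buffer is trimmed only after a successful send, a caller could still log or persist `buf` when the stream dies, which is the kind of fallback the summary describes for Close.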

Written for commit a548189. Summary will update on new commits.

Summary by CodeRabbit

  • New Features
    • Added output buffering configuration to control how step stdout/stderr is flushed to logs. Three modes available: buffer (default), line (flush on newlines), and none (immediate flush). Configurable at DAG level with per-step overrides.

@coderabbitai

coderabbitai Bot commented Jun 20, 2026

📝 Walkthrough

Introduces a three-mode outputBuffering setting (buffer, line, none) at DAG and step levels. Adds the OutputBuffering type, struct fields, JSON schema entries, a context propagation helper, new directWriter and lineBufferedWriter implementations, mode-aware local writer selection, and a reworked gRPC stepLogWriter that sends data per-write, per-line, or at the 32 KB threshold depending on the resolved mode.

Changes

Output Buffering Feature

| Layer | File(s) | Summary |
| --- | --- | --- |
| Core type, DAG/Step fields, and JSON schema | internal/core/output_buffering.go, internal/core/dag.go, internal/core/step.go, internal/cmn/schema/dag.schema.json | Defines the OutputBuffering string type with buffer/line/none constants, adds the field to DAG and Step structs (with mapstructure/JSON tags), adds EffectiveOutputBuffering resolution helper (step → DAG → default), and registers DAG-level and step-level output_buffering properties in the JSON schema. |
| Context helpers and output setup wiring | internal/runtime/context.go, internal/runtime/output.go | Adds WithOutputBuffering and GetOutputBuffering context helpers with a fallback default; wires EffectiveOutputBuffering into both setupRemoteWriters and setupLocalWriters so the resolved mode is stored in context and passed to newWriterForMode for writer selection. |
| directWriter and lineBufferedWriter implementations | internal/runtime/writer.go | Adds directWriter (mutex-protected, Flush is a no-op) and lineBufferedWriter (mutex-protected, flushes complete lines on '\n', remaining bytes on Flush). |
| gRPC stepLogWriter buffering modes and send/close rework | internal/service/worker/coordreport/log_streamer.go | Updates NewStepWriter to read buffering mode from context and set lineBuffered/unbuffered flags; reworks Write to send immediately (none), flush on newlines (line), or buffer to threshold (buffer); updates sendChunk to mark stream dead on error and update sequence number only after success; updates Close to call sendChunk directly for remaining bytes. |
| Tests for all buffering paths | internal/runtime/output_test.go, internal/runtime/writer.go, internal/service/worker/coordreport/export_test.go, internal/service/worker/coordreport/log_streamer_test.go | Adds tests for newWriterForMode type selection, lineBufferedWriter and directWriter semantics, TestOutputCoordinator_SetupLocalWriters for all modes, snapshot fields for black-box test access, and comprehensive log_streamer_test.go cases covering all three modes, error propagation, backward compatibility, and Close behavior. |
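
The context helpers summarized in the table could look roughly like the sketch below. The function names WithOutputBuffering and GetOutputBuffering come from the walkthrough, but their signatures, the unexported key type, and the `demoContextRoundTrip` helper are assumptions made for this example.

```go
package main

import "context"

// OutputBuffering and its constants are re-declared so the sketch is
// self-contained; the constant names are assumptions.
type OutputBuffering string

const (
	OutputBufferingBuffer OutputBuffering = "buffer"
	OutputBufferingLine   OutputBuffering = "line"
	OutputBufferingNone   OutputBuffering = "none"
)

// outputBufferingKey is an unexported key type, which avoids collisions
// with context values stored by other packages.
type outputBufferingKey struct{}

// WithOutputBuffering returns a child context carrying the mode.
func WithOutputBuffering(ctx context.Context, mode OutputBuffering) context.Context {
	return context.WithValue(ctx, outputBufferingKey{}, mode)
}

// GetOutputBuffering reads the mode, falling back to "buffer" when the
// context carries no value (or an empty one).
func GetOutputBuffering(ctx context.Context) OutputBuffering {
	if mode, ok := ctx.Value(outputBufferingKey{}).(OutputBuffering); ok && mode != "" {
		return mode
	}
	return OutputBufferingBuffer
}

// demoContextRoundTrip is a hypothetical helper showing both paths: the
// fallback for a bare context and the stored value for a derived one.
func demoContextRoundTrip() (unset, set OutputBuffering) {
	base := context.Background()
	unset = GetOutputBuffering(base)
	set = GetOutputBuffering(WithOutputBuffering(base, OutputBufferingLine))
	return unset, set
}
```

The fallback in GetOutputBuffering is what lets code paths that never call WithOutputBuffering keep the existing buffered behavior.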

Sequence Diagram(s)

sequenceDiagram
  participant StepExec as Step Executor
  participant EffOB as EffectiveOutputBuffering
  participant Ctx as context.Context
  participant newWFM as newWriterForMode
  participant StepWriter as stepLogWriter (gRPC)
  participant LocalWriter as local file writer

  StepExec->>EffOB: resolve(dag, step)
  EffOB-->>StepExec: mode (buffer|line|none)
  StepExec->>Ctx: WithOutputBuffering(ctx, mode)

  alt Remote (gRPC) path
    StepExec->>StepWriter: NewStepWriter(ctx, ...)
    StepWriter->>Ctx: GetOutputBuffering(ctx)
    Ctx-->>StepWriter: mode
    StepWriter->>StepWriter: set lineBuffered / unbuffered flags
    loop Write(p)
      StepWriter->>StepWriter: none → sendChunk immediately
      StepWriter->>StepWriter: line → scan newlines, sendChunk per line
      StepWriter->>StepWriter: buffer → accumulate, sendChunk at 32KB
    end
    StepWriter->>StepWriter: Close() → sendChunk remaining bytes
  end

  alt Local file path
    StepExec->>newWFM: newWriterForMode(fileWriter, mode)
    newWFM-->>LocalWriter: directWriter | lineBufferedWriter | safeBufferedWriter
  end
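
The local-file branch of the diagram selects one of three writers by mode. A rough sketch of that selection is shown below. The type names follow the walkthrough, but the `flushWriter` interface, the string-typed mode, the field layouts, and the `visibleBeforeFlush` helper are assumptions; the real implementations may differ (for instance, this lineBufferedWriter forwards all complete lines in a single underlying Write).

```go
package main

import (
	"bufio"
	"bytes"
	"io"
	"sync"
)

// flushWriter is an assumed interface; the PR does not show the real one.
type flushWriter interface {
	io.Writer
	Flush() error
}

// directWriter passes every Write straight through; Flush is a no-op.
type directWriter struct {
	mu sync.Mutex
	w  io.Writer
}

func (d *directWriter) Write(p []byte) (int, error) {
	d.mu.Lock()
	defer d.mu.Unlock()
	return d.w.Write(p)
}

func (d *directWriter) Flush() error { return nil }

// lineBufferedWriter forwards complete lines and holds back a partial line.
type lineBufferedWriter struct {
	mu  sync.Mutex
	w   io.Writer
	buf []byte
}

func (l *lineBufferedWriter) Write(p []byte) (int, error) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.buf = append(l.buf, p...)
	idx := bytes.LastIndexByte(l.buf, '\n')
	if idx < 0 {
		return len(p), nil
	}
	_, err := l.w.Write(l.buf[:idx+1])
	l.buf = append(l.buf[:0], l.buf[idx+1:]...) // keep only the partial line
	return len(p), err
}

func (l *lineBufferedWriter) Flush() error {
	l.mu.Lock()
	defer l.mu.Unlock()
	if len(l.buf) == 0 {
		return nil
	}
	_, err := l.w.Write(l.buf)
	l.buf = l.buf[:0]
	return err
}

// safeBufferedWriter guards a bufio.Writer with a mutex.
type safeBufferedWriter struct {
	mu sync.Mutex
	bw *bufio.Writer
}

func (s *safeBufferedWriter) Write(p []byte) (int, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.bw.Write(p)
}

func (s *safeBufferedWriter) Flush() error {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.bw.Flush()
}

// newWriterForMode picks a writer; unknown modes fall back to buffering.
func newWriterForMode(w io.Writer, mode string) flushWriter {
	switch mode {
	case "none":
		return &directWriter{w: w}
	case "line":
		return &lineBufferedWriter{w: w}
	default:
		return &safeBufferedWriter{bw: bufio.NewWriter(w)}
	}
}

// visibleBeforeFlush is a hypothetical helper: it reports what has reached
// the sink after one Write, before any Flush.
func visibleBeforeFlush(mode, input string) string {
	var sink bytes.Buffer
	w := newWriterForMode(&sink, mode)
	_, _ = w.Write([]byte(input))
	return sink.String()
}
```

Calling `visibleBeforeFlush` with the same short input under each mode shows the practical difference: "none" exposes everything, "line" exposes only completed lines, and "buffer" exposes nothing until Flush.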

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The PR title 'feat: add outputBuffering option for step log streaming' accurately and concisely describes the main change—adding an outputBuffering configuration option to control step log streaming behavior. |
| Linked Issues check | ✅ Passed | The PR fully implements all requirements from issue #2310: adds outputBuffering field to DAG and Step; implements three modes (buffer, line, none); maintains backward compatibility; wires mode through both remote and local writers; includes comprehensive tests. |
| Out of Scope Changes check | ✅ Passed | All changes directly support the outputBuffering feature from issue #2310. The modifications to schema, core types, runtime context, writer implementations, and tests are all within scope of implementing the buffering modes. |
| Description check | ✅ Passed | The PR description provides a clear summary, comprehensive change breakdown, implementation details, backward compatibility assurance, and test coverage information. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

@four-bytes-robby

Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 20, 2026

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (2)
internal/runtime/output.go (1)

382-383: ⚡ Quick win

Remove the no-op context write in local writer setup.

Line 383 discards the returned context, so this call has no effect and suggests propagation that never happens.

Suggested fix
-	// Store mode in context for consistency with setupRemoteWriters
-	_ = WithOutputBuffering(ctx, mode)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/runtime/output.go` around lines 382 - 383, The call to
WithOutputBuffering(ctx, mode) on line 383 discards the returned context by
assigning it to the blank identifier, making this a no-op that has no actual
effect. Remove this entire line since the context propagation it appears to
perform never actually happens - either the operation is unnecessary for local
writer setup or the returned context should be captured and used in the
function's subsequent logic.
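
Why the discarded call is a no-op: context.WithValue never modifies the context it is given; it returns a new child context. The sketch below demonstrates this with hypothetical helpers (`withMode`, `modeOf`, `discardedVersusCaptured`) that stand in for the PR's WithOutputBuffering and GetOutputBuffering.

```go
package main

import "context"

// modeKey is an unexported key type for the illustrative value.
type modeKey struct{}

// withMode stands in for WithOutputBuffering in this sketch.
func withMode(ctx context.Context, mode string) context.Context {
	return context.WithValue(ctx, modeKey{}, mode)
}

// modeOf returns the stored mode, or "" when none is present.
func modeOf(ctx context.Context) string {
	mode, _ := ctx.Value(modeKey{}).(string)
	return mode
}

// discardedVersusCaptured contrasts dropping the returned context with
// reassigning it.
func discardedVersusCaptured() (discarded, captured string) {
	ctx := context.Background()

	_ = withMode(ctx, "line") // returned context dropped: ctx is unchanged
	discarded = modeOf(ctx)

	ctx = withMode(ctx, "line") // returned context kept: value is visible
	captured = modeOf(ctx)
	return discarded, captured
}
```

So the flagged line either needs its result assigned and passed on, or it can simply be deleted, as the suggested fix does.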
internal/runtime/writer.go (1)

127-135: ⚡ Quick win

Consider using chunked newline detection instead of byte-by-byte iteration.

The byte-by-byte loop with individual appends can trigger many reallocations for large writes. The stepLogWriter.Write implementation (log_streamer.go:232-244) demonstrates a more efficient pattern using bytes.IndexByte to find newlines in chunks, then slicing the buffer in bulk.

For typical line-buffered use cases (CLI output, logs), this may not be a bottleneck, but the chunked approach would scale better for high-throughput scenarios.

⚡ Example refactor using chunked detection
 func (lw *lineBufferedWriter) Write(p []byte) (int, error) {
 	lw.mu.Lock()
 	defer lw.mu.Unlock()
 
-	for _, b := range p {
-		lw.buf = append(lw.buf, b)
-		if b == '\n' {
+	lw.buf = append(lw.buf, p...)
+	for {
+		idx := bytes.IndexByte(lw.buf, '\n')
+		if idx < 0 {
+			break
+		}
+		line := lw.buf[:idx+1]
+		lw.buf = lw.buf[idx+1:]
+		if _, err := lw.w.Write(line); err != nil {
+			return len(p), err
-			if _, err := lw.w.Write(lw.buf); err != nil {
-				return 0, err
-			}
-			lw.buf = lw.buf[:0]
 		}
 	}
 	return len(p), nil
 }

This also fixes the error-return issue mentioned in the previous comment.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/runtime/writer.go` around lines 127 - 135, The byte-by-byte
iteration in the loop over p with individual appends to lw.buf causes many
buffer reallocations for large writes. Refactor the loop to use bytes.IndexByte
to find newlines in chunks rather than checking each byte individually, then
slice and append the buffer content in bulk segments between newlines. Follow
the pattern demonstrated in the stepLogWriter.Write implementation (referenced
at log_streamer.go lines 232-244) which shows how to efficiently handle chunked
newline detection and bulk buffer management instead of per-byte operations.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/cmn/schema/dag.schema.json`:
- Line 1768: The description text on line 1768 references "outputBuffering" in
camelCase, but the actual schema field name uses snake_case as
"output_buffering". Update the description string to use the correct field name
"output_buffering" instead of "outputBuffering" to match the actual schema field
and prevent confusion for config authors reading the schema documentation.

In `@internal/runtime/writer.go`:
- Around line 130-131: The error handling in the Write method violates the
io.Writer contract by returning (0, err) when the underlying write operation
fails. Since all input bytes have already been consumed into the buffer before
the flush attempt, the function should return (len(p), err) instead of (0, err)
when lw.w.Write(lw.buf) fails at line 130. This indicates to the caller that all
input bytes were accepted into the buffer, even though the flush operation
failed, consistent with the pattern used in stepLogWriter.Write from
log_streamer.go.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c398dfa3-425d-4d4f-8ee1-80390a88d8d8

📥 Commits

Reviewing files that changed from the base of the PR and between 9b6a7ff and ce1513b.

📒 Files selected for processing (11)
  • internal/cmn/schema/dag.schema.json
  • internal/core/dag.go
  • internal/core/output_buffering.go
  • internal/core/step.go
  • internal/runtime/context.go
  • internal/runtime/output.go
  • internal/runtime/output_test.go
  • internal/runtime/writer.go
  • internal/service/worker/coordreport/export_test.go
  • internal/service/worker/coordreport/log_streamer.go
  • internal/service/worker/coordreport/log_streamer_test.go

Comment thread internal/cmn/schema/dag.schema.json Outdated
Comment thread internal/runtime/writer.go Outdated
@four-bytes-robby

Contributor Author

@coderabbitai Fixed in ddca497.

@coderabbitai

coderabbitai Bot commented Jun 20, 2026

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check what commit ddca497 contains
gh pr view 2311 --repo dagucloud/dagu --json commits | jq '.commits[] | select(.oid | startswith("ddca497"))'

Length of output: 653


🏁 Script executed:

#!/bin/bash
# Get recent review comments from CodeRabbit on this PR
gh api repos/dagucloud/dagu/pulls/2311/comments --jq '.[] | select(.user.login == "coderabbitai[bot]") | {id: .id, path: .path, body: .body[:200], line: .line}'

Length of output: 844



@four-bytes-robby: Thanks for the fix! Commit ddca497 looks like it addresses all three issues — the schema description key name (output_buffering), the io.Writer contract violation in writer.go, and the no-op context. Appreciate the quick turnaround! 🎉


@four-bytes-robby

Contributor Author

@coderabbitai Fixed in ddca497. (reply to comment on internal/cmn/schema/dag.schema.json:3447615306)

@four-bytes-robby

Contributor Author

@coderabbitai Fixed in ddca497. (reply to comment on internal/runtime/writer.go:3447615308)

@coderabbitai

coderabbitai Bot commented Jun 20, 2026

@four-bytes-robby: Thanks for confirming! The io.Writer contract fix in internal/runtime/writer.go from commit ddca497 looks good. 🎉

@coderabbitai

coderabbitai Bot commented Jun 20, 2026

@four-bytes-robby: Thanks for fixing the schema description key name for output_buffering in commit ddca497! 🎉

@four-bytes-robby

Contributor Author

@coderabbitai All nitpicks from review #4538924054 fixed:

  • schema.json outputBuffering → output_buffering description: ddca497
  • writer.go io.Writer contract (0, err) → (len(p), err): ddca497
  • output.go no-op context write removed: ddca497
  • writer.go byte-by-byte → chunked newline detection: 5eb34eb

@coderabbitai

coderabbitai Bot commented Jun 20, 2026

@four-bytes-robby: All four fixes confirmed — great work! The chunked newline detection in 5eb34eb is a nice improvement over byte-by-byte scanning, as it reduces scanning overhead for larger writes. The PR looks solid now. 🎉

@yohamta0 marked this pull request as ready for review June 21, 2026 06:56

@four-bytes-robby

Contributor Author

@yohamta0 Let's first keep this running in production for a while to confirm there are no further regressions. I also added a fix for the finish/teardown phase of a DAG run (e.g. when an error occurs) so the buffer is flushed even when the 32KB threshold has not been reached.
