@yottahmd yottahmd commented Jan 13, 2026

Summary by CodeRabbit

Release Notes

  • New Features

    • Added shared-nothing worker mode for distributed deployments.
    • Improved remote progress display with terminal width awareness and worker ID truncation.
  • Bug Fixes

    • Enhanced error handling for lost coordinator connections with consecutive error tracking.
    • Added file sync for improved data durability in status writes.
  • Configuration

    • Added configurable retry behavior for coordinator connections (MaxRetries and RetryInterval with exponential backoff).
  • Style

    • Removed excess padding from table containers in DAG details UI.

coderabbitai bot commented Jan 13, 2026

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

The PR implements shared-nothing worker mode, refactors the coordinator handler API from functional options to struct-based config, extends retry configuration for coordinator connections, improves remote progress display with terminal width handling, adds file sync durability to DAG writes, and includes error tracking for sub-DAG polling with consecutive failure thresholds.

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| Shared-nothing worker implementation | internal/cmd/context.go | Introduces isSharedNothingWorker() detector and early-return context for workers with static coordinators; skips local store initialization (DAGRunStore, ProcStore, QueueStore) and enables coordinator retry config application. |
| Coordinator handler API refactoring | internal/cmd/coord.go, internal/service/coordinator/handler.go, internal/service/coordinator/handler_test.go, internal/test/coordinator.go | Migrates from variadic HandlerOption pattern to structured HandlerConfig with fields (DAGRunStore, LogDir, StaleHeartbeatThreshold); updates all NewHandler() call sites and adds initial status writing for attempt creation. |
| Configuration schema extensions | internal/cmn/config/config.go, internal/cmn/config/definition.go, internal/cmn/config/loader.go | Adds MaxRetries and RetryInterval fields to Peer config; introduces loadPeerConfig() helper for parsing and constructing Peer configuration from PeerDef. |
| Worker remote handling | internal/service/worker/remote_handler.go, internal/service/worker/handler.go | Removes local DAGs directory loading from remote handler; simplifies original target logging; aligns with shared-nothing worker receiving DAG definitions from coordinator. |
| Remote progress display improvements | internal/cmd/progress_remote.go | Adds terminal width awareness to RemoteProgressDisplay; implements truncateWorkerDisplay() and truncateWorkerID() helpers for width-aware header and final-line rendering with ANSI sequences. |
| Data persistence and error tracking | internal/persis/filedagrun/writer.go, internal/runtime/executor/dag_runner.go | Adds file.Sync() after JSON writes for durability; introduces max consecutive error threshold (10) with counter for sub-DAG status polling, returns error on connection loss after retries. |
| Test infrastructure updates | internal/service/worker/worker_test.go | Adds Definition field (YAML string) to Task proto-generated structs across multiple test cases; updates task literals in dispatch, comparison, and mock response scenarios. |
| UI styling adjustments | ui/src/features/dags/components/dag-details/DAGDetailsPanel.tsx, DAGStepTable.tsx, NodeStatusTable.tsx | Removes p-px padding utility from container classNames in multiple components; adds overflow-x-hidden to DAGDetailsPanel main content container. |

Sequence Diagram(s)

sequenceDiagram
    participant Worker
    participant Context
    participant CoordinatorClient
    participant Coordinator
    participant TaskPayload

    Worker->>Context: NewContext(cmd, config)
    Context->>Context: isSharedNothingWorker()?
    alt Shared-nothing Mode
        Context->>Context: Skip local store init
        Context->>Context: Apply retry config
        Context-->>Worker: Return context (nil stores)
        Worker->>TaskPayload: Extract DAG definition
        Worker->>CoordinatorClient: Send status updates
        CoordinatorClient->>Coordinator: Push status & logs
    else Standard Mode
        Context->>Context: Initialize all local stores
        Context-->>Worker: Return context (ready stores)
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~70 minutes

Possibly related PRs

  • feat: shared-nothing worker #1564: Directly implements the same shared-nothing worker feature with coordinator-related code path changes (context detection, handler config API, retry fields, status streaming).
  • feat(core): waiting status #1554: Refactors coordinator handler API from functional options to HandlerConfig struct, which mirrors the same API migration pattern in this PR.
  • worker: fix problems in shared nothing worker #1573: Covers overlapping shared-nothing worker and coordinator changes including context early-return logic, handler config updates, and remote progress handling.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 28.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main objective of the PR, which includes adding shared-nothing worker mode support, configurable retry behavior for coordinator connections, error tracking for status polling, and various stability improvements across distributed execution components. |


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
internal/cmn/config/loader.go (1)

293-311: Consider logging a warning on RetryInterval parse failure for consistency.

Other duration fields in this file (e.g., LockStaleThreshold at lines 740-744) log warnings when parsing fails. The silent failure here is inconsistent and could make configuration issues harder to diagnose.

♻️ Suggested fix
 	if def.RetryInterval != "" {
 		if d, err := time.ParseDuration(def.RetryInterval); err == nil {
 			peer.RetryInterval = d
+		} else {
+			l.warnings = append(l.warnings, fmt.Sprintf("Invalid peer.retryInterval value: %s", def.RetryInterval))
 		}
 	}
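
The suggestion above can be sketched in isolation as follows. This is a minimal illustration, not the repository's loader: peerDef, peer, and loadPeer are hypothetical stand-ins for PeerDef, Peer, and loadPeerConfig(), and warnings are returned as a slice rather than appended to a loader field.

```go
package main

import (
	"fmt"
	"time"
)

// peerDef and peer are hypothetical stand-ins for the PR's PeerDef and Peer.
type peerDef struct {
	RetryInterval string // raw value from the config file, e.g. "2s"
}

type peer struct {
	RetryInterval time.Duration
}

// loadPeer parses the raw definition and reports an invalid duration as a
// warning instead of dropping it silently.
func loadPeer(def peerDef) (peer, []string) {
	var p peer
	var warnings []string
	if def.RetryInterval == "" {
		return p, warnings
	}
	d, err := time.ParseDuration(def.RetryInterval)
	if err != nil {
		warnings = append(warnings,
			fmt.Sprintf("Invalid peer.retryInterval value: %s", def.RetryInterval))
		return p, warnings
	}
	p.RetryInterval = d
	return p, warnings
}

func main() {
	for _, raw := range []string{"2s", "two seconds", ""} {
		p, warnings := loadPeer(peerDef{RetryInterval: raw})
		fmt.Printf("input=%q interval=%v warnings=%d\n", raw, p.RetryInterval, len(warnings))
	}
}
```

An unparsable value leaves RetryInterval at its zero value, so whatever default applies downstream still takes effect while the operator sees why the setting was ignored.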
internal/cmd/progress_remote.go (1)

251-263: Edge case: truncation may cut into the arrow prefix.

When availableWidth is small (e.g., 5-7), the truncation at line 260 could cut into the " → " prefix, resulting in malformed output like " →…" or " …".

Consider ensuring minimum width accounts for the arrow:

♻️ Suggested improvement
 func (p *RemoteProgressDisplay) truncateWorkerID(availableWidth int) string {
-	if p.workerID == "" || availableWidth <= 5 {
+	// Arrow " → " is 3 chars + at least 1 char of ID + ellipsis = 5 minimum
+	const arrowLen = 4 // " → " with space
+	if p.workerID == "" || availableWidth < arrowLen+2 {
 		return ""
 	}
 
 	workerSuffix := " → " + p.workerID
 	if len(workerSuffix) > availableWidth {
-		return workerSuffix[:availableWidth-1] + "…"
+		// Ensure we keep the arrow and truncate only the worker ID part
+		maxIDLen := availableWidth - arrowLen - 1 // -1 for ellipsis
+		if maxIDLen <= 0 {
+			return ""
+		}
+		return " → " + p.workerID[:maxIDLen] + "…"
 	}
 	return workerSuffix
 }
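
One detail worth noting when applying a fix like this: in Go, len() and string slicing operate on bytes, and " → " is 5 bytes in UTF-8 but only 3 runes. The sketch below is a hypothetical rune-based variant (truncateWorkerSuffix is not a function from the PR) that keeps the arrow intact and truncates only the worker ID. It assumes every rune occupies one terminal column, which does not hold for wide characters.

```go
package main

import "fmt"

// truncateWorkerSuffix is a hypothetical, rune-aware take on the helper
// discussed above. It returns " → <id>" cut to at most availableWidth runes,
// or "" when there is no room for the arrow, one ID rune, and an ellipsis.
func truncateWorkerSuffix(workerID string, availableWidth int) string {
	const arrow = " → "
	arrowWidth := len([]rune(arrow)) // 3 runes, although 5 bytes in UTF-8

	if workerID == "" || availableWidth < arrowWidth+2 {
		return ""
	}

	id := []rune(workerID)
	if arrowWidth+len(id) <= availableWidth {
		return arrow + workerID
	}

	// Reserve one column for the ellipsis and keep the arrow whole.
	keep := availableWidth - arrowWidth - 1
	return arrow + string(id[:keep]) + "…"
}

func main() {
	for _, width := range []int{4, 8, 20} {
		fmt.Printf("width=%d suffix=%q\n", width, truncateWorkerSuffix("worker-123456", width))
	}
}
```

Slicing the rune slice instead of the string avoids cutting a multi-byte character in half, which the byte-based slice in both versions of the diff could do for non-ASCII worker IDs.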
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 732a891 and 71aa451.

📒 Files selected for processing (17)
  • internal/cmd/context.go
  • internal/cmd/coord.go
  • internal/cmd/progress_remote.go
  • internal/cmn/config/config.go
  • internal/cmn/config/definition.go
  • internal/cmn/config/loader.go
  • internal/persis/filedagrun/writer.go
  • internal/runtime/executor/dag_runner.go
  • internal/service/coordinator/handler.go
  • internal/service/coordinator/handler_test.go
  • internal/service/worker/handler.go
  • internal/service/worker/remote_handler.go
  • internal/service/worker/worker_test.go
  • internal/test/coordinator.go
  • ui/src/features/dags/components/dag-details/DAGDetailsPanel.tsx
  • ui/src/features/dags/components/dag-details/DAGStepTable.tsx
  • ui/src/features/dags/components/dag-details/NodeStatusTable.tsx
🧰 Additional context used
📓 Path-based instructions (5)
ui/**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

ui/**/*.{ts,tsx}: The React + TypeScript frontend resides in ui/, with production bundles copied to internal/service/frontend/assets by make ui
UI code follows ESLint + Prettier (2-space indent) and Tailwind utilities; name React components in PascalCase (JobList.tsx) and hooks with use* (useJobs.ts)

Files:

  • ui/src/features/dags/components/dag-details/DAGDetailsPanel.tsx
  • ui/src/features/dags/components/dag-details/NodeStatusTable.tsx
  • ui/src/features/dags/components/dag-details/DAGStepTable.tsx
ui/**/*.{tsx,ts,jsx,js}

📄 CodeRabbit inference engine (ui/CLAUDE.md)

Avoid full-page loading overlays and LoadingIndicator components that hide content - show stale data while fetching updates instead

Files:

  • ui/src/features/dags/components/dag-details/DAGDetailsPanel.tsx
  • ui/src/features/dags/components/dag-details/NodeStatusTable.tsx
  • ui/src/features/dags/components/dag-details/DAGStepTable.tsx
ui/**/*.{tsx,jsx}

📄 CodeRabbit inference engine (ui/CLAUDE.md)

ui/**/*.{tsx,jsx}: Keep modal headers small and information-dense with minimal padding (e.g., p-2 or p-3 instead of p-4 or p-6)
Use compact form element heights: select boxes with h-7 or smaller, buttons with h-7 or h-8, inputs with compact padding (py-0.5 or py-1)
Minimize table and list row heights while maintaining readability, merge related columns to save space, and handle long text with whitespace-normal break-words
Use consistent metadata styling with uniform backgrounds (e.g., bg-slate-200 dark:bg-slate-700) and text hierarchy using size/weight over color variation
Use flexbox-first layouts with min-h-0 and overflow-hidden to prevent layout breaks, and account for fixed elements when setting heights
Support keyboard navigation in all interactive components including modals with arrow keys, enter, and escape keys
Avoid auto-focusing first items in modals unless it makes sense for the specific use case
Maintain sufficient color contrast in both light and dark modes, use proper ARIA labels, and ensure text remains readable at smaller sizes
Use transparent backgrounds in navigation elements and keep navigation components small and unobtrusive
Avoid two-line displays for single metadata items, excessive whitespace between elements, decorative elements without purpose, and modals that take up excessive screen space

Files:

  • ui/src/features/dags/components/dag-details/DAGDetailsPanel.tsx
  • ui/src/features/dags/components/dag-details/NodeStatusTable.tsx
  • ui/src/features/dags/components/dag-details/DAGStepTable.tsx
**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

**/*.go: Backend entrypoint in cmd/ orchestrates the scheduler and CLI; runtime, persistence, and service layers sit under internal/* (for example internal/runtime, internal/persistence)
Keep Go files gofmt/goimports clean; use tabs, PascalCase for exported symbols (SchedulerClient), lowerCamelCase for locals, and Err... names for package-level errors
Repository linting relies on golangci-lint; prefer idiomatic Go patterns, minimal global state, and structured logging helpers in internal/common

Files:

  • internal/runtime/executor/dag_runner.go
  • internal/cmn/config/config.go
  • internal/service/worker/remote_handler.go
  • internal/service/worker/handler.go
  • internal/cmn/config/loader.go
  • internal/persis/filedagrun/writer.go
  • internal/service/worker/worker_test.go
  • internal/test/coordinator.go
  • internal/cmd/coord.go
  • internal/cmn/config/definition.go
  • internal/cmd/context.go
  • internal/service/coordinator/handler_test.go
  • internal/cmd/progress_remote.go
  • internal/service/coordinator/handler.go
**/*_test.go

📄 CodeRabbit inference engine (AGENTS.md)

**/*_test.go: Co-locate Go tests as *_test.go; favour table-driven cases and cover failure paths
Use stretchr/testify/require and shared fixtures from internal/test instead of duplicating mocks

Files:

  • internal/service/worker/worker_test.go
  • internal/service/coordinator/handler_test.go
🧠 Learnings (8)
📚 Learning: 2026-01-12T16:59:14.740Z
Learnt from: CR
Repo: dagu-org/dagu PR: 0
File: ui/CLAUDE.md:0-0
Timestamp: 2026-01-12T16:59:14.740Z
Learning: Applies to ui/**/*.{tsx,jsx} : Use flexbox-first layouts with `min-h-0` and `overflow-hidden` to prevent layout breaks, and account for fixed elements when setting heights

Applied to files:

  • ui/src/features/dags/components/dag-details/DAGDetailsPanel.tsx
  • ui/src/features/dags/components/dag-details/NodeStatusTable.tsx
📚 Learning: 2026-01-12T16:59:14.740Z
Learnt from: CR
Repo: dagu-org/dagu PR: 0
File: ui/CLAUDE.md:0-0
Timestamp: 2026-01-12T16:59:14.740Z
Learning: Applies to ui/**/*.{tsx,jsx} : Minimize table and list row heights while maintaining readability, merge related columns to save space, and handle long text with `whitespace-normal break-words`

Applied to files:

  • ui/src/features/dags/components/dag-details/DAGDetailsPanel.tsx
  • ui/src/features/dags/components/dag-details/NodeStatusTable.tsx
  • ui/src/features/dags/components/dag-details/DAGStepTable.tsx
📚 Learning: 2026-01-12T16:59:14.740Z
Learnt from: CR
Repo: dagu-org/dagu PR: 0
File: ui/CLAUDE.md:0-0
Timestamp: 2026-01-12T16:59:14.740Z
Learning: Applies to ui/**/*.{tsx,jsx} : Avoid two-line displays for single metadata items, excessive whitespace between elements, decorative elements without purpose, and modals that take up excessive screen space

Applied to files:

  • ui/src/features/dags/components/dag-details/DAGDetailsPanel.tsx
  • ui/src/features/dags/components/dag-details/NodeStatusTable.tsx
  • ui/src/features/dags/components/dag-details/DAGStepTable.tsx
📚 Learning: 2026-01-12T16:59:14.740Z
Learnt from: CR
Repo: dagu-org/dagu PR: 0
File: ui/CLAUDE.md:0-0
Timestamp: 2026-01-12T16:59:14.740Z
Learning: Applies to ui/**/*.{tsx,jsx} : Use transparent backgrounds in navigation elements and keep navigation components small and unobtrusive

Applied to files:

  • ui/src/features/dags/components/dag-details/DAGDetailsPanel.tsx
  • ui/src/features/dags/components/dag-details/NodeStatusTable.tsx
📚 Learning: 2026-01-12T16:59:14.740Z
Learnt from: CR
Repo: dagu-org/dagu PR: 0
File: ui/CLAUDE.md:0-0
Timestamp: 2026-01-12T16:59:14.740Z
Learning: Applies to ui/**/*.{tsx,ts,jsx,js} : Avoid full-page loading overlays and LoadingIndicator components that hide content - show stale data while fetching updates instead

Applied to files:

  • ui/src/features/dags/components/dag-details/DAGDetailsPanel.tsx
📚 Learning: 2026-01-12T16:59:14.740Z
Learnt from: CR
Repo: dagu-org/dagu PR: 0
File: ui/CLAUDE.md:0-0
Timestamp: 2026-01-12T16:59:14.740Z
Learning: Applies to ui/**/*.{tsx,jsx} : Keep modal headers small and information-dense with minimal padding (e.g., `p-2` or `p-3` instead of `p-4` or `p-6`)

Applied to files:

  • ui/src/features/dags/components/dag-details/NodeStatusTable.tsx
  • ui/src/features/dags/components/dag-details/DAGStepTable.tsx
📚 Learning: 2026-01-12T16:59:14.740Z
Learnt from: CR
Repo: dagu-org/dagu PR: 0
File: ui/CLAUDE.md:0-0
Timestamp: 2026-01-12T16:59:14.740Z
Learning: Applies to ui/**/*.{tsx,jsx} : Use consistent metadata styling with uniform backgrounds (e.g., `bg-slate-200 dark:bg-slate-700`) and text hierarchy using size/weight over color variation

Applied to files:

  • ui/src/features/dags/components/dag-details/NodeStatusTable.tsx
  • ui/src/features/dags/components/dag-details/DAGStepTable.tsx
📚 Learning: 2026-01-12T16:59:14.740Z
Learnt from: CR
Repo: dagu-org/dagu PR: 0
File: ui/CLAUDE.md:0-0
Timestamp: 2026-01-12T16:59:14.740Z
Learning: Applies to ui/**/*.{tsx,jsx} : Use compact form element heights: select boxes with `h-7` or smaller, buttons with `h-7` or `h-8`, inputs with compact padding (`py-0.5` or `py-1`)

Applied to files:

  • ui/src/features/dags/components/dag-details/NodeStatusTable.tsx
  • ui/src/features/dags/components/dag-details/DAGStepTable.tsx
🧬 Code graph analysis (7)
internal/runtime/executor/dag_runner.go (1)
api/v2/api.gen.go (1)
  • Error (682-691)
internal/service/worker/remote_handler.go (1)
internal/core/spec/loader.go (2)
  • LoadOption (41-41)
  • WithBaseConfig (44-48)
internal/test/coordinator.go (4)
internal/service/coordinator/handler.go (2)
  • HandlerConfig (72-84)
  • NewHandler (94-105)
internal/test/helper.go (2)
  • WithStatusPersistence (84-88)
  • WithLogPersistence (92-96)
internal/core/exec/dagrun.go (1)
  • DAGRunStore (23-52)
internal/runtime/runner.go (1)
  • Config (96-110)
internal/cmd/coord.go (2)
internal/service/coordinator/handler.go (2)
  • NewHandler (94-105)
  • HandlerConfig (72-84)
internal/core/exec/dagrun.go (1)
  • DAGRunStore (23-52)
internal/cmd/context.go (1)
internal/cmn/config/config.go (3)
  • Worker (336-341)
  • Core (57-85)
  • Peer (363-387)
internal/service/coordinator/handler_test.go (1)
internal/service/coordinator/handler.go (2)
  • NewHandler (94-105)
  • HandlerConfig (72-84)
internal/service/coordinator/handler.go (3)
internal/core/exec/dagrun.go (1)
  • DAGRunStore (23-52)
proto/coordinator/v1/coordinator.pb.go (3)
  • Task (368-389)
  • Task (402-402)
  • Task (417-419)
internal/core/status.go (1)
  • NotStarted (7-7)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Test on ubuntu-latest
  • GitHub Check: Build
🔇 Additional comments (22)
ui/src/features/dags/components/dag-details/DAGStepTable.tsx (1)

36-36: LGTM!

Removing the p-px padding aligns with the broader overflow handling refinements in this PR. The 1px padding was unnecessary for this table container and its removal helps achieve consistent spacing. Based on coding guidelines, this avoids excessive whitespace between elements.

ui/src/features/dags/components/dag-details/DAGDetailsPanel.tsx (1)

222-222: LGTM - Proper overflow containment.

Adding overflow-x-hidden at the panel level creates a clean overflow boundary while child tables (with overflow-x-auto) maintain their own horizontal scrolling capability. This follows the coding guideline to use overflow-hidden to prevent layout breaks. The layered approach ensures horizontal scroll is contained within the intended table areas.

ui/src/features/dags/components/dag-details/NodeStatusTable.tsx (1)

46-46: LGTM!

Consistent with the p-px removal in DAGStepTable.tsx. The horizontal scroll capability is preserved via overflow-x-auto, and the inner min-w-[900px] constraint ensures proper table rendering at intermediate viewport sizes.

internal/runtime/executor/dag_runner.go (2)

365-367: LGTM! Good addition of consecutive error tracking.

The introduction of maxConsecutiveErrors and the counter improves resilience by detecting when the coordinator becomes unreachable, preventing indefinite polling.

One consideration: if network conditions vary significantly across deployments, this threshold could be made configurable (e.g., via an option or environment variable). However, 10 consecutive failures at 1-second intervals (~10 seconds) is a reasonable default for most scenarios.


426-436: LGTM! Error handling logic is correct.

The implementation properly:

  1. Increments the counter on each failure
  2. Logs a warning with the current consecutive error count for observability
  3. Returns an informative error when the threshold is reached (wrapping the original error)
  4. Resets the counter on success to avoid penalizing transient failures
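
The four points above can be sketched as a small polling loop. This is an illustration under assumptions, not the code in dag_runner.go: pollUntilDone, errUnavailable, and the maxPolls budget are hypothetical, the warning log is omitted, and the wait between polls is left out so the example runs instantly.

```go
package main

import (
	"errors"
	"fmt"
)

// maxConsecutiveErrors mirrors the threshold of 10 mentioned in the review.
const maxConsecutiveErrors = 10

// errUnavailable is a hypothetical error standing in for a failed status call.
var errUnavailable = errors.New("coordinator unavailable")

// pollUntilDone calls fetch until it reports completion. It gives up once
// maxConsecutiveErrors failures occur in a row; any success resets the count.
func pollUntilDone(fetch func() (done bool, err error), maxPolls int) error {
	consecutive := 0
	for i := 0; i < maxPolls; i++ {
		done, err := fetch()
		if err != nil {
			consecutive++
			if consecutive >= maxConsecutiveErrors {
				return fmt.Errorf("lost connection after %d consecutive errors: %w", consecutive, err)
			}
			continue
		}
		consecutive = 0 // a success clears earlier transient failures
		if done {
			return nil
		}
	}
	return errors.New("polling budget exhausted")
}

func main() {
	calls := 0
	err := pollUntilDone(func() (bool, error) {
		calls++
		return false, errUnavailable
	}, 100)
	fmt.Printf("gave up after %d calls: %v\n", calls, err)
}
```

Wrapping the last error with %w keeps the original cause available to errors.Is and errors.As in the caller.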
internal/persis/filedagrun/writer.go (2)

131-134: Consider the performance impact of per-write fsync.

Calling file.Sync() after every write ensures durability for multi-coordinator visibility, but incurs significant syscall overhead. For high-frequency status updates, this could become a bottleneck.

If write frequency is expected to be high, consider:

  • Batching writes before sync
  • Using a configurable sync interval
  • Documenting the expected write frequency and performance characteristics

If the trade-off is acceptable for your use case (correctness over throughput), this is fine as-is.
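
For reference, a write-then-sync sequence looks roughly like the sketch below. It is not the code in writer.go: appendJSONLine and demo are hypothetical helpers, and the one-record-per-line layout is an assumption made for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// appendJSONLine appends v as one JSON line and calls Sync before returning,
// so the data is flushed to stable storage and not left in the OS page cache.
func appendJSONLine(path string, v any) (err error) {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o600)
	if err != nil {
		return fmt.Errorf("open %s: %w", path, err)
	}
	defer func() {
		if cerr := f.Close(); cerr != nil && err == nil {
			err = fmt.Errorf("close %s: %w", path, cerr)
		}
	}()

	data, err := json.Marshal(v)
	if err != nil {
		return fmt.Errorf("marshal status: %w", err)
	}
	if _, err = f.Write(append(data, '\n')); err != nil {
		return fmt.Errorf("write status: %w", err)
	}
	// This is the per-write fsync whose cost the comment above discusses.
	if err = f.Sync(); err != nil {
		return fmt.Errorf("sync status: %w", err)
	}
	return nil
}

// demo writes two records to a temporary file and returns its contents.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "status-sync")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "status.jsonl")
	for _, s := range []string{"queued", "running"} {
		if err := appendJSONLine(path, map[string]string{"status": s}); err != nil {
			return "", err
		}
	}
	b, err := os.ReadFile(path)
	return string(b), err
}

func main() {
	out, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```

Batching, as suggested above, would amount to moving the Sync call out of the per-record path and invoking it once per group of writes or on a timer.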


102-137: Clean error handling refactor.

The simplified error handling with direct returns improves readability. Each error is properly wrapped with context, and the control flow is clear and idiomatic.

internal/service/worker/handler.go (1)

43-62: LGTM!

The temporary DAG file creation and cleanup logic is correct. The deferred cleanup with os.IsNotExist check properly handles edge cases. Logging only the temporary file path (instead of both original and temp) simplifies output and aligns with the shared-nothing architecture where the coordinator provides definitions.

internal/cmn/config/definition.go (1)

186-194: LGTM!

The new retry configuration fields are well-documented with clear default values. Using string for RetryInterval (to be parsed to time.Duration during config loading) follows the existing pattern in other definition structs like MonitoringDef.Retention and SchedulerDef.LockStaleThreshold.

internal/service/worker/remote_handler.go (1)

209-226: LGTM!

The comments clearly document the architectural rationale for not including DAGsDir in shared-nothing mode. The load options are correctly composed:

  • WithBaseConfig preserves base configuration inheritance
  • WithName ensures the original DAG name is used (not the temp file path)
  • WithParams passes through task parameters for parallel execution
internal/service/worker/worker_test.go (1)

94-99: LGTM!

Tests are correctly updated to include the Definition field, aligning with the new validation in Dispatch that requires task.Definition for distributed execution. The YAML definition is minimal but valid, sufficient for test purposes.

Also applies to: 133-138, 204-209, 259-265, 450-456, 480-487, 637-641, 699-703, 771-775

internal/service/coordinator/handler.go (4)

71-105: LGTM!

The refactor from functional options to HandlerConfig struct improves clarity and makes the constructor easier to use. The applyDefaults method correctly handles the optional StaleHeartbeatThreshold field.
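
The shape of a struct-based constructor with defaults can be sketched as below. The field names LogDir and StaleHeartbeatThreshold come from the walkthrough; the 30-second default, the Handler layout, and the omission of DAGRunStore are assumptions made to keep the example self-contained.

```go
package main

import (
	"fmt"
	"time"
)

// defaultStaleHeartbeatThreshold is an assumed value for illustration only.
const defaultStaleHeartbeatThreshold = 30 * time.Second

// HandlerConfig groups the handler's dependencies and tunables in one struct.
type HandlerConfig struct {
	LogDir                  string
	StaleHeartbeatThreshold time.Duration // zero means "use the default"
}

// applyDefaults fills in optional fields that the caller left unset.
func (c *HandlerConfig) applyDefaults() {
	if c.StaleHeartbeatThreshold <= 0 {
		c.StaleHeartbeatThreshold = defaultStaleHeartbeatThreshold
	}
}

// Handler is a simplified stand-in for the coordinator handler.
type Handler struct {
	cfg HandlerConfig
}

// NewHandler takes the config by value, so applying defaults does not mutate
// the caller's copy.
func NewHandler(cfg HandlerConfig) *Handler {
	cfg.applyDefaults()
	return &Handler{cfg: cfg}
}

func main() {
	h := NewHandler(HandlerConfig{LogDir: "/tmp/logs"})
	fmt.Println(h.cfg.LogDir, h.cfg.StaleHeartbeatThreshold)
}
```

Compared with variadic options, every call site names the fields it sets, and an empty HandlerConfig{} remains a valid argument in tests.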


186-218: LGTM!

The validation requiring task.Definition ensures shared-nothing workers always receive the full DAG definition. The attempt creation logic appropriately distinguishes root DAGs from sub-DAGs. The design choice to log warnings but not fail dispatch when attempt creation fails provides good resilience—tasks can still execute, and the warning alerts operators to storage issues.


264-270: LGTM!

The nil dagRunStore guards provide good defense-in-depth for test scenarios. The writeInitialStatus helper elegantly prevents "corrupted status file" errors by ensuring the file is never empty.

Minor note: StartedAt is set at dispatch time rather than actual execution start, but this is acceptable since workers will overwrite the status with accurate timestamps once execution begins.

Also applies to: 346-351, 393-406


325-327: LGTM!

Initial status writes are correctly placed after attempt.Open() succeeds but before caching, ensuring consistent behavior for both root DAG and sub-DAG paths.

Also applies to: 376-378

internal/service/coordinator/handler_test.go (1)

231-231: LGTM! Consistent migration to struct-based HandlerConfig.

The test file correctly updates all NewHandler calls to use the new HandlerConfig{} struct-based API. The changes are consistent across all test cases, and the mock implementations (mockDAGRunStore, mockDAGRunAttempt) are well-designed with proper mutex usage for thread-safety.

Also applies to: 248-248, 280-280, 305-312

internal/cmn/config/config.go (1)

378-386: LGTM! Well-documented retry configuration fields.

The new MaxRetries and RetryInterval fields are clearly documented with exponential backoff behavior. The zero-value semantics (disabled retries) are appropriate, and defaults are correctly applied at the usage sites in loader.go and context.go.

internal/cmd/context.go (2)

167-182: LGTM! Shared-nothing worker mode is well-implemented.

The early return path for shared-nothing workers correctly:

  1. Logs the mode with coordinator addresses for debugging
  2. Returns a minimal Context with nil stores
  3. Documents why stores are nil (status pushed to coordinator, DAG definitions from task payload)

The approach avoids unnecessary file I/O in distributed worker scenarios.
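
As a rough sketch of the early-return idea, assuming detection is based on the command name and a non-empty list of static coordinator addresses: the types below (config, store, appContext) and the exact condition are hypothetical and may differ from what context.go does.

```go
package main

import "fmt"

// config is a hypothetical subset of the settings the detector would inspect.
type config struct {
	Command      string   // CLI subcommand being run, e.g. "worker"
	Coordinators []string // statically configured coordinator addresses
}

// store is a placeholder for a file-backed store.
type store struct{ name string }

// appContext is a simplified stand-in for the command context.
type appContext struct {
	DAGRunStore *store
	ProcStore   *store
	QueueStore  *store
}

// isSharedNothingWorker reports whether this process is a worker that talks
// only to statically configured coordinators (assumed condition).
func isSharedNothingWorker(cfg config) bool {
	return cfg.Command == "worker" && len(cfg.Coordinators) > 0
}

// newContext skips local store initialization in shared-nothing mode.
func newContext(cfg config) *appContext {
	if isSharedNothingWorker(cfg) {
		// Status goes to the coordinator and DAG definitions arrive in the
		// task payload, so no local stores are created.
		return &appContext{}
	}
	return &appContext{
		DAGRunStore: &store{name: "dag-runs"},
		ProcStore:   &store{name: "proc"},
		QueueStore:  &store{name: "queue"},
	}
}

func main() {
	worker := newContext(config{Command: "worker", Coordinators: []string{"coord-1:50055"}})
	local := newContext(config{Command: "start"})
	fmt.Println("worker has stores:", worker.DAGRunStore != nil)
	fmt.Println("local has stores:", local.DAGRunStore != nil)
}
```

Code that receives such a context needs nil checks before touching the stores, which is the role of the nil dagRunStore guards mentioned in the handler comments.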


310-316: LGTM! Retry configuration applied correctly.

The conditional application of MaxRetries and RetryInterval only when positive values are configured preserves the defaults from coordinator.DefaultConfig() while allowing explicit overrides.
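
The override-only-when-positive rule, together with one common form of exponential backoff, can be sketched as follows. The default values, the doubling formula, and the names clientConfig, applyRetryOverrides, and backoffDelay are assumptions for illustration; the PR's coordinator.DefaultConfig() and its actual backoff schedule are not reproduced here.

```go
package main

import (
	"fmt"
	"time"
)

// clientConfig is a hypothetical stand-in for the coordinator client config.
type clientConfig struct {
	MaxRetries    int
	RetryInterval time.Duration
}

// defaultClientConfig returns assumed defaults for the example.
func defaultClientConfig() clientConfig {
	return clientConfig{MaxRetries: 3, RetryInterval: time.Second}
}

// applyRetryOverrides replaces a default only when a positive value is given,
// so unset (zero) fields in the peer config leave the defaults untouched.
func applyRetryOverrides(cfg clientConfig, maxRetries int, interval time.Duration) clientConfig {
	if maxRetries > 0 {
		cfg.MaxRetries = maxRetries
	}
	if interval > 0 {
		cfg.RetryInterval = interval
	}
	return cfg
}

// backoffDelay doubles the base interval for each attempt (0-based). The
// exponent is capped to keep the shift well within range.
func backoffDelay(base time.Duration, attempt int) time.Duration {
	if attempt < 0 {
		attempt = 0
	}
	if attempt > 16 {
		attempt = 16
	}
	return base * time.Duration(1<<attempt)
}

func main() {
	cfg := applyRetryOverrides(defaultClientConfig(), 5, 0)
	for attempt := 0; attempt < cfg.MaxRetries; attempt++ {
		fmt.Printf("attempt %d: wait %v\n", attempt, backoffDelay(cfg.RetryInterval, attempt))
	}
}
```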

internal/test/coordinator.go (1)

58-68: LGTM! Test helper correctly migrated to config-based API.

The test setup properly:

  1. Initializes an empty HandlerConfig
  2. Conditionally populates DAGRunStore and LogDir based on test options
  3. Passes the config to NewHandler

This aligns with the new API in internal/service/coordinator/handler.go.

internal/cmd/coord.go (1)

197-200: LGTM! Coordinator handler correctly initialized with config.

The HandlerConfig is properly populated with:

  • DAGRunStore: for status persistence in shared-nothing mode
  • LogDir: for remote log streaming

This aligns with the documented requirements in HandlerConfig for shared-nothing worker architecture.

internal/cmd/progress_remote.go (1)

48-60: LGTM! Terminal width detection with sensible defaults.

Good defensive programming:

  1. Default width of 80 characters
  2. Only attempts to get terminal size when TTY is detected
  3. Validates the returned width before using it
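
The three steps above amount to a small fallback function. In the sketch below the terminal query is injected as sizeFn so the example stays dependency-free; in a real program that role could be filled by something like term.GetSize from golang.org/x/term. resolveWidth and errNoTerminal are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// defaultWidth is the fallback of 80 columns mentioned in the review.
const defaultWidth = 80

// errNoTerminal is a hypothetical error for a failed size query.
var errNoTerminal = errors.New("not a terminal")

// resolveWidth returns the terminal width when it can be trusted and
// defaultWidth otherwise. sizeFn is only consulted when isTTY is true.
func resolveWidth(isTTY bool, sizeFn func() (int, error)) int {
	if !isTTY {
		return defaultWidth
	}
	width, err := sizeFn()
	if err != nil || width <= 0 {
		return defaultWidth
	}
	return width
}

func main() {
	wide := func() (int, error) { return 120, nil }
	broken := func() (int, error) { return 0, errNoTerminal }

	fmt.Println(resolveWidth(true, wide))   // TTY with a valid size
	fmt.Println(resolveWidth(true, broken)) // TTY but the query failed
	fmt.Println(resolveWidth(false, wide))  // output is piped or redirected
}
```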

@yottahmd yottahmd merged commit e623f5d into main Jan 14, 2026
6 checks passed
@yottahmd yottahmd deleted the fix-distributed-exec-err branch January 14, 2026 16:13

codecov bot commented Jan 14, 2026

Codecov Report

❌ Patch coverage is 43.96887% with 144 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.63%. Comparing base (732a891) to head (d6444e3).
⚠️ Report is 1 commit behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| internal/service/coordinator/handler.go | 29.31% | 39 Missing and 2 partials ⚠️ |
| internal/cmd/progress_remote.go | 0.00% | 32 Missing ⚠️ |
| internal/cmd/context.go | 36.00% | 13 Missing and 3 partials ⚠️ |
| internal/persis/filedagrun/writer.go | 16.66% | 5 Missing and 5 partials ⚠️ |
| internal/runtime/executor/dag_runner.go | 0.00% | 8 Missing ⚠️ |
| internal/runtime/transform/status.go | 57.14% | 5 Missing and 1 partial ⚠️ |
| internal/persis/filedag/store.go | 54.54% | 4 Missing and 1 partial ⚠️ |
| internal/core/exec/dagrun.go | 0.00% | 4 Missing ⚠️ |
| internal/runtime/agent/agent.go | 0.00% | 4 Missing ⚠️ |
| internal/service/worker/handler.go | 66.66% | 2 Missing and 2 partials ⚠️ |

... and 5 more

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1580      +/-   ##
==========================================
- Coverage   64.87%   64.63%   -0.24%     
==========================================
  Files         255      255              
  Lines       28430    28519      +89     
==========================================
- Hits        18444    18434      -10     
- Misses       8337     8429      +92     
- Partials     1649     1656       +7     

| Files with missing lines | Coverage Δ |
| --- | --- |
| internal/cmd/coord.go | 74.59% <100.00%> (ø) |
| internal/cmd/dry.go | 77.46% <100.00%> (+0.65%) ⬆️ |
| internal/cmd/migrate.go | 54.21% <100.00%> (ø) |
| internal/cmd/restart.go | 58.87% <100.00%> (+0.67%) ⬆️ |
| internal/cmd/retry.go | 72.52% <100.00%> (+0.93%) ⬆️ |
| internal/cmd/start.go | 37.33% <100.00%> (+0.61%) ⬆️ |
| internal/cmd/status.go | 71.42% <100.00%> (ø) |
| internal/cmn/config/config.go | 85.71% <ø> (ø) |
| internal/core/exec/runstatus.go | 0.00% <ø> (ø) |
| internal/service/worker/poller.go | 100.00% <ø> (ø) |

... and 16 more

... and 3 files with indirect coverage changes


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data