Conversation
📝 WalkthroughWalkthroughThis PR introduces a pluggable cache store abstraction for upgrade checking, replacing direct file-based caching with an injectable CacheStore interface. It adds a file-based implementation, implements retry policies with exponential backoff for network operations, enhances error handling during download and installation, and integrates the store throughout the upgrade flow. Changes
Sequence DiagramsequenceDiagram
actor User
participant Frontend as Frontend Server
participant UpgradeFlow as Upgrade Flow
participant CacheStore as CacheStore
participant GitHub as GitHub API
participant Download as Download Manager
User->>Frontend: Check for updates
Frontend->>CacheStore: Load()
CacheStore-->>Frontend: cached info or nil
alt Cache valid
Frontend-->>User: Return cached update info
else Cache expired or missing
Frontend->>UpgradeFlow: CheckAndUpdateCache(store, version)
UpgradeFlow->>CacheStore: Load()
CacheStore-->>UpgradeFlow: nil or stale cache
loop Retry with exponential backoff
UpgradeFlow->>GitHub: GetLatestRelease()
GitHub-->>UpgradeFlow: Release info or retriable error
end
UpgradeFlow->>CacheStore: Save(cache with LastCheck)
CacheStore-->>UpgradeFlow: ✓
UpgradeFlow-->>Frontend: Update available
Frontend-->>User: Display update available
end
opt User triggers upgrade
User->>UpgradeFlow: Download + Install
loop Retry download on server errors
UpgradeFlow->>Download: GET binary with retry policy
Download->>GitHub: HEAD for content-length
GitHub-->>Download: Response with size
Download-->>UpgradeFlow: Binary or retriable error
end
UpgradeFlow->>CacheStore: Save(cache with LastCheck)
CacheStore-->>UpgradeFlow: ✓
UpgradeFlow-->>User: Upgrade complete
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@internal/upgrade/upgrade_test.go`:
- Around line 1621-1626: The HTTP handler passed to httptest.NewServer declares
an unused parameter named `r` which the linter flags; update the handler
signature to use the blank identifier (e.g., change `func(w http.ResponseWriter,
r *http.Request)` to use `_ *http.Request`) or explicitly reference it (e.g., `_
= r`) so the `server` handler in the test no longer has an unused `r` parameter.
- Around line 1593-1602: The HTTP test server handler passed to
httptest.NewServer uses an unused parameter named r which the revive linter
flags; update the anonymous handler func signature in the server variable (the
http.HandlerFunc passed to httptest.NewServer) to replace the unused parameter
name r with the blank identifier _ so the function becomes func(w
http.ResponseWriter, _ *http.Request) and the rest of the logic (incrementing
attempts, returning StatusServiceUnavailable, and encoding release) remains
unchanged.
In `@internal/upgrade/upgrade.go`:
- Around line 341-351: When VerifyBinary(execPath, info.Release.TagName) fails
and you attempt to restore using copyFile(restoreSrc, execPath), capture the
copyFile error into a named variable (e.g., restoreErr) and include its details
when returning the final error instead of only wrapping the original
verification error; update the return paths around the VerifyBinary failure to
return fmt.Errorf("upgrade verification failed (restored backup): %v: %w",
restoreErr, err) or similar so both restoreErr and err are visible, referencing
VerifyBinary, copyFile, execPath, result.BackupPath, and internalBackupPath to
locate the logic.
🧹 Nitpick comments (10)
internal/upgrade/install.go (1)
215-253: Windows replacement logic is now significantly more robust.The two-rename approach correctly minimizes the vulnerable window compared to a full copy. Rollback on line 246 is appropriately best-effort.
One observation: Lines 216–232 are nearly identical to
replaceUnixBinarylines 184–201 (create temp → copy → chmod). Consider extracting a small helper likeprepareTempBinary(src, dir string, perm os.FileMode) (string, error)to eliminate the duplication.,
♻️ Optional: extract shared prep logic
// prepareTempBinary copies src into a new temp file in dir and applies perm. // On success it returns the temp file path; on failure it cleans up. func prepareTempBinary(src, dir string, perm os.FileMode) (string, error) { tempFile, err := os.CreateTemp(dir, "dagu-new-*") if err != nil { return "", fmt.Errorf("failed to create temp file: %w", err) } tempPath := tempFile.Name() _ = tempFile.Close() if err := copyFile(src, tempPath); err != nil { _ = os.Remove(tempPath) return "", err } if err := os.Chmod(tempPath, perm); err != nil { _ = os.Remove(tempPath) return "", fmt.Errorf("failed to set permissions: %w", err) } return tempPath, nil }Then both
replaceUnixBinaryandreplaceWindowsBinarywould call:tempPath, err := prepareTempBinary(src, filepath.Dir(target), perm) if err != nil { return err }internal/persis/fileupgradecheck/store_test.go (2)
1-10: Consider usingstretchr/testify/requirefor assertions.The test file uses manual
if err != nil { t.Fatalf(...) }patterns throughout. This makes tests verbose and less readable compared torequire.NoError(t, err)/require.Equal(t, ...).Example refactor for TestSaveAndLoad
import ( - "os" - "path/filepath" "testing" "time" "github.com/dagu-org/dagu/internal/upgrade" + "github.com/stretchr/testify/require" )func TestSaveAndLoad(t *testing.T) { tmpDir := t.TempDir() store, err := New(tmpDir) - if err != nil { - t.Fatalf("New() error: %v", err) - } + require.NoError(t, err) cache := &upgrade.UpgradeCheckCache{...} - if err := store.Save(cache); err != nil { - t.Fatalf("Save() error: %v", err) - } + require.NoError(t, store.Save(cache)) loaded, err := store.Load() - if err != nil { - t.Fatalf("Load() error: %v", err) - } - if loaded == nil { - t.Fatal("Load() returned nil after save") - } - if loaded.LatestVersion != cache.LatestVersion { - t.Errorf(...) - } + require.NoError(t, err) + require.NotNil(t, loaded) + require.Equal(t, cache.LatestVersion, loaded.LatestVersion) + require.Equal(t, cache.CurrentVersion, loaded.CurrentVersion) + require.Equal(t, cache.UpdateAvailable, loaded.UpdateAvailable) + require.True(t, loaded.LastCheck.Equal(cache.LastCheck)) }As per coding guidelines,
**/*_test.go: "Usestretchr/testify/requirefor assertions and shared fixtures frominternal/testinstead of duplicating mocks".
116-138:TestSaveAtomicWriteonly verifies file existence, not atomicity.The test name suggests it validates atomic write semantics, but it only checks
os.Staton the final path. Consider either renaming toTestSaveCreatesFileor enhancing it to verify atomicity (e.g., confirm no partial writes by checking content validity, or verifying that a concurrent reader never sees a truncated file).internal/upgrade/retry.go (2)
50-60: 5xx retry range is limited to 500–504; status codes ≥ 505 won't be retried.Codes like 502/503/504 are the common transient ones, so this is likely intentional. Just flagging that a broader 5xx gateway error (e.g., 507, 520–529 from CDNs) would be treated as non-retriable since it doesn't match the
500 <= code <= 504check.
62-75: The doc comment says "other → non-retriable httpError" butclassifyResponsealways returns a plain*httpError.The non-retriability for 4xx is an emergent property of
isRetriableError, not of the error type returned here. The comment is slightly misleading — consider rewording it to clarify that retry eligibility is determined byisRetriableError, not by this function alone.internal/upgrade/download.go (3)
63-123: Double-close oftempFile: explicit close on Line 106 followed by deferred close on Line 71.On the success path,
tempFile.Close()is called explicitly at Line 106, then the defer at Line 71 calls it again. The second close returns an error that's silently discarded, so this is functionally safe — but it's a code smell. Consider settingtempFiletonilafter the explicit close or restructuring to avoid the double close.Suggested cleanup
defer func() { - _ = tempFile.Close() - if _, statErr := os.Stat(tempPath); statErr == nil { + if tempFile != nil { + _ = tempFile.Close() + } + if _, statErr := os.Stat(tempPath); statErr == nil { _ = os.Remove(tempPath) } }()And after explicit close:
if err := tempFile.Close(); err != nil { return &nonRetriableError{err: fmt.Errorf("failed to close temp file: %w", err)} } + tempFile = nil
50-50:SetTimeout(0)disables all HTTP timeouts — a single attempt can hang indefinitely.While the comment indicates this is intentional for large downloads, consider setting a generous connection/TLS timeout (e.g., 30s) separate from the overall transfer timeout. With
SetTimeout(0)and no context deadline, a stalled TCP connection during the TLS handshake or DNS resolution could block the retry loop forever.Resty supports
SetTransportto configureDialContexttimeouts independently of the read/write deadline, which would allow large transfers while still bounding the initial connection phase.
88-95: Non-200 success codes (e.g., 206 Partial Content) are treated as errors.
code != 200rejects all non-200 responses. For a fresh full download this is fine, but if future logic adds range requests, 206 would be incorrectly treated as a failure. Low risk given the current usage, just noting it.internal/upgrade/cache.go (2)
38-38:store.Load()error is silently discarded — consider logging it.If Load fails due to a persistent issue (e.g., filesystem permissions), every call will bypass the cache and hit GitHub, with no diagnostic breadcrumb. A debug-level log would help troubleshoot without changing the graceful-degradation behavior.
68-68:store.Save()error is silently discarded — same concern as Load.A failed Save means the next startup will re-fetch from GitHub. Logging the error at debug/warn level would surface persistent storage problems without changing control flow.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #1646 +/- ##
==========================================
+ Coverage 69.86% 69.91% +0.05%
==========================================
Files 333 335 +2
Lines 37405 37440 +35
==========================================
+ Hits 26133 26178 +45
+ Misses 9198 9184 -14
- Partials 2074 2078 +4
... and 7 files with indirect coverage changes Continue to review full report in Codecov by Sentry.
🚀 New features to boost your workflow:
|
Summary by CodeRabbit