feat(wal): add crash-safe write-ahead log package #186

Open: derekbit wants to merge 2 commits into longhorn:main from derekbit:issue-13259

@derekbit (Member) commented Jun 4, 2026

Which issue(s) this PR fixes:

Issue longhorn/longhorn#13259

What this PR does / why we need it:

Append-only WAL with binary framing (magic+ver+type+len+CRC32C), fdatasync-per-record durability, gofrs/flock cross-process exclusion, torn-tail truncation on Open, RecTxnPrepare gate for atomic intent-set durability, and OpenWithQuarantine for unreadable-file fallback.

Designed to support snapshot-chain transactional recovery for the longhorn-engine replica, but the package is generic.
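
The framing named above (magic+ver+type+len+CRC32C) can be illustrated with a minimal Go sketch. The description lists the fields but not their widths, byte order, or what the checksum covers, so the layout below (4-byte magic, 1-byte version, 1-byte type, 4-byte little-endian length, CRC32C over header and payload) and the names `encodeFrame`, `decodeFrame`, and `frameMagic` are assumptions for illustration, not the package's actual on-disk format.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"hash/crc32"
)

// Hypothetical layout: the PR names the fields but not their sizes or order.
const (
	frameMagic   uint32 = 0x57414C31 // illustrative value, spells "WAL1"
	frameVersion uint8  = 1
	headerSize          = 4 + 1 + 1 + 4 // magic + version + type + payload length
	trailerSize         = 4             // CRC32C
)

// CRC32C is CRC-32 with the Castagnoli polynomial.
var castagnoli = crc32.MakeTable(crc32.Castagnoli)

var errBadFrame = errors.New("bad frame")

// encodeFrame wraps payload in header and trailer. The checksum covers
// header and payload (an assumption of this sketch).
func encodeFrame(recType uint8, payload []byte) []byte {
	end := headerSize + len(payload)
	buf := make([]byte, end+trailerSize)
	binary.LittleEndian.PutUint32(buf[0:4], frameMagic)
	buf[4] = frameVersion
	buf[5] = recType
	binary.LittleEndian.PutUint32(buf[6:10], uint32(len(payload)))
	copy(buf[headerSize:], payload)
	binary.LittleEndian.PutUint32(buf[end:], crc32.Checksum(buf[:end], castagnoli))
	return buf
}

// decodeFrame validates one frame at the start of buf and returns the record
// type, the payload, and the number of bytes consumed.
func decodeFrame(buf []byte) (recType uint8, payload []byte, n int, err error) {
	if len(buf) < headerSize+trailerSize {
		return 0, nil, 0, errBadFrame
	}
	if binary.LittleEndian.Uint32(buf[0:4]) != frameMagic || buf[4] != frameVersion {
		return 0, nil, 0, errBadFrame
	}
	plen := binary.LittleEndian.Uint32(buf[6:10])
	if uint64(plen) > uint64(len(buf)-headerSize-trailerSize) {
		return 0, nil, 0, errBadFrame // declared length runs past the buffer
	}
	end := headerSize + int(plen)
	want := binary.LittleEndian.Uint32(buf[end : end+trailerSize])
	if crc32.Checksum(buf[:end], castagnoli) != want {
		return 0, nil, 0, errBadFrame
	}
	return buf[5], buf[headerSize:end], end + trailerSize, nil
}

func main() {
	frame := encodeFrame(2, []byte("intent: create snapshot"))
	recType, payload, n, err := decodeFrame(frame)
	fmt.Println(recType, string(payload), n, err)

	frame[headerSize] ^= 0xFF // flip one payload byte
	_, _, _, err = decodeFrame(frame)
	fmt.Println("after corruption:", err)
}
```

Including the header in the checksum means a corrupted length or type field is rejected the same way as a corrupted payload.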

Special notes for your reviewer:

Additional documentation or context

Copilot AI left a comment

Pull request overview

This PR introduces a new wal package providing a crash-safe, append-only write-ahead log intended to support snapshot-chain transactional recovery workflows (e.g., longhorn-engine replica directory operations), with per-record sync durability, torn-tail detection/truncation, and cross-process exclusion via file locking.
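
Torn-tail detection and truncation, as summarized above, can be sketched as a scan that stops at the first record that fails validation and cuts the file back to the last intact boundary. This is a simplified illustration, not the package's code: the frame here is only length + payload + CRC32C (no magic, version, or type), and `appendRecord`, `validPrefix`, and `truncateTornTail` are hypothetical names.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/crc32"
	"os"
	"path/filepath"
)

var crcTable = crc32.MakeTable(crc32.Castagnoli)

// appendRecord frames payload as [len u32][payload][crc32c u32], with the
// checksum covering the length field and the payload (simplified layout).
func appendRecord(log, payload []byte) []byte {
	start := len(log)
	var hdr [4]byte
	binary.LittleEndian.PutUint32(hdr[:], uint32(len(payload)))
	log = append(log, hdr[:]...)
	log = append(log, payload...)
	var sum [4]byte
	binary.LittleEndian.PutUint32(sum[:], crc32.Checksum(log[start:], crcTable))
	return append(log, sum[:]...)
}

// validPrefix returns the byte length of the longest run of intact records
// starting at offset 0. Anything after it is treated as a torn tail.
func validPrefix(log []byte) int {
	off := 0
	for {
		rest := log[off:]
		if len(rest) < 8 {
			return off // not enough bytes for a length field plus checksum
		}
		plen := binary.LittleEndian.Uint32(rest[:4])
		if uint64(plen) > uint64(len(rest)-8) {
			return off // declared payload runs past end of file
		}
		end := 4 + int(plen)
		if crc32.Checksum(rest[:end], crcTable) != binary.LittleEndian.Uint32(rest[end:end+4]) {
			return off // checksum mismatch
		}
		off += end + 4
	}
}

// truncateTornTail cuts the file at path back to its last intact record.
func truncateTornTail(path string) (int64, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return 0, err
	}
	keep := int64(validPrefix(data))
	if keep == int64(len(data)) {
		return keep, nil // nothing to cut
	}
	return keep, os.Truncate(path, keep)
}

func main() {
	dir, err := os.MkdirTemp("", "wal-sketch")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "journal.wal")

	log := appendRecord(nil, []byte("first"))
	log = appendRecord(log, []byte("second"))
	intact := len(log)
	log = appendRecord(log, []byte("third"))
	log = log[:len(log)-3] // simulate a crash partway through the last write
	if err := os.WriteFile(path, log, 0o600); err != nil {
		panic(err)
	}

	kept, err := truncateTornTail(path)
	if err != nil {
		panic(err)
	}
	fmt.Printf("kept %d bytes, intact prefix was %d bytes\n", kept, intact)
}
```

A production implementation would also sync the file after truncating; the sketch omits that to stay short.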

Changes:

  • Added wal journal implementation with binary framing + CRC32C validation, scanning utilities, and quarantine-on-open fallback.
  • Added recovery analysis (Analyze) to derive pending transactions/intents/step status from scanned records.
  • Updated dependencies (notably gofrs/flock) and vendored modules; bumped testify and updated x/exp.
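
The recovery analysis in the second bullet can be pictured with a small sketch. Only `RecTxnPrepare` and `Analyze` are named in this PR; the other record kinds, the struct fields, and the rule used here (a transaction is pending once its prepare record is present and no commit follows, and intents without a prepare are ignored) are assumptions about how such a gate might behave, not the package's actual semantics.

```go
package main

import (
	"fmt"
	"sort"
)

type recKind int

const (
	recIntent     recKind = iota // hypothetical: declares one planned step
	recTxnPrepare                // named in the PR; semantics assumed here
	recStepDone                  // hypothetical: marks one step finished
	recTxnCommit                 // hypothetical: transaction fully applied
)

// record is a decoded journal entry (fields are illustrative).
type record struct {
	Txn  uint64
	Kind recKind
	Step int
}

// pendingTxn describes a prepared transaction that never committed.
type pendingTxn struct {
	Txn     uint64
	Intents []int        // steps declared before the prepare record
	Done    map[int]bool // steps with a completion record
}

// analyze replays records in order and returns prepared-but-uncommitted
// transactions, sorted by transaction ID.
func analyze(recs []record) []pendingTxn {
	type state struct {
		intents             []int
		done                map[int]bool
		prepared, committed bool
	}
	txns := map[uint64]*state{}
	for _, r := range recs {
		s, ok := txns[r.Txn]
		if !ok {
			s = &state{done: map[int]bool{}}
			txns[r.Txn] = s
		}
		switch r.Kind {
		case recIntent:
			if !s.prepared { // the intent set is frozen at prepare
				s.intents = append(s.intents, r.Step)
			}
		case recTxnPrepare:
			s.prepared = true
		case recStepDone:
			s.done[r.Step] = true
		case recTxnCommit:
			s.committed = true
		}
	}
	var out []pendingTxn
	for id, s := range txns {
		if s.prepared && !s.committed {
			out = append(out, pendingTxn{Txn: id, Intents: s.intents, Done: s.done})
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Txn < out[j].Txn })
	return out
}

func main() {
	recs := []record{
		{Txn: 1, Kind: recIntent, Step: 0},
		{Txn: 1, Kind: recIntent, Step: 1},
		{Txn: 1, Kind: recTxnPrepare},
		{Txn: 1, Kind: recStepDone, Step: 0},
		{Txn: 2, Kind: recIntent, Step: 0}, // never prepared: ignored
	}
	for _, p := range analyze(recs) {
		fmt.Printf("txn %d: intents=%v done=%v\n", p.Txn, p.Intents, p.Done)
	}
}
```

The PR's tests mention error cases for the analysis; this sketch does no validation and silently accepts out-of-order records.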

Reviewed changes

Copilot reviewed 7 out of 31 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| wal/types.go | Defines WAL operation/action enums and stable on-disk record/payload types. |
| wal/testing.go | Adds a testing-only helper to simulate abrupt journal shutdown without checkpointing. |
| wal/recovery.go | Implements journal record analysis to identify pending/incomplete transactions for recovery. |
| wal/recovery_test.go | Adds unit tests for recovery analysis behavior and error cases. |
| wal/journal.go | Implements the WAL journal (framing, CRC, torn-tail truncation, locking, scan, checkpoint, quarantine open). |
| wal/journal_test.go | Adds end-to-end tests for journal behavior (round-trip, torn tail, CRC truncation, flock exclusion, ScanFile, quarantine). |
| go.mod | Adds github.com/gofrs/flock and bumps testify / updates x/exp. |
| go.sum | Updates checksums for new/updated dependencies. |
| vendor/modules.txt | Records vendored module versions (adds gofrs/flock, updates testify, updates x/exp). |
| vendor/github.com/stretchr/testify/assert/yaml/yaml_fail.go | Vendored testify update (build tag/header changes). |
| vendor/github.com/stretchr/testify/assert/yaml/yaml_default.go | Vendored testify update (build tag/header changes). |
| vendor/github.com/stretchr/testify/assert/yaml/yaml_custom.go | Vendored testify update (build tag/header changes). |
| vendor/github.com/stretchr/testify/assert/http_assertions.go | Vendored testify update (error message formatting improvements). |
| vendor/github.com/stretchr/testify/assert/doc.go | Vendored testify update (documentation note). |
| vendor/github.com/stretchr/testify/assert/assertions.go | Vendored testify update (caller info, Empty semantics, error-chain formatting, etc.). |
| vendor/github.com/stretchr/testify/assert/assertion_order.go | Vendored testify update (type formatting in failure messages). |
| vendor/github.com/stretchr/testify/assert/assertion_forward.go | Vendored testify update (forwarders/docs for new semantics/APIs). |
| vendor/github.com/stretchr/testify/assert/assertion_format.go | Vendored testify update (formatted assertion helpers/docs). |
| vendor/github.com/stretchr/testify/assert/assertion_compare.go | Vendored testify update (preformatted failure messages; type formatting). |
| vendor/github.com/gofrs/flock/SECURITY.md | Adds vendored gofrs/flock security policy documentation. |
| vendor/github.com/gofrs/flock/README.md | Adds vendored gofrs/flock README. |
| vendor/github.com/gofrs/flock/Makefile | Adds vendored gofrs/flock build/test targets. |
| vendor/github.com/gofrs/flock/LICENSE | Adds vendored gofrs/flock license. |
| vendor/github.com/gofrs/flock/flock.go | Adds vendored gofrs/flock core implementation. |
| vendor/github.com/gofrs/flock/flock_windows.go | Adds vendored gofrs/flock Windows locking implementation. |
| vendor/github.com/gofrs/flock/flock_unix.go | Adds vendored gofrs/flock Unix flock-based implementation. |
| vendor/github.com/gofrs/flock/flock_unix_fcntl.go | Adds vendored gofrs/flock fcntl-based implementation for specific Unix platforms. |
| vendor/github.com/gofrs/flock/flock_others.go | Adds vendored gofrs/flock stub implementation for unsupported platforms. |
| vendor/github.com/gofrs/flock/build.sh | Adds vendored gofrs/flock cross-platform build script. |
| vendor/github.com/gofrs/flock/.golangci.yml | Adds vendored gofrs/flock linter configuration. |
| vendor/github.com/gofrs/flock/.gitignore | Adds vendored gofrs/flock gitignore. |

Comment thread wal/journal.go Outdated
Comment thread wal/journal.go Outdated
Comment thread wal/journal.go
Comment thread wal/recovery.go
@derekbit force-pushed the issue-13259 branch 2 times, most recently from 96a4077 to 56d3dc5, on June 6, 2026 02:25
codecov (Bot) commented Jun 6, 2026

Codecov Report

❌ Patch coverage is 69.09091% with 136 lines in your changes missing coverage. Please review.
✅ Project coverage is 73.30%. Comparing base (02405f5) to head (9a66791).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| wal/journal.go | 66.86% | 68 Missing and 44 partials ⚠️ |
| wal/recovery.go | 74.28% | 12 Missing and 6 partials ⚠️ |
| wal/testing.go | 57.14% | 3 Missing and 3 partials ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #186      +/-   ##
==========================================
- Coverage   74.51%   73.30%   -1.22%     
==========================================
  Files          42       46       +4     
  Lines        2017     2457     +440     
==========================================
+ Hits         1503     1801     +298     
- Misses        369      452      +83     
- Partials      145      204      +59     

| Flag | Coverage Δ |
| --- | --- |
| unittests | 73.30% <69.09%> (-1.22%) ⬇️ |

@derekbit force-pushed the issue-13259 branch 2 times, most recently from 1b0b341 to 93d4cf4, on June 6, 2026 08:51
Append-only WAL with binary framing (magic+ver+type+len+CRC32C),
fdatasync-per-record durability, gofrs/flock cross-process exclusion,
torn-tail truncation on Open, RecTxnPrepare gate for atomic intent-set
durability, and OpenWithQuarantine for unreadable-file fallback.

Designed to support snapshot-chain transactional recovery for the
longhorn-engine replica, but the package is generic.

Longhorn 13259

Signed-off-by: Derek Su <derek.su@suse.com>
Signed-off-by: Derek Su <derek.su@suse.com>