This document defines the required semantics for the write-facing API.
It is authoritative for any Dataset implementation.
Goals:
- Deterministic snapshot creation.
- Explicit metadata on every write.
- Linear snapshot history.
- Safe, immutable commits.
Non-goals:
- Query execution or filtering.
- Background compaction.
- Distributed coordination, consensus, or lock management.
- Write(ctx, data, metadata) MUST create a new snapshot on success.
- nil metadata MUST be coalesced to empty (Metadata{}).
- Empty metadata is valid and MUST be persisted explicitly.
- The new snapshot MUST reference the previous snapshot as its parent (if any).
- Writes MUST NOT mutate existing snapshots or manifests.
- The manifest MUST include all required fields defined in CONTRACT_CORE.md (including row/event count and min/max timestamp when applicable).
- When the codec implements StatisticalCodec, per-file statistics MUST be collected after encoding and recorded on the FileRef.
- When no codec is configured, each write represents a single data unit and the row/event count MUST be 1.
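The Write rules above can be illustrated with a toy in-memory dataset. This is a minimal sketch, not the actual implementation: the Metadata, Snapshot, and memDataset types, their fields, and the integer snapshot IDs are assumptions made for the example, and the context argument is omitted for brevity.

```go
package main

import "fmt"

// Metadata is a hypothetical stand-in for the dataset metadata type.
type Metadata map[string]string

// Snapshot is a hypothetical, simplified snapshot record.
type Snapshot struct {
	ID       int
	ParentID int // 0 means "no parent" in this sketch
	Metadata Metadata
	RowCount int
}

// memDataset keeps an append-only, linear snapshot history in memory.
type memDataset struct {
	history []Snapshot
}

// Write appends a new snapshot; existing snapshots are never modified.
func (d *memDataset) Write(data []byte, md Metadata) Snapshot {
	if md == nil {
		md = Metadata{} // coalesce nil to empty, then store it explicitly
	}
	parent := 0
	if n := len(d.history); n > 0 {
		parent = d.history[n-1].ID // reference the previous snapshot
	}
	// No codec in this sketch, so the write is one data unit: count is 1.
	s := Snapshot{ID: len(d.history) + 1, ParentID: parent, Metadata: md, RowCount: 1}
	d.history = append(d.history, s)
	return s
}

func main() {
	d := &memDataset{}
	first := d.Write([]byte("a"), nil)
	second := d.Write([]byte("b"), Metadata{"source": "demo"})
	fmt.Println(first.Metadata != nil, first.ParentID, second.ParentID) // true 0 1
}
```

The first snapshot has no parent, and every later snapshot points at its predecessor, which is what keeps the history linear.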
- StreamWrite(ctx, metadata) MUST coalesce nil metadata to empty (Metadata{}).
- StreamWrite MUST return a StreamWriter for a single binary data unit.
- StreamWriter.Commit(ctx) MUST write the manifest and return the new snapshot.
- A snapshot MUST NOT be visible before Commit writes the manifest.
- StreamWriter.Abort(ctx) MUST ensure no manifest is written.
- StreamWriter.Close() without Commit MUST behave as Abort.
- Streamed writes MUST be single-pass writes to the final object path.
- Streamed writes MUST NOT mutate existing objects; all writes are new paths.
- Streamed writes MUST set row/event count to 1.
- When a checksum component is configured, the checksum MUST be computed during streaming and recorded in the manifest for each file written.
- When a codec is configured, StreamWrite MUST return an error.
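The Commit, Abort, and Close rules above can be sketched as a small state machine. This is an illustrative sketch under stated assumptions: memStore, streamWriter, and errFinished are hypothetical names, the context argument is omitted, and the writer buffers in memory for simplicity, whereas a real streamed write goes to the final object path in a single pass.

```go
package main

import (
	"errors"
	"fmt"
)

// errFinished is a hypothetical error for reuse of a finished writer.
var errFinished = errors.New("stream writer already finished")

// memStore is a toy store: data objects and manifests live in separate maps.
type memStore struct {
	objects   map[string][]byte
	manifests map[string]int // manifest path -> row/event count
}

// streamWriter is a hypothetical writer for a single binary data unit.
type streamWriter struct {
	store *memStore
	path  string
	buf   []byte
	done  bool
}

func (w *streamWriter) Write(p []byte) (int, error) {
	if w.done {
		return 0, errFinished
	}
	w.buf = append(w.buf, p...)
	return len(p), nil
}

// Commit persists the data object, then the manifest. The manifest write
// is the step that makes the snapshot visible.
func (w *streamWriter) Commit() error {
	if w.done {
		return errFinished
	}
	w.done = true
	w.store.objects[w.path] = w.buf
	w.store.manifests[w.path+".manifest"] = 1 // streamed writes record count 1
	return nil
}

// Abort guarantees that no manifest is written.
func (w *streamWriter) Abort() error {
	if w.done {
		return nil
	}
	w.done = true
	delete(w.store.objects, w.path) // best-effort cleanup of partial data
	return nil
}

// Close without a prior Commit behaves as Abort; after Commit it is a no-op.
func (w *streamWriter) Close() error { return w.Abort() }

func main() {
	s := &memStore{objects: map[string][]byte{}, manifests: map[string]int{}}
	w := &streamWriter{store: s, path: "snap-1/data"}
	w.Write([]byte("payload"))
	w.Close() // no Commit, so this aborts
	fmt.Println(len(s.manifests)) // 0
}
```

Making Close safe to call after Commit lets callers use a deferred Close as a cleanup guard without risking an accidental abort of a committed write.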
- StreamWriteRecords(ctx, records, metadata) MUST coalesce nil metadata to empty (Metadata{}).
- StreamWriteRecords MUST return an error if records iterator is nil.
- StreamWriteRecords MUST consume records via a pull-based iterator.
- StreamWriteRecords MUST return an error if the configured codec does not support streaming record encoding.
- StreamWriteRecords MUST return an error if partitioning is configured (non-noop partitioner), since single-pass streaming cannot partition without buffering.
- Streamed record writes MUST be single-pass writes to the final object path.
- Streamed record writes MUST NOT mutate existing objects; all writes are new paths.
- On success, the manifest is written and the new snapshot is returned.
- On error (iterator failure, codec error, storage error), no manifest is written and best-effort cleanup of partial objects is attempted.
- Row/event count MUST equal the total number of records consumed.
- When a checksum component is configured, the checksum MUST be computed during streaming and recorded in the manifest for each file written.
- When the stream encoder implements StatisticalStreamEncoder, per-file statistics MUST be collected after stream finalization and recorded on the FileRef.
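The pull-based consumption and row-count rules above can be sketched as follows. The RecordIterator interface shape (Next, Record, Err), the sliceIterator helper, and the consume function are assumptions made for this example; only the ErrNilIterator name comes from this contract, and its message text is invented.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNilIterator reuses the sentinel name from this contract; the message
// text is an assumption.
var ErrNilIterator = errors.New("nil record iterator")

// RecordIterator is a hypothetical pull-based iterator shape.
type RecordIterator interface {
	Next() bool
	Record() any
	Err() error
}

// sliceIterator yields records from a slice; err is reported by Err.
type sliceIterator struct {
	records []any
	idx     int
	err     error
}

func (it *sliceIterator) Next() bool {
	if it.idx >= len(it.records) {
		return false
	}
	it.idx++
	return true
}

func (it *sliceIterator) Record() any { return it.records[it.idx-1] }
func (it *sliceIterator) Err() error  { return it.err }

// consume pulls each record exactly once and returns the number consumed,
// which becomes the row/event count. On any error the caller is expected
// to write no manifest and attempt cleanup. Note that the nil check only
// catches a nil interface value, not a typed nil pointer.
func consume(it RecordIterator, encode func(any) error) (int, error) {
	if it == nil {
		return 0, ErrNilIterator
	}
	rows := 0
	for it.Next() {
		if err := encode(it.Record()); err != nil {
			return rows, fmt.Errorf("encode record %d: %w", rows, err)
		}
		rows++
	}
	if err := it.Err(); err != nil {
		return rows, fmt.Errorf("iterate records: %w", err)
	}
	return rows, nil
}

func main() {
	it := &sliceIterator{records: []any{"a", "b", "c"}}
	rows, err := consume(it, func(any) error { return nil })
	fmt.Println(rows, err) // 3 <nil>
}
```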
- When records implement the Timestamped interface, Write MUST compute MinTimestamp and MaxTimestamp from the data.
- Timestamps are extracted by calling Timestamp() on each record that implements the interface.
- Records that do not implement Timestamped are ignored for timestamp computation.
- When no records implement Timestamped, both timestamp fields MUST be nil.
- Timestamp computation is explicit (via interface implementation), not inferred.
- The same timestamp rules apply to StreamWriteRecords as records are consumed.
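The timestamp rules above amount to a single pass with a type assertion per record. The sketch below assumes that Timestamp() returns a time.Time and that the manifest fields are pointers; the event type and the at and timestampRange helpers are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// Timestamped mirrors the interface named in this contract; the exact
// method signature is an assumption.
type Timestamped interface {
	Timestamp() time.Time
}

// event is a hypothetical record type that opts in to timestamps.
type event struct{ ts time.Time }

func (e event) Timestamp() time.Time { return e.ts }

// at builds an event from Unix seconds (helper for the example).
func at(sec int64) event { return event{ts: time.Unix(sec, 0).UTC()} }

// timestampRange returns (nil, nil) when no record implements Timestamped.
func timestampRange(records []any) (lo, hi *time.Time) {
	for _, r := range records {
		ts, ok := r.(Timestamped)
		if !ok {
			continue // records without timestamps are ignored, not guessed
		}
		t := ts.Timestamp()
		if lo == nil || t.Before(*lo) {
			c := t
			lo = &c
		}
		if hi == nil || t.After(*hi) {
			c := t
			hi = &c
		}
	}
	return lo, hi
}

func main() {
	lo, hi := timestampRange([]any{at(7200), "untimed", at(0)})
	fmt.Println(lo.Format(time.RFC3339), hi.Format(time.RFC3339))
	// 1970-01-01T00:00:00Z 1970-01-01T02:00:00Z
}
```

Because the function folds one record at a time, the same logic can run incrementally while a streaming write consumes its iterator.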
- The first successful write creates the initial snapshot.
- Latest() on an empty dataset MUST return ErrNoSnapshots.
- Snapshots() on an empty dataset MUST return an empty list without error.
- Snapshot(id) on an empty dataset MUST return ErrNotFound.
- A snapshot is visible only after its manifest is persisted.
- Manifest presence is the commit signal.
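The empty-dataset rules above can be sketched with a toy reader. The dataset type, its snapshots field, and the string snapshot IDs are assumptions for illustration; the sentinel names follow this contract, but their message text is invented.

```go
package main

import (
	"errors"
	"fmt"
)

// Sentinel names follow this contract; message text is an assumption.
var (
	ErrNoSnapshots = errors.New("no snapshots")
	ErrNotFound    = errors.New("snapshot not found")
)

// Snapshot is a hypothetical, minimal snapshot record.
type Snapshot struct{ ID string }

// dataset holds only committed snapshots, oldest first.
type dataset struct{ snapshots []Snapshot }

// Latest fails with ErrNoSnapshots until the first successful write.
func (d *dataset) Latest() (Snapshot, error) {
	if len(d.snapshots) == 0 {
		return Snapshot{}, ErrNoSnapshots
	}
	return d.snapshots[len(d.snapshots)-1], nil
}

// Snapshots returns a copy; an empty dataset yields an empty list, no error.
func (d *dataset) Snapshots() ([]Snapshot, error) {
	return append([]Snapshot{}, d.snapshots...), nil
}

// Snapshot looks up by ID and fails with ErrNotFound when absent.
func (d *dataset) Snapshot(id string) (Snapshot, error) {
	for _, s := range d.snapshots {
		if s.ID == id {
			return s, nil
		}
	}
	return Snapshot{}, ErrNotFound
}

func main() {
	d := &dataset{}
	_, errLatest := d.Latest()
	list, errList := d.Snapshots()
	_, errGet := d.Snapshot("snap-1")
	fmt.Println(errLatest, len(list), errList, errGet)
}
```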
Snapshot history for a Dataset is linear.
| Pattern | Writers | Snapshots | History | Status |
|---|---|---|---|---|
| Single writer | 1 process | 1 per Write | Linear (guaranteed) | ✅ Supported |
| Multi-process, serialized | N processes, external coordination | N (one per writer) | Linear (caller-enforced) | ✅ Supported (caller owns coordination) |
| Multi-process, uncoordinated | N processes, no coordination | N (one per writer) | Linear (CAS-enforced) | ✅ Safe (CAS) |
| Parallel staging, single commit | 1 process, N goroutines | 1 (all shards merged) | Linear (guaranteed) | ❌ Not available |
When the storage adapter implements ConditionalWriter, Lode detects
concurrent commits and returns ErrSnapshotConflict.
Mechanism:
- At commit time, the writer captures the expected parent snapshot ID.
- When writing the latest pointer, if the store implements ConditionalWriter, the commit uses CompareAndSwap instead of Delete+Put.
- If the pointer content differs from the expected parent (another writer committed since Latest() was read), the commit returns ErrSnapshotConflict.
- If the store does not implement ConditionalWriter, the current Delete+Put behavior applies and single-writer semantics remain the caller's responsibility.
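The conflict-detection mechanism above can be sketched with an in-memory pointer. This is an assumption-laden illustration: pointerStore and the signature of its CompareAndSwap method are hypothetical and do not reflect the real ConditionalWriter interface, which is defined by the storage contract.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrSnapshotConflict reuses the sentinel name from this contract.
var ErrSnapshotConflict = errors.New("snapshot conflict")

// pointerStore is a toy holder for the latest-snapshot pointer.
type pointerStore struct {
	mu     sync.Mutex
	latest string // empty string means "no snapshot yet"
}

// Latest returns the current pointer content.
func (p *pointerStore) Latest() string {
	p.mu.Lock()
	defer p.mu.Unlock()
	return p.latest
}

// CompareAndSwap moves the pointer to next only if it still equals the
// expected parent; otherwise another writer committed in between.
func (p *pointerStore) CompareAndSwap(expected, next string) error {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.latest != expected {
		return ErrSnapshotConflict
	}
	p.latest = next
	return nil
}

func main() {
	p := &pointerStore{}
	parent := p.Latest() // two writers resolve the same parent
	fmt.Println(p.CompareAndSwap(parent, "snap-A")) // <nil>
	fmt.Println(p.CompareAndSwap(parent, "snap-B")) // snapshot conflict
	fmt.Println(p.Latest())                         // snap-A
}
```

The losing writer leaves the pointer untouched, so history stays linear even without external coordination.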
Manual retry pattern:
- Receive ErrSnapshotConflict.
- Re-read Latest() to get the current head.
- Merge or rebuild state against the new head.
- Re-commit.
Data files are immutable and already persisted, so the retry cost is one manifest write plus one pointer swap.
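A caller-side loop for the manual retry pattern might look like the sketch below. The head type, its cas method, and commitWithRetry are hypothetical names for illustration; the toy is single-goroutine, and the "merge or rebuild" step is reduced to adopting the new head as parent.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrSnapshotConflict reuses the sentinel name from this contract.
var ErrSnapshotConflict = errors.New("snapshot conflict")

// head is a toy latest pointer; cas fails when the pointer has moved.
type head struct{ latest string }

func (h *head) cas(expected, next string) error {
	if h.latest != expected {
		return ErrSnapshotConflict
	}
	h.latest = next
	return nil
}

// commitWithRetry tries to commit id on top of parent. On conflict it
// re-reads the head, adopts it as the new parent, and re-commits, up to
// maxAttempts times. It returns the parent the commit finally used.
func commitWithRetry(h *head, parent, id string, maxAttempts int) (string, error) {
	for attempt := 0; attempt < maxAttempts; attempt++ {
		err := h.cas(parent, id)
		if err == nil {
			return parent, nil
		}
		if !errors.Is(err, ErrSnapshotConflict) {
			return "", err // unrelated failure: do not retry
		}
		parent = h.latest // re-read the head; data files are reused as-is
	}
	return "", ErrSnapshotConflict
}

func main() {
	h := &head{latest: "snap-1"}
	stale := "snap-0" // resolved before another writer committed snap-1
	parent, err := commitWithRetry(h, stale, "snap-2", 3)
	fmt.Println(parent, err, h.latest) // snap-1 <nil> snap-2
}
```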
Automatic retry (opt-in):
WithRetryCount(n) enables bounded automatic retry within the commit path.
On ErrSnapshotConflict, the writer:
- Sleeps with jittered exponential backoff.
- Calls Latest() to refresh the in-memory CAS cache.
- Re-resolves the parent and re-parents the manifest (same snapshot ID).
- Retries the pointer CAS.
Data files are written once; only the manifest and pointer are retried.
If all retries are exhausted, ErrSnapshotConflict is returned as usual.
Retry options: WithRetryCount(n), WithRetryBaseDelay(d),
WithRetryMaxDelay(d), WithRetryJitter(j). Defaults: 0 retries,
10ms base, 2s max, full jitter.
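To make the defaults concrete, the sketch below computes delays using the commonly described "full jitter" scheme, where each delay is drawn uniformly from zero up to a capped exponential ceiling. Whether the automatic retry path uses exactly this formula is an assumption; backoffCeiling, fullJitter, and the two constants are hypothetical names.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// Defaults taken from the text above.
const (
	defaultBase = 10 * time.Millisecond
	defaultMax  = 2 * time.Second
)

// backoffCeiling returns min(maxDelay, base * 2^attempt) without overflow.
func backoffCeiling(base, maxDelay time.Duration, attempt int) time.Duration {
	d := base
	for i := 0; i < attempt; i++ {
		if d >= maxDelay/2 {
			return maxDelay // doubling again would reach or pass the cap
		}
		d *= 2
	}
	if d > maxDelay {
		return maxDelay
	}
	return d
}

// fullJitter picks a delay in [0, ceiling]. draw(n) must return a value in
// [0, n), for example (*rand.Rand).Int63n.
func fullJitter(ceiling time.Duration, draw func(n int64) int64) time.Duration {
	if ceiling <= 0 {
		return 0
	}
	return time.Duration(draw(int64(ceiling) + 1))
}

func main() {
	rng := rand.New(rand.NewSource(1))
	for attempt := 0; attempt < 4; attempt++ {
		ceiling := backoffCeiling(defaultBase, defaultMax, attempt)
		fmt.Println(attempt, ceiling, fullJitter(ceiling, rng.Int63n))
	}
}
```

With these defaults the ceiling grows 10ms, 20ms, 40ms, and so on, and stops growing once it reaches 2s.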
Activation:
- Always-on when the adapter implements ConditionalWriter.
- Silent fallback to Delete+Put when it does not.
- No configuration required.
Single-writer compatibility:
- Single-writer workflows are unaffected. CAS succeeds on every commit when no concurrent writer is active.
- External coordination remains valid but is no longer required when using a CAS-capable store.
Individual Store.Put calls to distinct paths are safe to issue concurrently.
Data files use unique snapshot-scoped paths and are immutable once written.
However, each Write or StreamWrite call creates a new snapshot, which
updates the latest pointer. The serialization point is the pointer update,
not the data write.
| Store capability | Concurrent snapshot creation | Guidance |
|---|---|---|
| Implements ConditionalWriter | Safe: CAS detects conflicts, returns ErrSnapshotConflict | Caller retries with new parent |
| Does not implement ConditionalWriter | Unsafe: pointer race, undefined visibility order | Serialize writes externally |
Consumers who need to write multiple blob files as part of a single logical
operation should serialize snapshot creation (one Write or StreamWrite
per file, sequentially) or use a CAS-capable store and handle
ErrSnapshotConflict with retry logic.
Parallel staging (transaction API):
- Multiple goroutines stage data files concurrently within a single write transaction.
- A single atomic Commit produces one snapshot with all data files.
- Parallelizes the internal write pipeline without changing the model.
- Single process only; does not address multi-process concurrency.
Streaming writes (StreamWrite, StreamWriteRecords) that produce large payloads
may use the storage adapter's multipart/chunked upload path.
Backends with conditional completion support (e.g., S3 with If-None-Match):
- Both atomic and multipart paths provide atomic no-overwrite guarantees.
- No additional coordination required beyond the adapter.
Backends without conditional completion support:
- The multipart path has a TOCTOU window between preflight check and completion.
- Callers MUST ensure single-writer semantics per object path, OR
- Callers MUST use external coordination (distributed locks, queues, etc.)
See CONTRACT_STORAGE.md for adapter-specific thresholds and guarantees.
Implementations MUST use the following error sentinels where applicable:
- ErrNoSnapshots for empty history
- ErrNotFound for missing snapshot ID
- ErrPathExists when attempting to write to an existing path
- ErrCodecConfigured when StreamWrite is called with a configured codec
- ErrCodecNotStreamable when StreamWriteRecords codec lacks streaming support
- ErrNilIterator when StreamWriteRecords is called with a nil iterator
- ErrPartitioningNotSupported when StreamWriteRecords is used with non-noop partitioning
- ErrSnapshotConflict when another writer committed since parent was resolved (only when store implements ConditionalWriter)
Streaming-specific error taxonomy and handling guidance are defined in
CONTRACT_ERRORS.md.
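Because the errors above are sentinels, callers can branch on them with errors.Is even when an implementation wraps them with extra context. The sketch below redeclares three of the sentinel names locally for illustration; the classify and wrap helpers and the action labels are hypothetical, not part of the API.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins for sentinels named in this contract.
var (
	ErrSnapshotConflict = errors.New("snapshot conflict")
	ErrCodecConfigured  = errors.New("codec configured")
	ErrNilIterator      = errors.New("nil iterator")
)

// wrap adds operation context while keeping the sentinel matchable.
func wrap(op string, err error) error {
	return fmt.Errorf("%s: %w", op, err)
}

// classify maps an error to a coarse caller action.
func classify(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, ErrSnapshotConflict):
		return "retry" // re-read the head and re-commit
	case errors.Is(err, ErrCodecConfigured), errors.Is(err, ErrNilIterator):
		return "fix-call" // caller misuse; retrying will not help
	default:
		return "fail"
	}
}

func main() {
	fmt.Println(classify(wrap("commit snapshot", ErrSnapshotConflict))) // retry
	fmt.Println(classify(ErrNilIterator))                               // fix-call
	fmt.Println(classify(errors.New("disk full")))                      // fail
}
```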
See CONTRACT_COMPLEXITY.md for full definitions and variable glossary.
- In-memory cache (within process): 0 store calls.
- Persistent pointer (cold start): 2 store calls (1 Get + 1 Exists).
- Scan fallback (no pointer, backward compat): O(N) via List. Self-heals on completion.
Parent resolution MUST NOT use List when a valid pointer exists.
| Operation | Store Calls (warm) | Memory |
|---|---|---|
| Write (unpartitioned) | 4 fixed | O(R + encoded) |
| Write (P partitions) | 2P + 3 | O(R + encoded) |
| StreamWrite | 4 fixed | O(1) streaming |
| StreamWriteRecords | 4 fixed | O(1) streaming |
When the store implements ConditionalWriter, each commit adds +1 read
(the CompareAndSwap operation reads the current pointer before conditional write).
This is constant overhead per commit, not per record or per file.
Hot-path writes MUST NOT trigger Store.List.
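As a worked example of the warm-path figures above, the sketch below evaluates the store-call count for a single Write. It assumes the extra CompareAndSwap read counts as one additional store call and models the unpartitioned case as zero partitions; writeStoreCalls is a hypothetical helper, not part of the API.

```go
package main

import "fmt"

// writeStoreCalls returns the warm-path store call count for one Write.
// partitions == 0 models the unpartitioned row of the table (4 fixed);
// otherwise the count is 2P + 3.
func writeStoreCalls(partitions int, conditionalWriter bool) int {
	calls := 4
	if partitions > 0 {
		calls = 2*partitions + 3
	}
	if conditionalWriter {
		calls++ // one extra pointer read per commit, independent of size
	}
	return calls
}

func main() {
	for _, p := range []int{0, 1, 8} {
		fmt.Println(p, writeStoreCalls(p, false), writeStoreCalls(p, true))
	}
	// 0 4 5
	// 1 5 6
	// 8 19 20
}
```

The count depends only on the number of partitions and the store capability, never on the number of records or existing snapshots.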
Prohibited behaviors:
- In-place mutation of existing data files
- Metadata inference or defaulting
- Branching or multi-head snapshot history