v0.7.0
Codec-agnostic per-file statistics and nil metadata coalescing
Summary
v0.7.0 adds per-file column statistics to manifests, enabling pruning workflows without opening data files. It also relaxes nil metadata handling across all write paths: nil metadata is now coalesced to an empty `Metadata{}` instead of returning an error.
Highlights
- Per-file column statistics: New `StatisticalCodec` and `StatisticalStreamEncoder` interfaces allow any codec to report per-file column stats (min, max, null count, distinct count), persisted on `FileRef`
- Parquet statistics: The Parquet codec implements `StatisticalCodec`, reporting column-level min/max/null count for all orderable types (int32, int64, float32, float64, string, timestamp)
- New public types: `FileStats` and `ColumnStats` on the public API surface
- Nil metadata coalescing: `Write`, `StreamWrite`, `StreamWriteRecords`, and `Volume.Commit` now coalesce nil metadata to `Metadata{}` instead of returning an error
- Contract updates: `CONTRACT_CORE`, `CONTRACT_WRITE_API`, `CONTRACT_VOLUME`, and `CONTRACT_PARQUET` updated to reflect the new semantics
- 14 new stats tests and 4 updated coalescing tests with full traceability matrix coverage
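To illustrate the kind of pruning workflow that per-file min/max stats make possible, here is a minimal, self-contained sketch. The types below are hypothetical stand-ins declared locally; their field names and shapes are assumptions for illustration and may not match the library's actual `FileRef`, `FileStats`, and `ColumnStats` definitions.

```go
package main

import "fmt"

// ColumnStats is a hypothetical stand-in for per-column statistics.
// Only int64 min/max are modeled here to keep the sketch small.
type ColumnStats struct {
	Min, Max  int64
	NullCount int64
}

// FileStats is a hypothetical stand-in mapping column name to its stats.
type FileStats struct {
	Columns map[string]ColumnStats
}

// FileRef is a hypothetical stand-in for a manifest entry.
// Stats is nil when the codec that wrote the file reported no statistics.
type FileRef struct {
	Path  string
	Stats *FileStats
}

// mayContain reports whether a file could hold rows whose col value lies
// in [lo, hi]. Files without stats are never pruned, since nothing is
// known about their contents.
func mayContain(f FileRef, col string, lo, hi int64) bool {
	if f.Stats == nil {
		return true
	}
	cs, ok := f.Stats.Columns[col]
	if !ok {
		return true
	}
	// The file's [Min, Max] range must overlap the query range.
	return cs.Max >= lo && cs.Min <= hi
}

// prune returns the paths of files that still need to be opened.
func prune(files []FileRef, col string, lo, hi int64) []string {
	var keep []string
	for _, f := range files {
		if mayContain(f, col, lo, hi) {
			keep = append(keep, f.Path)
		}
	}
	return keep
}

func main() {
	files := []FileRef{
		{Path: "a.parquet", Stats: &FileStats{Columns: map[string]ColumnStats{"ts": {Min: 0, Max: 100}}}},
		{Path: "b.parquet", Stats: &FileStats{Columns: map[string]ColumnStats{"ts": {Min: 120, Max: 200}}}},
		{Path: "c.jsonl"}, // written by a codec without stats
	}
	fmt.Println(prune(files, "ts", 150, 250)) // prints [b.parquet c.jsonl]
}
```

The conservative default (keep any file whose stats are missing) matches the opt-in nature of the feature: a manifest with no stats still yields correct results, just without any pruning benefit.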
Upgrade Notes
- Callers that previously passed `Metadata{}` solely to avoid nil errors can now pass `nil` safely
- Callers that relied on nil metadata returning an error should remove that expectation
- Per-file stats are opt-in: only codecs implementing `StatisticalCodec` produce them; manifests without stats remain valid
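The two behaviors described in the notes above can be sketched in a few lines of Go. This is an illustration under stated assumptions, not the library's code: `Metadata` is assumed here to be a map type, and `codec`, `statsReporter`, and the `NullCounts` method are hypothetical names invented for the example.

```go
package main

import "fmt"

// Metadata is assumed to be a map type for this sketch; the real
// definition may differ.
type Metadata map[string]string

// coalesce returns an empty, non-nil Metadata when m is nil, mirroring
// the "coalesce instead of error" behavior described in the notes.
func coalesce(m Metadata) Metadata {
	if m == nil {
		return Metadata{}
	}
	return m
}

// codec is a hypothetical base interface.
type codec interface{ Name() string }

// statsReporter is a hypothetical optional interface. A codec opts in
// to statistics simply by implementing it.
type statsReporter interface {
	codec
	NullCounts() map[string]int64
}

type plainCodec struct{}

func (plainCodec) Name() string { return "plain" }

type statsCodec struct{}

func (statsCodec) Name() string { return "stats" }
func (statsCodec) NullCounts() map[string]int64 {
	return map[string]int64{"id": 0}
}

// collectStats uses a type assertion to detect the optional capability.
// Codecs that do not implement it yield no stats, which is still valid.
func collectStats(c codec) (map[string]int64, bool) {
	sr, ok := c.(statsReporter)
	if !ok {
		return nil, false
	}
	return sr.NullCounts(), true
}

func main() {
	fmt.Println(len(coalesce(nil))) // prints 0
	_, ok := collectStats(plainCodec{})
	fmt.Println(ok) // prints false
	_, ok = collectStats(statsCodec{})
	fmt.Println(ok) // prints true
}
```

Detecting the capability with a type assertion keeps the base interface unchanged, which is one common way for existing codecs to keep working without modification.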
References
Full Changelog: v0.6.0...v0.7.0
What's Changed
- docs(agents): 📝 enhance AGENTS.md with Go style and composition guardrails by @justapithecus in #102
- feat(manifest): ✨ add per-file column statistics for codec-agnostic pruning by @justapithecus in #103
- chore(api): 🩹 post-stats housekeeping and nil metadata coalescing by @justapithecus in #104
- docs: 📝 backfill CHANGELOG for v0.6.0 and v0.7.0 by @justapithecus in #105