DeltaCAT 2.0
Release Notes
Initial implementation of core DeltaCAT 2.0 catalog APIs for Daft, Ray Data, Pandas, PyArrow, NumPy, and Polars.
Among other features, it provides:
- Inline copy-on-write table compaction and table properties to control automated compaction.
- Automatic/manual schema evolution support, and table properties to control table schema evolution behavior.
- Support for writing/reading both schemaless tables and tables with schemas.
- Full cross-catalog, recursive metadata copy and backfill support (e.g., to make it easy to backfill major revisions to the catalog metadata storage specification).
- Front-page "overview"/"quickstart" documentation and more detailed Storage, Table, and Schema README doc pages.
- Multi-table/namespace/etc. transaction support (i.e., transactions that can operate over any number of objects within the bounds of a single catalog).
- Comprehensive, auto-generated (via the new make type-mappings Makefile target) reader/writer support matrix in reader_compatibility_mapping.py across all Arrow data types, supported dataset types (PyArrow, Pandas, Polars, NumPy, Daft, Ray Data), and supported content types with inline schema (Parquet, Avro, ORC, Feather). This allows us to quickly detect and short-circuit any write that would break a declared supported reader before persisting data or doing any computationally expensive work.
- Transaction log queries and time travel.
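
To illustrate the idea behind the reader/writer support matrix above, here is a minimal sketch of a pre-write compatibility check. It is not DeltaCAT code: the `COMPATIBILITY` table, `check_write`, and `UnsupportedReaderError` are hypothetical names, and the matrix entries are placeholders rather than real compatibility results. The actual layout of reader_compatibility_mapping.py may differ.

```python
# Hypothetical sketch only: none of these names are DeltaCAT APIs.
from typing import Dict, FrozenSet, Iterable, Tuple

# (Arrow type name, content type) -> dataset types assumed able to read it.
# These entries are illustrative placeholders, not measured results.
COMPATIBILITY: Dict[Tuple[str, str], FrozenSet[str]] = {
    ("int64", "parquet"): frozenset({"pyarrow", "pandas", "polars"}),
    ("large_list", "parquet"): frozenset({"pyarrow", "polars"}),
}


class UnsupportedReaderError(ValueError):
    """Raised before any data is persisted."""


def check_write(
    column_types: Iterable[str],
    content_type: str,
    required_readers: Iterable[str],
) -> None:
    """Fail fast if any column would break a declared reader."""
    required = frozenset(required_readers)
    for arrow_type in column_types:
        # Unknown (type, content type) pairs are treated as unsupported.
        supported = COMPATIBILITY.get((arrow_type, content_type), frozenset())
        missing = required - supported
        if missing:
            raise UnsupportedReaderError(
                f"{arrow_type} as {content_type} is not readable by: "
                f"{', '.join(sorted(missing))}"
            )
```

Because the check is a dictionary lookup per column, it can run before any file is written or any compute is scheduled, which is the short-circuit behavior the release note describes.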
Full Changelog: 2.0.0b11...2.0.0.post1