Closed
Labels
enhancement (New feature or request)
Description
Environment
Delta-rs version: 0.10.0
Binding: Python
Environment:
- Cloud provider:
- OS: macOS
- Other:
Bug
delta-rs loses metadata for parquet written with pandas (example data is attached).
from deltalake import DeltaTable
import pyarrow.parquet as pq

if __name__ == "__main__":
    # read it back with delta-rs
    dt = DeltaTable("test.parquet")
    print("\nDeltaTable schema:")
    print(dt.schema().to_pyarrow().to_string())

    # read it back with pyarrow
    table = pq.read_table("test.parquet")
    print("\nPyarrow schema:")
    print(table.schema.to_string())

This outputs:
DeltaTable schema:
col2: string
col1: int32
Pyarrow schema:
col2: dictionary<values=string, indices=int32, ordered=0>
col1: dictionary<values=int32, indices=int32, ordered=0>
-- schema metadata --
pandas: '{"index_columns": [{"kind": "range", "name": null, "start": 0, "' + 509
The schema metadata that pyarrow reports for the table does not appear anywhere in the DeltaTable schema. Is it present but not public? If so, how can it be accessed?