Summary
Delta Sharing is an open protocol for secure data sharing. This issue tracks the progress of implementing Databend as a Delta Sharing consumer. Once this feature is implemented, users will be able to run queries such as:
SELECT * from delta.example.ontime;
Implementing Databend as a Delta Sharing provider is far more complex and should be tracked in a separate issue.
NOTE: Implementing this issue will be easier after #7816.
Tasks
- Maybe need a Delta Sharing connector in Rust
- Add delta catalog
- Implement list databases
- Implement list tables
- Implement Table API
- Integration tests
- We can use delta-sharing-server for testing.
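The two listing tasks above could map onto the protocol's List Schemas and List Tables endpoints. The following is a minimal sketch of how the request URLs might be built, not Databend's actual design. The helper names are hypothetical, `prefix` stands for the sharing server's endpoint URL, and which share backs the catalog is an assumption this issue does not settle.

```python
from urllib.parse import quote


def _join(prefix: str, *segments: str) -> str:
    """Append path segments to the endpoint prefix, percent-encoding each one."""
    return prefix.rstrip("/") + "/" + "/".join(quote(s, safe="") for s in segments)


def list_databases_url(prefix: str, share: str) -> str:
    # "list databases" in Databend corresponds to "List Schemas in a Share".
    return _join(prefix, "shares", share, "schemas")


def list_tables_url(prefix: str, share: str, schema: str) -> str:
    # "list tables" in Databend corresponds to "List Tables in a Schema".
    return _join(prefix, "shares", share, "schemas", schema, "tables")
```

Both requests are plain GETs in the protocol, so a connector would only need an HTTP client and the bearer token from the sharing profile on top of these helpers.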
References
Protocol: https://github.com/delta-io/delta-sharing/blob/main/PROTOCOL.md
Highlighted APIs:
- List Shares: Get available shares
- List Schemas in a Share: List all schemas (databases in databend) in a share
- List Tables in a Schema: List all tables in a schema
- Query Table Version
- Query Table Metadata
- This API will return Table Metadata Format
{ "protocol": { "minReaderVersion": 1 } }
{ "metaData": { "id": "f8d5c169-3d01-4ca3-ad9e-7dc3355aedb2", "format": { "provider": "parquet" }, "schemaString": "{\"type\":\"struct\",\"fields\":[{\"name\":\"eventTime\",\"type\":\"timestamp\",\"nullable\":true,\"metadata\":{}},{\"name\":\"date\",\"type\":\"date\",\"nullable\":true,\"metadata\":{}}]}", "partitionColumns": [ "date" ] } }
- Read Data from a Table
- The most important API in delta sharing.
- Request
{ "predicateHints": [ "date >= '2021-01-01'", "date <= '2021-01-31'" ], "limitHint": 1000, "version": 123 }
- Response
{ "protocol": { "minReaderVersion": 1 } }
{ "metaData": { "id": "f8d5c169-3d01-4ca3-ad9e-7dc3355aedb2", "format": { "provider": "parquet" }, "schemaString": "{\"type\":\"struct\",\"fields\":[{\"name\":\"eventTime\",\"type\":\"timestamp\",\"nullable\":true,\"metadata\":{}},{\"name\":\"date\",\"type\":\"date\",\"nullable\":true,\"metadata\":{}}]}", "partitionColumns": [ "date" ] } }
{ "file": { "url": "https://<s3-bucket-name>.s3.us-west-2.amazonaws.com/delta-exchange-test/table2/date%3D2021-04-28/part-00000-8b0086f2-7b27-4935-ac5a-8ed6215a6640.c000.snappy.parquet?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20210501T010516Z&X-Amz-SignedHeaders=host&X-Amz-Expires=900&X-Amz-Credential=AKIAISZRDL4Q4Q7AIONA%2F20210501%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Signature=97b6762cfd8e4d7e94b9d707eff3faf266974f6e7030095c1d4a66350cfd892e", "id": "8b0086f2-7b27-4935-ac5a-8ed6215a6640", "partitionValues": { "date": "2021-04-28" }, "size": 573, "stats": "{\"numRecords\":1,\"minValues\":{\"eventTime\":\"2021-04-28T23:33:57.955Z\"},\"maxValues\":{\"eventTime\":\"2021-04-28T23:33:57.955Z\"},\"nullCount\":{\"eventTime\":0}}" } }
{ "file": { "url": "https://<s3-bucket-name>.s3.us-west-2.amazonaws.com/delta-exchange-test/table2/date%3D2021-04-28/part-00000-591723a8-6a27-4240-a90e-57426f4736d2.c000.snappy.parquet?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20210501T010516Z&X-Amz-SignedHeaders=host&X-Amz-Expires=899&X-Amz-Credential=AKIAISZRDL4Q4Q7AIONA%2F20210501%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Signature=0f7acecba5df7652457164533a58004936586186c56425d9d53c52db574f6b62", "id": "591723a8-6a27-4240-a90e-57426f4736d2", "partitionValues": { "date": "2021-04-28" }, "size": 573, "stats": "{\"numRecords\":1,\"minValues\":{\"eventTime\":\"2021-04-28T23:33:48.719Z\"},\"maxValues\":{\"eventTime\":\"2021-04-28T23:33:48.719Z\"},\"nullCount\":{\"eventTime\":0}}" } }
- Databend needs to read the actual data from the URL in each file entry.
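To make the Read Data flow concrete, here is a minimal sketch of how a consumer might build the request body and split the response into its protocol, metaData, and file entries. It assumes the server returns one JSON object per line, as the protocol describes. The function names are hypothetical and are not part of Databend or of any existing connector.

```python
import json


def build_query_body(predicate_hints=None, limit_hint=None, version=None) -> str:
    """Serialize the optional query hints; fields left as None are omitted."""
    body = {}
    if predicate_hints:
        body["predicateHints"] = list(predicate_hints)
    if limit_hint is not None:
        body["limitHint"] = limit_hint
    if version is not None:
        body["version"] = version
    return json.dumps(body)


def parse_query_response(text: str):
    """Split a newline-delimited JSON response into (protocol, metadata, files)."""
    protocol = None
    metadata = None
    files = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between objects
        obj = json.loads(line)
        if "protocol" in obj:
            protocol = obj["protocol"]
        elif "metaData" in obj:
            metadata = obj["metaData"]
        elif "file" in obj:
            files.append(obj["file"])
    return protocol, metadata, files


def table_schema(metadata: dict) -> dict:
    # schemaString is itself a JSON document stored as a string, so it needs a second decode.
    return json.loads(metadata["schemaString"])
```

Each entry in `files` carries a pre-signed `url`, plus `partitionValues` and `stats` that a reader could use for pruning before fetching the Parquet file. Because the example URLs above expire after about 900 seconds (`X-Amz-Expires`), a reader would presumably fetch them soon after the query call rather than cache them.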