Closed
Labels: binding/rust, bug, storage/aws
Bug Report
Environment
Delta-rs version: 0.20.1
Environment: Docker
Description
Issue: Writing data to AWS S3 with the DynamoDB locking provider fails in version 0.20.1 but works in version 0.19.2.
Error Messages
- First Execution Failure (table does not exist yet):
Traceback (most recent call last):
  File "/app/test.py", line 21, in <module>
    df.write_delta(
  File "/usr/local/lib/python3.11/site-packages/polars/dataframe/frame.py", line 4286, in write_delta
    write_deltalake(
  File "/usr/local/lib/python3.11/site-packages/deltalake/writer.py", line 323, in write_deltalake
    write_deltalake_rust(
_internal.CommitFailedError: Transaction failed: dynamodb client failed to write log entry

- Subsequent Execution Failure (after one successful write, table already exists):
Traceback (most recent call last):
  File "/app/test.py", line 22, in <module>
    df.write_delta(
  File "/usr/local/lib/python3.11/site-packages/polars/dataframe/frame.py", line 4286, in write_delta
    write_deltalake(
  File "/usr/local/lib/python3.11/site-packages/deltalake/writer.py", line 302, in write_deltalake
    table.update_incremental()
  File "/usr/local/lib/python3.11/site-packages/deltalake/table.py", line 1258, in update_incremental
    self._table.update_incremental()
_internal.DeltaError: Generic error: error in DynamoDb
How to Reproduce
Dockerfile:
FROM python:3.11
WORKDIR /app
RUN pip install deltalake==0.20.1 polars
# Uncomment to see it working
# RUN pip install deltalake==0.19.2
COPY test.py .
CMD [ "python", "test.py" ]

test.py:
import polars
import os
df = polars.DataFrame({'x': [1, 2, 3]})
storage_options = {
    'AWS_S3_LOCKING_PROVIDER': 'dynamodb',
    'DELTA_DYNAMO_TABLE_NAME': 'delta_log',
    'AWS_ACCESS_KEY_ID': os.environ["AWS_ACCESS_KEY_ID"],
    'AWS_SECRET_ACCESS_KEY': os.environ["AWS_SECRET_ACCESS_KEY"],
    'AWS_REGION': os.environ['AWS_REGION'],
}
df.write_delta(
    f"s3://{os.environ['BUCKET_NAME']}/delta/test",
    storage_options=storage_options,
)
# You will need a bucket and a DynamoDB table.
# How to create DynamoDB table?
# aws dynamodb create-table \
# --table-name delta_log \
# --attribute-definitions AttributeName=tablePath,AttributeType=S AttributeName=fileName,AttributeType=S \
# --key-schema AttributeName=tablePath,KeyType=HASH AttributeName=fileName,KeyType=RANGE \
# --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5

Run the following commands:
docker build -t test:latest .
docker run \
  -e AWS_ACCESS_KEY_ID=your_access_key \
  -e AWS_SECRET_ACCESS_KEY=your_secret_key \
  -e BUCKET_NAME=your_bucket_name \
  -e AWS_REGION=your_region \
  test:latest

If you uncomment the commented-out RUN pip install deltalake==0.19.2 line in the Dockerfile and then run docker build and docker run again, the write succeeds with version 0.19.2.
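
Because the failure depends on the installed deltalake version, it may help to confirm which version a given image actually imports before running test.py. The following is a minimal sketch and not part of the original reproduction; the helper name installed_version is hypothetical, and it relies only on importlib.metadata from the standard library.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # The failing image should report deltalake 0.20.1, the working one 0.19.2.
    for name in ("deltalake", "polars"):
        print(f"{name}: {installed_version(name)}")
```

Running this inside each container (for example by temporarily swapping it in for test.py) should show whether the second pip install really replaced 0.20.1.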
Reference: https://delta-io.github.io/delta-rs/integrations/object-storage/s3/
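
The DynamoDB table from the aws dynamodb create-table command in the test.py comments can also be created from Python. This is a sketch under assumptions: it presumes boto3 is installed and AWS credentials are available in the environment, and the helper names lock_table_spec and create_lock_table are hypothetical. The key schema simply mirrors the CLI command above.

```python
def lock_table_spec(table_name: str = "delta_log") -> dict:
    """Build create_table arguments mirroring the CLI command in the report."""
    return {
        "TableName": table_name,
        "AttributeDefinitions": [
            {"AttributeName": "tablePath", "AttributeType": "S"},
            {"AttributeName": "fileName", "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": "tablePath", "KeyType": "HASH"},
            {"AttributeName": "fileName", "KeyType": "RANGE"},
        ],
        "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
    }


def create_lock_table(region: str, table_name: str = "delta_log") -> None:
    """Create the lock table; assumes boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the spec can be inspected without boto3

    client = boto3.client("dynamodb", region_name=region)
    client.create_table(**lock_table_spec(table_name))
```

Keeping the specification in a separate function makes it possible to compare it against an existing table's schema without touching AWS.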