Open
Description
Describe the bug
Delta Lake (delta-kernel-rs) operations cannot be tested with Python mocking frameworks. When running the code below, execution hangs at transaction_version() and eventually fails with this error:
_internal.DeltaError: Kernel error: Error interacting with object store: Generic S3 error: Error performing GET http://127.0.0.1:5555/test/test_delta_table/_delta_log/00000000000000000000.json in 180.961982391s, after 5 retries, max_retries: 10, retry_timeout: 180s - HTTP error: error sending request
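To make the failing request easier to compare against the test configuration, the URL in the error can be split into endpoint, bucket, and object key. This is only a diagnostic sketch using the standard library: the helper names `split_path_style_url` and `delta_commit_key` are hypothetical, and it assumes path-style addressing (bucket as the first path segment), which is what the URL above appears to use.

```python
from urllib.parse import urlsplit


def split_path_style_url(url: str) -> dict:
    """Split a path-style S3 URL into endpoint, bucket, and object key."""
    parts = urlsplit(url)
    # Path-style layout is /<bucket>/<key...>; strip the leading slash first.
    bucket, _, key = parts.path.lstrip("/").partition("/")
    return {
        "endpoint": f"{parts.scheme}://{parts.netloc}",
        "bucket": bucket,
        "key": key,
    }


def delta_commit_key(table_prefix: str, version: int) -> str:
    """Build the commit file key for a table prefix and version."""
    # Delta commit files are named with a 20-digit, zero-padded version.
    return f"{table_prefix.strip('/')}/_delta_log/{version:020d}.json"


# URL copied from the error message above.
failing_url = (
    "http://127.0.0.1:5555/test/test_delta_table/"
    "_delta_log/00000000000000000000.json"
)
info = split_path_style_url(failing_url)
print(info)
# Check that the requested key is the version-0 commit of the table.
print(info["key"] == delta_commit_key("test_delta_table", 0))
```

For the error above this gives the endpoint `http://127.0.0.1:5555` and the bucket `test`, which is worth checking against the values in the reproduction script (port 5556, bucket `test-delta-bucket`).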
To Reproduce
Run the code below with:
python -m pytest test.py::test_with_threaded_moto_server -v -s
import os

import polars as pl
from botocore.session import Session
from deltalake import DeltaTable, write_deltalake
from moto.server import ThreadedMotoServer


def test_with_threaded_moto_server():
    """
    Attempt to use delta-rs with moto's ThreadedMotoServer.
    Expected: Delta table operations should work with the mock S3 server.
    Actual: Operations timeout or fail because delta-rs tries to connect to real AWS.
    """
    HOST = "127.0.0.1"
    PORT = 5556
    endpoint_uri = f"http://{HOST}:{PORT}/"

    # Start the Moto server
    server = ThreadedMotoServer(ip_address=HOST, port=PORT)
    server.start()

    # Set AWS credentials
    os.environ["AWS_SECRET_ACCESS_KEY"] = "testing"
    os.environ["AWS_ACCESS_KEY_ID"] = "testing"
    os.environ["AWS_SECURITY_TOKEN"] = "testing"
    os.environ["AWS_SESSION_TOKEN"] = "testing"
    os.environ["AWS_REGION"] = "us-east-1"

    # Create S3 client and bucket using boto3 (this works fine)
    client = Session().create_client(service_name="s3", endpoint_url=endpoint_uri)
    bucket_name = "test-delta-bucket"
    client.create_bucket(Bucket=bucket_name)

    # Try to create a Delta table
    test_data = pl.DataFrame(
        {
            "id": [1, 2, 3],
            "name": ["Alice", "Bob", "Charlie"],
        }
    )
    table_path = f"s3://{bucket_name}/test_delta_table"
    storage_options = {
        "allow_http": "true",
        "endpoint_url": endpoint_uri,
    }
    write_deltalake(
        table_or_uri=table_path,
        data=test_data.to_arrow(),
        mode="append",
        storage_options=storage_options,
    )
    delta_table = DeltaTable(table_path, storage_options=storage_options)

    # Below will fail
    delta_table.transaction_version('1')
### Expected behavior
The call should not hang. It should return the transaction version if one exists, or None otherwise.
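Until the hang is resolved, a test can at least fail fast instead of waiting out the 180s retry window. The following is a minimal sketch, not part of deltalake: `call_with_timeout` is a hypothetical helper built on the standard library, and it only stops waiting on the call. It does not cancel the underlying work, so a stuck worker thread keeps running in the background.

```python
import concurrent.futures
import time


def call_with_timeout(func, timeout_s, *args, **kwargs):
    """Run func in a worker thread; raise TimeoutError if it takes too long."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(func, *args, **kwargs)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        raise TimeoutError(f"call did not finish within {timeout_s}s") from None
    finally:
        # Do not wait for a stuck worker; it keeps running in the background.
        executor.shutdown(wait=False)


# A fast call returns its result unchanged.
print(call_with_timeout(lambda: "ok", 1.0))

# A slow call (simulated here with time.sleep) raises TimeoutError.
try:
    call_with_timeout(time.sleep, 0.05, 0.5)
except TimeoutError as exc:
    print(exc)
```

In the reproduction above, the last line could then be written as `call_with_timeout(delta_table.transaction_version, 10, "1")`, with the 10 second limit chosen arbitrarily.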
### Additional context
_No response_