Delta-kernel-rs cannot be used with Python mocking frameworks for testing #1410

@vsmanish1772

Describe the bug

Delta Lake (delta-kernel-rs) operations cannot be tested with Python mocking frameworks. When running the code below, the call to transaction_version() hangs and eventually fails with the following error:
_internal.DeltaError: Kernel error: Error interacting with object store: Generic S3 error: Error performing GET http://127.0.0.1:5555/test/test_delta_table/_delta_log/00000000000000000000.json in 180.961982391s, after 5 retries, max_retries: 10, retry_timeout: 180s - HTTP error: error sending request
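
Because the error is a failed HTTP request to a local endpoint, it may help to first rule out that the mock server is simply not listening on the host and port that appear in the error message. The following is a minimal sketch using only the standard library; `endpoint_is_reachable` is a hypothetical helper introduced here for illustration and is not part of deltalake or moto.

```python
import socket


def endpoint_is_reachable(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical diagnostic helper: it only checks that something is
    listening on the port, not that it speaks the S3 protocol.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Host and ports taken from the error message and the reproduction code.
    for port in (5555, 5556):
        print(port, endpoint_is_reachable("127.0.0.1", port))
```

If the port from the error message is not reachable while the port configured in the test is, the request may be going to a different endpoint than the one the test started.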

To Reproduce

Run the code below with:

python -m pytest test.py::test_with_threaded_moto_server -v -s

import os

import polars as pl
from botocore.session import Session
from deltalake import DeltaTable, write_deltalake
from moto.server import ThreadedMotoServer


def test_with_threaded_moto_server():
    """
    Attempt to use delta-rs with moto's ThreadedMotoServer.

    Expected: Delta table operations should work with the mock S3 server.
    Actual: Operations timeout or fail because delta-rs tries to connect to real AWS.
    """

    HOST = "127.0.0.1"
    PORT = 5556
    endpoint_uri = f"http://{HOST}:{PORT}/"

    # Start the Moto server
    server = ThreadedMotoServer(ip_address=HOST, port=PORT)
    server.start()

    # Set AWS credentials
    os.environ["AWS_SECRET_ACCESS_KEY"] = "testing"
    os.environ["AWS_ACCESS_KEY_ID"] = "testing"
    os.environ["AWS_SECURITY_TOKEN"] = "testing"
    os.environ["AWS_SESSION_TOKEN"] = "testing"
    os.environ["AWS_REGION"] = "us-east-1"

    # Create S3 client and bucket using botocore (this works fine)
    client = Session().create_client(service_name="s3", endpoint_url=endpoint_uri)
    bucket_name = "test-delta-bucket"
    client.create_bucket(Bucket=bucket_name)

    # Try to create a Delta table
    test_data = pl.DataFrame(
        {
            "id": [1, 2, 3],
            "name": ["Alice", "Bob", "Charlie"],
        }
    )

    table_path = f"s3://{bucket_name}/test_delta_table"

    storage_options = {
        "allow_http": "true",
        "endpoint_url": endpoint_uri,
    }

    write_deltalake(
        table_or_uri=table_path,
        data=test_data.to_arrow(),
        mode="append",
        storage_options=storage_options,
    )

    delta_table = DeltaTable(table_path, storage_options=storage_options)
    # The call below fails
    delta_table.transaction_version('1')
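
The reproduction above never stops the mock server, so after the call fails the server thread keeps running for the rest of the test session. One way to avoid that, sketched under the assumption that the server object exposes `start()` and `stop()` (as `ThreadedMotoServer` does), is a small context manager; `running` is a hypothetical name used only for this illustration.

```python
from contextlib import contextmanager


@contextmanager
def running(server):
    """Start `server`, yield it, and always stop it afterwards.

    Hypothetical helper: works with any object that has start() and
    stop() methods, so it does not import moto itself.
    """
    server.start()
    try:
        yield server
    finally:
        # Runs even if the body raises, e.g. on the DeltaError above.
        server.stop()
```

In the test, the body after `server = ThreadedMotoServer(ip_address=HOST, port=PORT)` could then be wrapped in `with running(server):` instead of calling `server.start()` directly.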




Expected behavior

The call should not hang. It should return the transaction version if one exists, or None otherwise.

Additional context

No response
