Description
When trying to access the Delta tables on our ECS S3 storage via the Delta Sharing server, we get the following error message:

```
Caused by: java.util.concurrent.ExecutionException: java.io.InterruptedIOException: doesBucketExist on daepi: com.amazonaws.SdkClientException: Unable to execute HTTP request: Connect to daepi.s3.amazonaws.com:443 [daepi.s3.amazonaws.com/52.217.110.180, daepi.s3.amazonaws.com/3.5.25.166, daepi.s3.amazonaws.com/52.217.204.193, daepi.s3.amazonaws.com/3.5.17.37, daepi.s3.amazonaws.com/52.217.167.145, daepi.s3.amazonaws.com/52.216.207.67, daepi.s3.amazonaws.com/3.5.9.128, daepi.s3.amazonaws.com/16.15.185.168] failed: connect timed out
```
We pass the following content in `sharing-config.yaml`:
```yaml
# The format version of this config file
version: 1
# Config shares/schemas/tables to share
shares:
- name: "daepi"
  schemas:
  - name: "default"
    tables:
    - name: "demo_delta_table"
      location: "s3a://daepi/demo_delta_table"
      id: "00000-00000-0000-000000000000"
storage:
  type: s3a
  properties:
    fs.s3a.access.key: "Access_key"
    fs.s3a.secret.key: "Secret_key"
    fs.s3a.endpoint: "ECS_Endpoint_URL"
    fs.s3a.path.style.access: "true"
    fs.s3a.connection.ssl.enabled: "false"
    fs.s3a.aws.credentials.provider: "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider"
    spark.hadoop.fs.s3a.access.key: "Access_key"
    spark.hadoop.fs.s3a.secret.key: "Secret_key"
    spark.hadoop.fs.s3a.endpoint: "ECS_Endpoint_URL"
    spark.hadoop.fs.s3a.path.style.access: "true"
    spark.hadoop.fs.s3a.connection.ssl.enabled: "false"
    spark.hadoop.fs.s3a.aws.credentials.provider: "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider"
    spark.hadoop.fs.s3a.impl: "org.apache.hadoop.fs.s3a.S3AFileSystem"
    spark.hadoop.fs.s3.impl: "org.apache.hadoop.fs.s3a.S3AFileSystem"
    spark.delta.sharing.network.debugLogging: "true"

host: "localhost"
port: 9999
endpoint: "/delta-sharing"
preSignedUrlTimeoutSeconds: 3600
deltaTableCacheSize: 10
stalenessAcceptable: false
evaluatePredicateHints: false
```
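To rule out a YAML formatting problem on our side, the file can be sanity-checked before starting the server. A minimal sketch, assuming PyYAML and the structure above:

```python
import yaml

# Minimal parse check for the config above (assumes PyYAML is installed).
# If the file parses and the endpoint property is present, the problem
# is not YAML formatting on our side.
with open("sharing-config.yaml") as f:
    config = yaml.safe_load(f)

print(config["shares"][0]["name"])                         # expected: daepi
print(config["storage"]["properties"]["fs.s3a.endpoint"])  # expected: ECS_Endpoint_URL
```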
To run the Docker image and start the Delta Sharing server we use the following command:

```bash
docker run -p 9999:9999 -e AWS_ACCESS_KEY_ID=Access_key -e AWS_SECRET_ACCESS_KEY=Secret_key -e AWS_ENDPOINT_URL=ECS_Endpoint_URL -e AWS_ALLOW_HTTP=true --mount type=bind,source=./Coding/delta_sharing/sharing-config.yaml,target=/config/delta-sharing-server-config.yaml deltaio/delta-sharing-server:1.3.3 --config /config/delta-sharing-server-config.yaml
```
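Once the container is up, the server itself responds fine. A quick smoke test against the REST API (a sketch, assuming the `requests` library; listing shares only reads the server config and never touches S3):

```python
import requests

# List shares via the Delta Sharing REST protocol (GET {endpoint}/shares).
# This succeeds because it only reads the server config; the S3 storage
# is not touched until a table is actually queried.
resp = requests.get(
    "http://localhost:9999/delta-sharing/shares",
    headers={"Authorization": "Bearer "},  # empty token, matching the profile below
)
print(resp.status_code, resp.json())
```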
We then test access with the following Python script. Listing the shares works, but loading the table then throws the error above in the Delta Sharing server container:
```python
import delta_sharing

# Load the sharing profile
profile_file = "delta-sharing-profile.json"

# Create a client
client = delta_sharing.SharingClient(profile_file)

# List shares
shares = client.list_shares()
print("Shares:", shares)

# Load the shared table as a pandas DataFrame
table_url = profile_file + "#daepi.default.demo_delta_table"
df = delta_sharing.load_as_pandas(table_url)

# Display the DataFrame
print(df)
```
The `delta-sharing-profile.json` looks like this:

```json
{
  "shareCredentialsVersion": 1,
  "bearerToken": "",
  "endpoint": "http://localhost:9999/delta-sharing/"
}
```
We also verified that we can connect from the Delta Sharing Docker container to the ECS S3, so we do not have a network issue.
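A check along these lines succeeds from inside the container (a sketch, assuming boto3 and the placeholder values from the config), so the endpoint and credentials themselves are fine:

```python
import boto3

# Direct connectivity check against the ECS endpoint (assumes boto3;
# placeholder values as in the config above). This bypasses the
# sharing server entirely.
s3 = boto3.client(
    "s3",
    endpoint_url="ECS_Endpoint_URL",
    aws_access_key_id="Access_key",
    aws_secret_access_key="Secret_key",
)
resp = s3.list_objects_v2(Bucket="daepi", Prefix="demo_delta_table/")
for obj in resp.get("Contents", []):
    print(obj["Key"])
```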
We also ran the same setup against an AWS S3 bucket instead of the ECS S3; in that case everything worked as expected.
From the error and all our trials it is clear that the Delta Sharing server does not read the endpoint URL we provide for the ECS S3. It tries to connect to AWS instead of the given endpoint, which obviously cannot work: the credentials are wrong for AWS, and the bucket and Delta table do not exist there.
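The hostname in the stack trace points the same way: `daepi.s3.amazonaws.com` is the AWS virtual-hosted-style address, so both `fs.s3a.endpoint` and `fs.s3a.path.style.access` appear to be ignored. A small illustration of the two addressing styles (the URLs are examples, not real endpoints):

```python
bucket = "daepi"

# Virtual-hosted-style addressing against AWS: this matches the
# hostname in the stack trace, i.e. what the server actually does.
print(f"https://{bucket}.s3.amazonaws.com/demo_delta_table/")

# Path-style addressing against the custom endpoint: what
# fs.s3a.endpoint together with fs.s3a.path.style.access should
# produce (http here because SSL is disabled in the config).
print(f"http://ECS_Endpoint_URL/{bucket}/demo_delta_table/")
```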
Would it be possible to fix this issue so that we can also connect to our ECS S3 via the Delta Sharing server?