
Unable to List Blobs in Azure via S3Proxy #717

Open
@samuel-davis


Problem:
With S3Proxy configured to talk to Azure Blob Storage (ABS), I cannot list blobs when the request carries a start position: the value ends up in the marker parameter of the ABS ListBlobs call, and ABS rejects it. The requests come from deltalake, a Python library used in my codebase, which calls S3Proxy.

The incoming request to S3Proxy from my code is:

http://df-s3-proxy/df-bucket?list-type=2&prefix=av2%2Fsilver8%2F_delta_log%2F&start-after=av2%2Fsilver8%2F_delta_log%2F00000000000000000000

The transformed request that is sent through jclouds and then on to ABS is:

https://vulcanforgetest.blob.core.windows.net/df-bucket?restype=container&comp=list&prefix=av2/silver8/_delta_log/&marker=av2/silver8/_delta_log/00000000000000000000&maxresults=1000&include=metadata
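
To make the translation between the two requests easier to see, here is a small sketch that decodes the query strings of both URLs quoted above and compares them. It only uses the Python standard library; the helper name query_params is made up for this illustration and is not part of S3Proxy, jclouds, or deltalake.

```python
from urllib.parse import urlsplit, parse_qs

# The two URLs exactly as quoted in this issue.
s3_url = (
    "http://df-s3-proxy/df-bucket?list-type=2"
    "&prefix=av2%2Fsilver8%2F_delta_log%2F"
    "&start-after=av2%2Fsilver8%2F_delta_log%2F00000000000000000000"
)
azure_url = (
    "https://vulcanforgetest.blob.core.windows.net/df-bucket"
    "?restype=container&comp=list&prefix=av2/silver8/_delta_log/"
    "&marker=av2/silver8/_delta_log/00000000000000000000"
    "&maxresults=1000&include=metadata"
)


def query_params(url):
    """Return the percent-decoded query parameters of a URL as a flat dict."""
    return {key: values[0] for key, values in parse_qs(urlsplit(url).query).items()}


s3 = query_params(s3_url)
azure = query_params(azure_url)

print("S3 start-after:", s3["start-after"])
print("Azure marker:  ", azure["marker"])
print("Same value forwarded:", s3["start-after"] == azure["marker"])
```

The comparison shows that the S3 start-after value, which is a plain object key, arrives at ABS unchanged as the marker value.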

Finally, the exception caused by the marker parameter is:

org.jclouds.azure.storage.AzureStorageResponseException: command [method=org.jclouds.azureblob.AzureBlobClient.public abstract org.jclouds.azureblob.domain.ListBlobsResponse org.jclouds.azureblob.AzureBlobClient.listBlobs(java.lang.String,org.jclouds.azureblob.options.ListBlobsOptions[])[df-bucket, [Lorg.jclouds.azureblob.options.ListBlobsOptions;@41358ef0], request=GET https://vulcanforgetest.blob.core.windows.net/df-bucket?restype=container&comp=list&prefix=av2/silver8/_delta_log/&marker=av2/silver8/_delta_log/00000000000000000000&maxresults=1000&include=metadata HTTP/1.1] failed with code 400, error: AzureError{requestId='9606be37-201e-0072-4006-3223a9000000', code='InvalidQueryParameterValue', message='Value for one of the query parameters specified in the request URI is invalid.
RequestId:9606be37-201e-0072-4006-3223a9000000
Time:2024-11-08T17:46:11.9123328Z', context='{QueryParameterValue=av2/silver8/_delta_log/00000000000000000000, QueryParameterName=marker, Reason=Invalid ListBlobs marker.}'}

You can see above that it basically boils down to the context at the end of the exception: QueryParameterName=marker, Reason=Invalid ListBlobs marker.
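
One possible explanation, offered as an assumption rather than a confirmed diagnosis: the Azure List Blobs REST documentation describes marker as an opaque continuation value taken from the NextMarker of a previous response, whereas the S3 start-after parameter is a plain key name. If that is what is happening here, start-after cannot simply be forwarded as marker. The sketch below shows one way the start-after semantics could be emulated instead: list by prefix without a marker, then keep only the names that sort strictly after the start-after key. The function name and the sample blob names are hypothetical and do not correspond to actual S3Proxy or jclouds code.

```python
def filter_start_after(blob_names, start_after):
    """Emulate S3 start-after: keep names strictly greater than start_after.

    Names are compared as UTF-8 bytes and returned in ascending order.
    This is an illustrative helper, not S3Proxy's implementation.
    """
    threshold = start_after.encode("utf-8")
    ordered = sorted(blob_names, key=lambda name: name.encode("utf-8"))
    return [name for name in ordered if name.encode("utf-8") > threshold]


# Hypothetical listing result for the prefix used in this issue.
listed = [
    "av2/silver8/_delta_log/00000000000000000001.json",
    "av2/silver8/_delta_log/_last_checkpoint",
    "av2/silver8/_delta_log/00000000000000000000.json",
]
start_after = "av2/silver8/_delta_log/00000000000000000000"

for name in filter_start_after(listed, start_after):
    print(name)
```

Filtering after the fact costs extra listing work for large prefixes, so this is only meant to illustrate the semantics, not to suggest where a fix belongs.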

I've looked through both the S3Proxy and the jclouds code, and because the error is so abstract and doesn't tell me why this marker parameter is invalid, I'm reaching out for some help.

I'm more than happy to do a PR if you can point to where this can be resolved.

I should also say that I tried the azureblob-sdk provider. It DOES write data and gets past this error, but the data it writes cannot be read back correctly afterwards: deltalake reports that the file sizes are smaller than they should be. That suggests to me that the write operation isn't working correctly even with azureblob-sdk.
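
For anyone trying to narrow down the size problem, here is a rough sketch of a check that compares the sizes recorded in a Delta commit file with the sizes actually stored. It assumes the Delta transaction log layout in which each commit file holds one JSON action per line and add actions carry path and size fields. The function names, the sample commit, and the stored sizes are all made up for illustration; in practice the stored sizes would come from a listing of the container.

```python
import json


def expected_sizes(commit_text):
    """Collect {path: size} from the add actions of one Delta commit file."""
    sizes = {}
    for line in commit_text.splitlines():
        line = line.strip()
        if not line:
            continue
        action = json.loads(line)
        add = action.get("add")
        if add is not None:
            sizes[add["path"]] = add["size"]
    return sizes


def size_mismatches(expected, actual):
    """Return {path: (expected_size, actual_size)} for files whose sizes differ.

    actual_size is None when the file is missing from the stored listing.
    """
    return {
        path: (size, actual.get(path))
        for path, size in expected.items()
        if actual.get(path) != size
    }


# Made-up commit content and stored sizes, for illustration only.
sample_commit = "\n".join([
    json.dumps({"commitInfo": {"operation": "WRITE"}}),
    json.dumps({"add": {"path": "part-00000.parquet", "size": 2048, "dataChange": True}}),
    json.dumps({"add": {"path": "part-00001.parquet", "size": 4096, "dataChange": True}}),
])
stored_sizes = {"part-00000.parquet": 2048, "part-00001.parquet": 1024}

print(size_mismatches(expected_sizes(sample_commit), stored_sizes))
```

A non-empty result would point at specific files whose stored length is shorter than what the writer recorded, which could help when reporting the azureblob-sdk behaviour separately.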

Environment where failure is seen:
Azure Blob Storage account configured with azureSharedKey in S3Proxy
provider: azureblob
