It should be possible to download a blob with content-encoding while keeping the content-encoding #34350

Open
@kosta

Description

Is your feature request related to a problem? Please describe.

I put a blob into an Azure Storage container with Content-Encoding gzip. In Python, I call something like:

from azure.storage.blob import BlobClient, ContentSettings

# plain text is 'hello from gzip'
gzipped_data = b'\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\xff\xcaH\xcd\xc9\xc9WH+\xca\xcfUH\xaf\xca,\x00\x00\x00\x00\xff\xff\x03\x00d\xaa\x8e\xb5\x0f\x00\x00\x00'
azure_blob = BlobClient(...)
content_settings = ContentSettings(content_encoding='gzip', ...)
azure_blob.upload_blob(data=gzipped_data, length=len(gzipped_data), content_settings=content_settings)

Now I have a blob with the given Content-Encoding. Downloading it with e.g. curl yields the raw, compressed data. (A browser will most likely decompress it for you.)

When I download it again:

azure_blob = BlobClient(...)
blob_data = azure_blob.download_blob()
downloaded = blob_data.read()

Now, downloaded is b'hello from gzip', but I expected gzipped_data from the first code block.

Describe the solution you'd like

BlobClient.download_blob() should have the option to keep the content-encoding, e.g. by passing decompress=False. This seems to be in line with existing code in the Azure SDK.

Describe alternatives you've considered

For my use case, e.g. 1:1 copying from A to B, I do not see an alternative.
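A minimal sketch of why recompressing after download is not a substitute for a 1:1 copy. It uses only Python's standard gzip module and makes no Azure calls; the compression levels and mtime values are arbitrary choices for illustration. Two valid gzip encodings of the same plaintext can differ byte for byte, so the original blob bytes generally cannot be reconstructed from the decompressed content alone.

```python
import gzip

plain = b"hello from gzip"

# Two valid gzip encodings of the same plaintext, produced with
# different settings (compression level and header timestamp).
original = gzip.compress(plain, compresslevel=9, mtime=0)
recompressed = gzip.compress(plain, compresslevel=1, mtime=1)

# Both decompress to the same plaintext ...
same_plaintext = gzip.decompress(original) == gzip.decompress(recompressed)
# ... but the compressed byte streams are not identical.
same_bytes = original == recompressed

print(same_plaintext, same_bytes)
```

Any checksum or signature computed over the stored (compressed) bytes would therefore not survive a download, decompress, recompress, upload round trip.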

Additional context

The code is almost there. The StreamDownloadGenerator, which is used under the hood, already has code for this. It just needs to be passed decompress=False.

However, in BlobOperations.download() the decompress kwarg is not passed through. I tested locally by adding _decompress = kwargs.pop("decompress", True) to the top of that function and then passing decompress=_decompress to both calls to response.stream_download(). It seems to work.
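The kwarg-forwarding pattern described above can be sketched in isolation. This is not the SDK code: FakeResponse and the download() function below are hypothetical stand-ins that only mimic the shape of a response.stream_download(pipeline, decompress=...) call, so the example runs with the standard library alone.

```python
import gzip


class FakeResponse:
    """Hypothetical stand-in for a pipeline response (not the azure-core class)."""

    def __init__(self, body):
        self._body = body

    def stream_download(self, pipeline, decompress=True):
        # Mimics the behavior described in the issue: decode the gzip
        # body unless the caller explicitly opts out.
        data = gzip.decompress(self._body) if decompress else self._body
        yield data


def download(response, pipeline=None, **kwargs):
    """Hypothetical stand-in for the generated download operation."""
    # Pop the option so it is not forwarded elsewhere; default keeps
    # the current behavior (decompress on download).
    _decompress = kwargs.pop("decompress", True)
    return response.stream_download(pipeline, decompress=_decompress)


gzipped = gzip.compress(b"hello from gzip")

# Opting out returns the stored bytes unchanged.
raw = b"".join(download(FakeResponse(gzipped), decompress=False))
# The default still returns the decoded content.
decoded = b"".join(download(FakeResponse(gzipped)))
```

Defaulting to True keeps existing callers unaffected; only callers that pass decompress=False would see the raw bytes.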

BlobOperations is generated code. Please advise how to make the modification above through code generation. I tried looking at https://github.com/Azure/azure-rest-api-specs/ but did not know where to start.

Labels

Client, Service Attention, Storage, customer-reported, feature-request, needs-team-attention
