
[Storage] Change content iterating in _download_async.py, add decompression live tests against compressed dataset #39755


Draft
wants to merge 1 commit into base: main

Conversation

vincenttran-msft
Member

  • Changed from response.load_body() and response.body() in _download_async.py to accessing the content via the response iterator, as those methods do not respect the decompress param passed to the download_blob API (see the sketch after this list).
    • This was resolved by using the iterator directly.
    • An if-else block was added to handle the general hot path and the validate_content=True path (where the body would already have been loaded for the MD5 check).
  • Test cases were added for both sync and async covering the compressed data + decompress=True/False scenario.
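
A minimal sketch of the intended shape of the helper after this change, simplified from the diff below; the decryption/offset handling of the real function is omitted, and the exact branching is an assumption based on the description above:

```python
# Illustrative sketch only -- not the actual _download_async.py implementation.
from typing import Any, Dict, cast

async def process_content(data: Any, start_offset: int, end_offset: int, encryption: Dict[str, Any]) -> bytes:
    if data is None:
        raise ValueError("Response cannot be None.")

    if data.response.is_stream_consumed:
        # validate_content=True path: the body was already loaded for the MD5 check,
        # so reuse the buffered body rather than re-iterating an exhausted stream.
        content = cast(bytes, data.response.body())
    else:
        # General hot path: iterate the response directly so the transport's
        # decompression setting (the decompress param of download_blob) is honored,
        # which load_body()/body() did not do.
        content = b"".join([chunk async for chunk in data])
    return content
```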

@github-actions bot added the "Storage" label (Storage Service: Queues, Blobs, Files) on Feb 15, 2025
@azure-sdk
Collaborator

API change check

API changes are not detected in this pull request.

@vincenttran-msft
Member Author

/azp run python - storage - tests


Azure Pipelines successfully started running 1 pipeline(s).

```diff
@@ -46,8 +46,11 @@
 async def process_content(data: Any, start_offset: int, end_offset: int, encryption: Dict[str, Any]) -> bytes:
     if data is None:
         raise ValueError("Response cannot be None.")
-    await data.response.load_body()
-    content = cast(bytes, data.response.body())
+    if not data.response.is_stream_consumed:
```
vincenttran-msft (Member Author) commented Apr 10, 2025

Not sure if is_stream_consumed exists on old transports, so we may need to do an attribute check instead (but start off by seeing if there is any other way to handle this without an attribute check).
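
One possible shape for that attribute check, purely as a fragment inside process_content (the getattr default and the fallback behavior are assumptions, not the PR's final decision):

```python
# Sketch: fall back gracefully if an older transport does not expose is_stream_consumed;
# defaulting to False (i.e. iterate the stream) is an assumption made here for illustration.
stream_consumed = getattr(data.response, "is_stream_consumed", False)
if stream_consumed:
    content = cast(bytes, data.response.body())  # body already buffered (e.g. for MD5 validation)
else:
    content = b"".join([chunk async for chunk in data])  # honors transport decompression
```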

```diff
-    await data.response.load_body()
-    content = cast(bytes, data.response.body())
+    if not data.response.is_stream_consumed:
+        content = b"".join([d async for d in data])
```
vincenttran-msft (Member Author)

This is a good start. Next, test against the 3 transports; once those work, test validate_content.
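
For context, a rough sketch of the compressed-data test described in the PR summary (the payload, blob setup, and fixture names are assumptions about the test shape, not the actual test code added here):

```python
# Rough async sketch -- assumes blob_client is an azure.storage.blob.aio BlobClient
# pointing at a test container, and that the decompress param is forwarded as described.
import gzip

from azure.storage.blob import ContentSettings

async def test_download_compressed_blob(blob_client) -> None:
    original = b"hello compressed world" * 256
    compressed = gzip.compress(original)

    # Upload gzip-compressed bytes and advertise the encoding on the blob.
    await blob_client.upload_blob(
        compressed,
        overwrite=True,
        content_settings=ContentSettings(content_encoding="gzip"),
    )

    # decompress=True: the transport should gunzip, returning the original bytes.
    stream = await blob_client.download_blob(decompress=True)
    assert await stream.readall() == original

    # decompress=False: the raw gzip payload should come back untouched.
    stream = await blob_client.download_blob(decompress=False)
    assert await stream.readall() == compressed
```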

@vincenttran-msft vincenttran-msft marked this pull request as draft April 10, 2025 21:36