Hi oittaa,
Thanks for your continued effort on this project.
Describe the bug
I tried to split a large file into chunks and upload it to gcp-storage-emulator.
The upload failed with a ValueError. The same script works fine against the real Google Cloud Storage.
import os
from pathlib import Path
from typing import Generator

from google.cloud import storage

CHUNK_SIZE = 1 * 1024 * 1024  # 1 MB


def chunk_file(file_full_path: str) -> Generator[bytes, None, None]:
    file = Path(file_full_path)
    with file.open("rb") as f:
        while True:
            chunk = f.read(CHUNK_SIZE)
            if not chunk:
                break
            yield chunk


if __name__ == "__main__":
    test_bucket_name = "test-bucket"
    filepath = "train.jsonl"
    blob_name = "train.jsonl"
    os.environ["STORAGE_EMULATOR_HOST"] = "http://gcs:9023"

    client = storage.Client()
    bucket = client.bucket(test_bucket_name)
    blob = bucket.blob(blob_name)

    with blob.open("wb", chunk_size=CHUNK_SIZE) as blob_writer:
        for piece in chunk_file(filepath):
            blob_writer.write(piece)

    for b in bucket.list_blobs():
        print(b.name)
Traceback (most recent call last):
  File "/app/main.py", line 33, in <module>
    blob_writer.write(piece)
  File "/root/.cache/pypoetry/virtualenvs/pythonproject-9TtSrW0h-py3.11/lib/python3.11/site-packages/google/cloud/storage/fileio.py", line 357, in write
    self._upload_chunks_from_buffer(num_chunks)
  File "/root/.cache/pypoetry/virtualenvs/pythonproject-9TtSrW0h-py3.11/lib/python3.11/site-packages/google/cloud/storage/fileio.py", line 417, in _upload_chunks_from_buffer
    upload.transmit_next_chunk(transport, **kwargs)
  File "/root/.cache/pypoetry/virtualenvs/pythonproject-9TtSrW0h-py3.11/lib/python3.11/site-packages/google/resumable_media/requests/upload.py", line 503, in transmit_next_chunk
    method, url, payload, headers = self._prepare_request()
                                    ^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.cache/pypoetry/virtualenvs/pythonproject-9TtSrW0h-py3.11/lib/python3.11/site-packages/google/resumable_media/_upload.py", line 611, in _prepare_request
    raise ValueError("Upload has finished.")
ValueError: Upload has finished.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/app/main.py", line 31, in <module>
    with blob.open("wb", chunk_size=CHUNK_SIZE) as blob_writer:
  File "/root/.cache/pypoetry/virtualenvs/pythonproject-9TtSrW0h-py3.11/lib/python3.11/site-packages/google/cloud/storage/fileio.py", line 437, in close
    self._upload_chunks_from_buffer(1)
  File "/root/.cache/pypoetry/virtualenvs/pythonproject-9TtSrW0h-py3.11/lib/python3.11/site-packages/google/cloud/storage/fileio.py", line 417, in _upload_chunks_from_buffer
    upload.transmit_next_chunk(transport, **kwargs)
  File "/root/.cache/pypoetry/virtualenvs/pythonproject-9TtSrW0h-py3.11/lib/python3.11/site-packages/google/resumable_media/requests/upload.py", line 503, in transmit_next_chunk
    method, url, payload, headers = self._prepare_request()
                                    ^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.cache/pypoetry/virtualenvs/pythonproject-9TtSrW0h-py3.11/lib/python3.11/site-packages/google/resumable_media/_upload.py", line 611, in _prepare_request
    raise ValueError("Upload has finished.")
ValueError: Upload has finished.
To Reproduce
Reproducing this error requires downloading a sample file,
so I wrapped the script in a docker-compose setup.
I hope this makes the issue easier to reproduce:
https://github.com/devjunhong/large-file-issue
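If downloading the sample file is inconvenient, a locally generated file might serve as a stand-in. The sketch below is only an assumption: `make_sample_file` is a hypothetical helper that is not part of the linked repository, and it has not been confirmed that a synthetic file triggers the same ValueError. It writes a multi-chunk `train.jsonl` into a temporary directory and runs it through the same `chunk_file` generator as the script above, without contacting the emulator.

```python
import json
import tempfile
from pathlib import Path
from typing import Generator

CHUNK_SIZE = 1 * 1024 * 1024  # 1 MB, same value as in the script above


def make_sample_file(path: Path, target_bytes: int) -> int:
    """Hypothetical helper: write JSON lines until at least target_bytes are written.

    Returns the number of bytes actually written.
    """
    written = 0
    index = 0
    with path.open("wb") as f:
        while written < target_bytes:
            record = {"id": index, "text": "x" * 200}
            line = (json.dumps(record) + "\n").encode("utf-8")
            f.write(line)
            written += len(line)
            index += 1
    return written


def chunk_file(file_full_path: str) -> Generator[bytes, None, None]:
    """Yield the file contents in CHUNK_SIZE pieces (same logic as the script above)."""
    with Path(file_full_path).open("rb") as f:
        while True:
            chunk = f.read(CHUNK_SIZE)
            if not chunk:
                break
            yield chunk


# Create a file slightly larger than three chunks so the last chunk is partial.
sample_path = Path(tempfile.mkdtemp()) / "train.jsonl"
size = make_sample_file(sample_path, 3 * CHUNK_SIZE + 123)
chunks = list(chunk_file(str(sample_path)))
print(f"{size} bytes split into {len(chunks)} chunks")
```

Pointing `filepath` in the original script at `sample_path` would then exercise the same multi-chunk upload path against the emulator.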
Expected behavior
It should finish uploading without an error.
System (please complete the following information)
- OS version: macOS 15.1
- Python version: 3.11.10
- gcp-storage-emulator version: v2024.08.03