Description
Hello,
Recently I started using the cloudpathlib library to upload entire directories to S3, and I noticed a high error rate on S3 (according to AWS monitoring) with the following errors:
botocore.errorfactory.NoSuchKey: An error occurred (NoSuchKey) when calling the GetObject operation: The specified key does not exist.
botocore.exceptions.ClientError: An error occurred (404) when calling the HeadObject operation: Not Found
After a deep investigation, I found that these errors come from the cloudpath.py::upload_from method: before uploading each file, the library checks whether the destination already exists (if self.exists() and self.is_dir()):
    def upload_from(
        self, source: Union[str, os.PathLike], force_overwrite_to_cloud: bool = False
    ) -> "CloudPath":
        """Upload a file or directory to the cloud path."""
        source = Path(source)
        if source.is_dir():
            for p in source.iterdir():
                (self / p.name).upload_from(p, force_overwrite_to_cloud=force_overwrite_to_cloud)
            return self
        else:
            if self.exists() and self.is_dir():  # <-- the check in question
                dst = self / source.name
            else:
                dst = self
            dst._upload_file_to_cloud(source, force_overwrite_to_cloud=force_overwrite_to_cloud)
            return dst
My question is:
Why do we need this check? It seems redundant, because self.is_dir() always returns False.
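
To illustrate the pattern for the recursive directory case, here is a minimal sketch that mirrors the shape of upload_from with an in-memory stand-in. FakeCloudPath, its store, the exists_calls counter, and count_checks are all hypothetical names made up for this illustration; this is not the library's real class and it never talks to S3. It only counts how many existence checks are issued when a fresh directory is uploaded.

```python
import tempfile
from pathlib import Path


class FakeCloudPath:
    """Hypothetical in-memory stand-in for a cloud path (not the real library class)."""

    store = set()      # keys that "exist" in the fake bucket
    exists_calls = 0   # number of existence checks issued so far

    def __init__(self, key):
        self.key = key.rstrip("/")

    def __truediv__(self, name):
        return FakeCloudPath(f"{self.key}/{name}")

    def exists(self):
        # In the real setup, this is where a request would go out to S3.
        FakeCloudPath.exists_calls += 1
        prefix = self.key + "/"
        return self.key in self.store or any(k.startswith(prefix) for k in self.store)

    def is_dir(self):
        prefix = self.key + "/"
        return any(k.startswith(prefix) for k in self.store)

    def upload_from(self, source):
        source = Path(source)
        if source.is_dir():
            for p in source.iterdir():
                (self / p.name).upload_from(p)
            return self
        # Same shape as the check quoted above.
        dst = self / source.name if self.exists() and self.is_dir() else self
        FakeCloudPath.store.add(dst.key)
        return dst


def count_checks(n_files):
    """Upload a fresh directory of n_files files; return (exists() calls, stored keys)."""
    FakeCloudPath.store = set()
    FakeCloudPath.exists_calls = 0
    with tempfile.TemporaryDirectory() as tmp:
        for i in range(n_files):
            (Path(tmp) / f"file_{i}.txt").write_text("data")
        FakeCloudPath("bucket/prefix").upload_from(tmp)
    return FakeCloudPath.exists_calls, sorted(FakeCloudPath.store)


if __name__ == "__main__":
    calls, keys = count_checks(3)
    print(calls, keys)
```

In this sketch every file triggers one existence check against a key that has not been uploaded yet, so the check always misses, which matches the pattern of 404 errors described above.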