Remove unnecessary rclone ls call #786
base: qa/0.x
For reference, the rclone mkdir docs
This addresses archivematica/Issues#1743
subprocess.assert_called_with(["mkdir", "testremote:testcontainer"])
args, _ = subprocess.Popen.call_args
assert args[0] == ["rclone", "mkdir", "testremote:testcontainer"]
I moved this assertion out of the with block because a with pytest.raises() block exits as soon as the exception is raised, so any assertion placed after the raising call inside the block never ran. I also updated the assertion to validate the call I think we intend to validate here.
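The behaviour described in this comment can be reproduced in isolation. The sketch below is not the project's test: `create_container` and `demo` are hypothetical names, and it uses the standard library's `assertRaises` context manager in place of `pytest.raises` so that it runs without pytest installed. Both are ordinary context managers, so the body of the `with` block is abandoned at the statement that raises.

```python
import unittest
from unittest import mock


def create_container(run):
    """Hypothetical code under test: issue the command, then fail."""
    run(["rclone", "mkdir", "testremote:testcontainer"])
    raise RuntimeError("rclone returned non-zero return code")


def demo():
    case = unittest.TestCase()
    run = mock.Mock()
    reached = []

    with case.assertRaises(RuntimeError):
        create_container(run)
        # Never executed: the exception above has already left the block.
        reached.append("inside")

    # Executed: the block has exited and the mock still holds the call.
    reached.append("outside")
    args, _ = run.call_args
    assert args[0] == ["rclone", "mkdir", "testremote:testcontainer"]
    return reached
```

Calling `demo()` shows that only the code after the block runs, which is why an assertion meant to check the recorded call belongs outside the `with` block.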
@liam-lloyd I noticed the linting job failed. If it's of any help, in the Archivematica-related repositories we have configuration files for pre-commit. So once you install
ceaaec5 to c6ea564
c6ea564 to 03d3807
In the rclone model's _ensure_container_exists method, rclone ls is used to check whether a container exists on an rclone remote. If not, rclone mkdir is used to create it. However, rclone mkdir returns success without making any changes if the directory already exists, so calling rclone ls and then rclone mkdir if the container doesn't exist is equivalent to just calling rclone mkdir. Further, rclone ls can take quite some time to respond if the remote has a large number of items in its root directory. This commit removes the ls call and updates _ensure_container_exists to simply call mkdir in all cases.
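As a rough illustration of the change described in this commit message, the two flows can be sketched side by side. This is a minimal sketch and not the storage service's code: the function names, the `runner` parameter, and the way the old flow interprets a failed `ls` are all assumptions made for the example. Only the `rclone ls` and `rclone mkdir` subcommands come from the description.

```python
import subprocess


def run_rclone(args, runner=subprocess.run):
    """Run an rclone subcommand; raise if it exits with a non-zero code."""
    result = runner(["rclone", *args], capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(f"rclone failed: {result.stderr}")
    return result.stdout


def ensure_container_exists_with_ls(remote, container, runner=subprocess.run):
    """Old flow (assumed shape): list first, create only if listing fails."""
    target = f"{remote}:{container}"
    try:
        run_rclone(["ls", target], runner)
    except RuntimeError:
        run_rclone(["mkdir", target], runner)


def ensure_container_exists(remote, container, runner=subprocess.run):
    """New flow: always call mkdir and rely on it succeeding if the container exists."""
    run_rclone(["mkdir", f"{remote}:{container}"], runner)
```

The `runner` argument exists only so the sketch can be exercised with a fake in place of `subprocess.run`. Note that the simplified flow depends on `mkdir` being permitted on the remote, which is the point raised later in this conversation.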
03d3807 to 03f8ada
@liam-lloyd During testing we found that this change might break transfer source locations with the following conditions:
I've created a new integration test using MinIO to demonstrate this scenario. Let me break it down:
The test passes in that pull request, and if I cherry-pick your commit on top of the branch it fails with:
DEBUG archivematica.storage_service.locations.models.rclone:rclone.py:117 rclone remote selected: mys3:
DEBUG archivematica.storage_service.locations.models.rclone:rclone.py:74 rclone cmd: ['rclone', 'mkdir', 'mys3:storage-service-browse-935398cc88774cefa9b615dc1565a17f']
DEBUG archivematica.storage_service.locations.models.rclone:rclone.py:75 rclone stdout:
WARNING archivematica.storage_service.locations.models.rclone:rclone.py:77 rclone stderr: 2025/10/17 19:43:15 NOTICE: Config file "/var/archivematica/.config/rclone/rclone.conf" not found - using defaults
2025/10/17 19:43:15 ERROR : Attempt 1/3 failed with 1 errors and: AccessDenied: Access Denied.
status code: 403, request id: 186F75B4177B0870, host id: dd9025bab4ad464b049177c95eb6ebf374d3b3fd1af9251148b658df7ac2e3e8
2025/10/17 19:43:15 ERROR : Attempt 2/3 failed with 1 errors and: AccessDenied: Access Denied.
status code: 403, request id: 186F75B4178FA58E, host id: dd9025bab4ad464b049177c95eb6ebf374d3b3fd1af9251148b658df7ac2e3e8
2025/10/17 19:43:15 ERROR : Attempt 3/3 failed with 1 errors and: AccessDenied: Access Denied.
status code: 403, request id: 186F75B4179D7F65, host id: dd9025bab4ad464b049177c95eb6ebf374d3b3fd1af9251148b658df7ac2e3e8
2025/10/17 19:43:15 Failed to mkdir: AccessDenied: Access Denied.
status code: 403, request id: 186F75B4179D7F65, host id: dd9025bab4ad464b049177c95eb6ebf374d3b3fd1af9251148b658df7ac2e3e8
ERROR archivematica.storage_service.locations.models.rclone:rclone.py:106 Error running rclone command. Command called: ['rclone', 'mkdir', 'mys3:storage-service-browse-935398cc88774cefa9b615dc1565a17f']. Details: ('rclone returned non-zero return code: %s. stderr: %s', 1, '2025/10/17 19:43:15 NOTICE: Config file "/var/archivematica/.config/rclone/rclone.conf" not found - using defaults\n2025/10/17 19:43:15 ERROR : Attempt 1/3 failed with 1 errors and: AccessDenied: Access Denied.\n\tstatus code: 403, request id: 186F75B4177B0870, host id: dd9025bab4ad464b049177c95eb6ebf374d3b3fd1af9251148b658df7ac2e3e8\n2025/10/17 19:43:15 ERROR : Attempt 2/3 failed with 1 errors and: AccessDenied: Access Denied.\n\tstatus code: 403, request id: 186F75B4178FA58E, host id: dd9025bab4ad464b049177c95eb6ebf374d3b3fd1af9251148b658df7ac2e3e8\n2025/10/17 19:43:15 ERROR : Attempt 3/3 failed with 1 errors and: AccessDenied: Access Denied.\n\tstatus code: 403, request id: 186F75B4179D7F65, host id: dd9025bab4ad464b049177c95eb6ebf374d3b3fd1af9251148b658df7ac2e3e8\n2025/10/17 19:43:15 Failed to mkdir: AccessDenied: Access Denied.\n\tstatus code: 403, request id: 186F75B4179D7F65, host id: dd9025bab4ad464b049177c95eb6ebf374d3b3fd1af9251148b658df7ac2e3e8\n')
ERROR archivematica.storage_service.locations.models.rclone:rclone.py:132 Unable to find or create container mys3:storage-service-browse-935398cc88774cefa9b615dc1565a17f
ERROR django.request:log.py:246 Internal Server Error: /api/v2/location/a970ada2-1afc-49b2-994c-5146cf649f43/browse/
Traceback (most recent call last):
File "/src/src/archivematica/storage_service/locations/models/rclone.py", line 94, in _execute_rclone_subcommand
raise StorageException(
archivematica.storage_service.locations.models.StorageException: ('rclone returned non-zero return code: %s. stderr: %s', 1, '2025/10/17 19:43:15 NOTICE: Config file "/var/archivematica/.config/rclone/rclone.conf" not found - using defaults\n2025/10/17 19:43:15 ERROR : Attempt 1/3 failed with 1 errors and: AccessDenied: Access Denied.\n\tstatus code: 403, request id: 186F75B4177B0870, host id: dd9025bab4ad464b049177c95eb6ebf374d3b3fd1af9251148b658df7ac2e3e8\n2025/10/17 19:43:15 ERROR : Attempt 2/3 failed with 1 errors and: AccessDenied: Access Denied.\n\tstatus code: 403, request id: 186F75B4178FA58E, host id: dd9025bab4ad464b049177c95eb6ebf374d3b3fd1af9251148b658df7ac2e3e8\n2025/10/17 19:43:15 ERROR : Attempt 3/3 failed with 1 errors and: AccessDenied: Access Denied.\n\tstatus code: 403, request id: 186F75B4179D7F65, host id: dd9025bab4ad464b049177c95eb6ebf374d3b3fd1af9251148b658df7ac2e3e8\n2025/10/17 19:43:15 Failed to mkdir: AccessDenied: Access Denied.\n\tstatus code: 403, request id: 186F75B4179D7F65, host id: dd9025bab4ad464b049177c95eb6ebf374d3b3fd1af9251148b658df7ac2e3e8\n')
@mamedin and I have been testing other approaches to check if the container exists, like the lsjson command passing a