[RFR] S3 bugfixes #2329

Open. Wants to merge 8 commits into master.

@npow (Contributor) commented Mar 3, 2025

This PR fixes several issues that caused s3op to hang:

  1. Call queue.cancel_join_thread() so that workers can exit without flushing the queue; otherwise they deadlock.
  2. Catch the exceptions that download/upload actually raise (they do not raise ClientError).
  3. Handle InternalError.
  4. Handle SSLError.
  5. Optimistically treat all other unhandled exceptions as transient.
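
To illustrate the first item, here is a minimal sketch of where `cancel_join_thread()` goes in a worker that reports results over a `multiprocessing.Queue`. The `worker` function and its arguments are hypothetical and are not taken from s3op.py; only `Queue.cancel_join_thread()` and `Queue.put()` are real standard-library calls.

```python
import multiprocessing


def worker(result_queue, payloads):
    """Hypothetical worker body: report results, then allow a prompt exit."""
    # A process that has put items on a multiprocessing.Queue normally
    # waits at exit until its background feeder thread has flushed every
    # buffered item into the underlying pipe. If the consumer has stopped
    # reading, that wait never finishes and the process appears stuck.
    # cancel_join_thread() tells the process not to wait at exit.
    # The trade-off: items that were not flushed yet may be lost.
    result_queue.cancel_join_thread()
    for payload in payloads:
        result_queue.put(payload)
```

In a real setup this function would be the `target` of a `multiprocessing.Process`. While the consumer keeps reading, items still arrive as usual; the call only changes what happens when the process exits with unflushed data.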

Additional improvements:

  1. Add exponential backoff to jitter_sleep().
  2. Set the default retry config in s3op.py to match the one in aws_client.py.
  3. Fix a bug where the retry setting was not applied properly when the config was missing.
  4. Fail early on fatal errors.
  5. Don't restart from scratch when there is no progress.
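
For readers unfamiliar with the pattern, the sketch below shows one common way to combine exponential backoff with jitter and to fail early on fatal errors. It is not the code in this PR: `backoff_delay`, `retry`, `is_fatal`, and the default values for `base`, `cap`, and `max_attempts` are all assumptions chosen for illustration, and the actual signature of `jitter_sleep()` may differ.

```python
import random
import time


def backoff_delay(attempt, base=0.5, cap=30.0, rng=random):
    """Return a randomized delay in seconds for a zero-based retry attempt.

    The upper bound doubles with each attempt and is clamped to `cap`;
    the delay is drawn uniformly between 0 and that bound ("full jitter").
    """
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0, ceiling)


def retry(operation, is_fatal, max_attempts=5, sleep=time.sleep):
    """Call `operation` until it succeeds or attempts run out.

    `is_fatal(exc)` is a hypothetical predicate: when it returns True the
    error is re-raised immediately instead of being retried. Every other
    exception is optimistically treated as transient.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception as exc:
            # Fail early on fatal errors and on the final attempt.
            if is_fatal(exc) or attempt == max_attempts - 1:
                raise
            sleep(backoff_delay(attempt))
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; production code would use the default `time.sleep`.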

@npow npow requested review from savingoyal and romain-intel March 3, 2025 22:01
@npow npow added the ok-to-test label Mar 3, 2025
@npow npow added ok-to-test and removed ok-to-test labels Mar 4, 2025
@npow npow added ok-to-test and removed ok-to-test labels Mar 4, 2025
@npow npow added ok-to-test and removed ok-to-test labels Mar 5, 2025
@npow npow changed the title S3 bugfixes [RFR] S3 bugfixes Mar 5, 2025
1 participant