Skip to content

Add catalist match download retry logic#1853

Open
sharinetmc wants to merge 2 commits into
ian/patch/catalist-zip-operationfrom
add-catalist=match-download-retry
Open

Add catalist match download retry logic#1853
sharinetmc wants to merge 2 commits into
ian/patch/catalist-zip-operationfrom
add-catalist=match-download-retry

Conversation

@sharinetmc
Copy link
Copy Markdown
Contributor

What is this change?

  • Add catalist match download retry logic

Considerations for discussion

  • (List out any significant design decisions that were made and why.)

How to test the changes (if needed)

  • (How should a reviewer test this functionality.)

Breaking Changes

Breaking changes are changes to our public API which may require existing users to change their code. If there are no breaking changes, any existing parsons user should not need to do anything after updating their parsons version.

Does this PR introduce breaking changes?
  • label: Breaking change — This PR introduces one or more breaking changes.
  • label: Non-breaking change — This PR does not introduce one or more breaking changes.

Details (if needed)

  • (List out any changes to the API that may cause breaks for developer implementation.)

Copy link
Copy Markdown
Contributor

@IanRFerguson IanRFerguson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a code question and a context question

in terms of the code, it feels like we could wrap the function with @backoff.expo and achieve a similar outcome

in terms of context, this seems fine I'm just wondering if we have reason to believe that a retry is going to solve this Catalist issue. probably can't hurt I just want to understand the intent here

Comment on lines +342 to +357
def fetch_zip_with_retry(
self, remote_path: str, job_id: str, max_retries: int = 3, backoff_factor: int = 2
):
last_exception = None

for attempt in range(1, max_retries + 1):
try:
# 1. Attempt the download
temp_file_zip = self.sftp.get_file(
remote_path=remote_path, export_chunk_size=DEFAULT_EXPORT_CHUNK_SIZE
)

# 2. Validate the integrity of the file
if not is_zipfile(temp_file_zip):
size = temp_file_zip.stat().st_size
raise RuntimeError(f"Corrupt zip (Size: {size} bytes)")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curious why we don't just decorate this function with @backfoff, the retry logic feels kind of manual

Comment on lines +350 to +352
temp_file_zip = self.sftp.get_file(
remote_path=remote_path, export_chunk_size=DEFAULT_EXPORT_CHUNK_SIZE
)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will solve your test failure

Suggested change
temp_file_zip = self.sftp.get_file(
remote_path=remote_path, export_chunk_size=DEFAULT_EXPORT_CHUNK_SIZE
)
temp_file_zip = Path(
self.sftp.get_file(
remote_path=remote_path,
export_chunk_size=DEFAULT_EXPORT_CHUNK_SIZE
)
)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

python Pull requests that update Python code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants