Skip to content

fix: improve file upload resilience#1476

Open
dbolotin wants to merge 1 commit intomainfrom
fix/file-upload-resilience
Open

fix: improve file upload resilience#1476
dbolotin wants to merge 1 commit intomainfrom
fix/file-upload-resilience

Conversation

@dbolotin
Copy link
Member

@dbolotin dbolotin commented Mar 2, 2026

Summary

  • Treat S3 ExpiredToken errors as recoverable during file uploads. When STS session credentials expire during long multipart uploads (800+ chunks), S3 returns HTTP 400 with ExpiredToken. Previously classified as non-recoverable BadRequestError, permanently killing the upload. Now classified as recoverable NetworkError, allowing the retry loop to request a fresh pre-signed URL.

Context

Large file uploads that exceed the STS token lifetime (1-12h depending on AWS config) would fail terminally at an arbitrary chunk with no way to recover. The fix parses the S3 error XML body to distinguish ExpiredToken from other 400 errors.

Test plan

  • Upload a large file (800+ chunks) and verify upload completes successfully
  • Verify that genuine 400 errors (malformed requests) still fail as non-recoverable
  • Verify retry behavior: after ExpiredToken, the next attempt gets a fresh pre-signed URL from the backend

When STS session credentials expire during long multipart uploads,
S3 returns HTTP 400 with ExpiredToken. Previously this was classified
as a non-recoverable BadRequestError, killing the upload permanently.
Now it's classified as a recoverable NetworkError, allowing the retry
loop to request a fresh pre-signed URL from the backend.
@changeset-bot
Copy link

changeset-bot bot commented Mar 2, 2026

⚠️ No Changeset found

Latest commit: 494c966

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves the resilience of large file uploads by specifically addressing 'ExpiredToken' errors returned by S3. Previously, these errors would cause uploads to fail permanently, but now they are correctly identified and treated as temporary network issues, enabling the system to refresh credentials and successfully complete the upload process.

Highlights

  • S3 ExpiredToken Handling: Modified file upload logic to classify S3 'ExpiredToken' errors (HTTP 400) as 'NetworkError' instead of 'BadRequestError'. This change makes these errors recoverable during long multipart uploads, allowing the retry mechanism to request fresh pre-signed URLs and continue the upload.
Changelog
  • lib/node/pl-drivers/src/clients/upload.ts
    • Introduced a new helper function, 'isExpiredTokenError', to parse the S3 response body and detect the 'ExpiredToken' code.
    • Modified the 'checkStatusCodeOk' function to differentiate between a generic HTTP 400 'BadRequestError' and an 'ExpiredToken' error. The latter is now reclassified as a 'NetworkError' to trigger retry logic.
    • Refactored the error message construction within 'checkStatusCodeOk' to be more concise and avoid repetition.
Activity
  • No human activity has been recorded on this pull request yet.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request correctly improves file upload resilience by treating S3 ExpiredToken errors as recoverable. However, a critical security concern has been identified: the inclusion of full pre-signed URLs in error messages can leak sensitive authentication tokens into application logs. It is recommended to sanitize these URLs in error message construction. Additionally, while the approach of checking the response body for the specific error code is sound, there is a suggestion to make this check more robust.

Comment on lines +329 to +331
const message =
`response is not ok, status code: ${statusCode},` +
` body: ${body}, headers: ${JSON.stringify(headers)}, url: ${info.uploadUrl}`;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-medium medium

The error message includes the full info.uploadUrl, which is a pre-signed URL containing sensitive authentication information in its query parameters (e.g., X-Amz-Signature, X-Amz-Security-Token). Logging these secrets can lead to unauthorized access if logs are compromised. Consider sanitizing the URL by removing the query string before including it in the error message.

Suggested change
const message =
`response is not ok, status code: ${statusCode},` +
` body: ${body}, headers: ${JSON.stringify(headers)}, url: ${info.uploadUrl}`;
const message =
`response is not ok, status code: ${statusCode},` +
` body: ${body}, headers: ${JSON.stringify(headers)}, url: ${info.uploadUrl.split('?')[0]}`;

}

function isExpiredTokenError(body: string): boolean {
return body.includes("<Code>ExpiredToken</Code>");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using String.prototype.includes() for this check is a bit fragile because it performs a simple substring search. A regular expression is more robust as it can match the specific XML tag structure, making the check less prone to accidental matches and more resilient to potential (though unlikely) whitespace variations in the S3 error response.

Suggested change
return body.includes("<Code>ExpiredToken</Code>");
return /<Code>ExpiredToken<\/Code>/.test(body);

@codecov
Copy link

codecov bot commented Mar 2, 2026

Codecov Report

❌ Patch coverage is 16.66667% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 54.36%. Comparing base (69ceb89) to head (494c966).
⚠️ Report is 2 commits behind head on main.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
lib/node/pl-drivers/src/clients/upload.ts 16.66% 5 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1476      +/-   ##
==========================================
- Coverage   54.47%   54.36%   -0.11%     
==========================================
  Files         242      242              
  Lines       13481    13475       -6     
  Branches     2759     2760       +1     
==========================================
- Hits         7344     7326      -18     
- Misses       5219     5228       +9     
- Partials      918      921       +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@dbolotin dbolotin enabled auto-merge March 3, 2026 17:08
@dbolotin dbolotin added this pull request to the merge queue Mar 3, 2026
github-merge-queue bot pushed a commit that referenced this pull request Mar 3, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Mar 3, 2026
@dbolotin dbolotin added this pull request to the merge queue Mar 4, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Mar 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants