Skip to content

Add Idempotency-Key header support for upload creation requests#1366

Open
m7kvqbe1 wants to merge 9 commits intotus:mainfrom
m7kvqbe1:idempotency-key
Open

Add Idempotency-Key header support for upload creation requests#1366
m7kvqbe1 wants to merge 9 commits intotus:mainfrom
m7kvqbe1:idempotency-key

Conversation

@m7kvqbe1
Copy link
Copy Markdown

Summary

  • Add support for the Idempotency-Key header on upload creation (POST) requests, allowing clients to safely retry without creating duplicate uploads
  • Introduce an IdempotencyKeyStore interface with two implementations: file-based (for disk storage backends, persists across restarts) and in-memory (for cloud storage backends)
  • Handle interrupted concatenation state: retry if offset is 0, return existing if complete, error if corrupted (0 < offset < size)
  • Wire up the appropriate store automatically in the CLI based on the active storage backend

Motivation

Closes #1364

Upload creation requests (particularly concatenation) are not currently idempotent. When a concat request is interrupted (e.g. by a WAF timeout), the client never receives the response and retries, creating a duplicate final upload.

This implements the approach suggested by @Acconut in #1364: a general-purpose Idempotency-Key header that works for all upload creation requests, not just concatenation. For the filestore backend, key-to-upload-ID mappings are persisted to disk alongside upload data (as .idempotency-key files), so they survive server restarts.

See also #1365 for a previous concat-only approach.

Design

New interface (IdempotencyKeyStore) added to StoreComposer, following the same pattern as Locker:

type IdempotencyKeyStore interface {
    FindUploadID(ctx context.Context, key string) (string, error)
    StoreUploadID(ctx context.Context, key string, uploadID string) error
}

Two implementations shipped:

Backend Store Durability
filestore fileidempotencystore (disk) Survives restarts
S3/GCS/Azure memoryidempotencystore (memory) Within process lifetime

Handler logic (in PostFile):

  1. Read Idempotency-Key header
  2. If store configured and key present: look up existing upload
  3. If found and complete: return existing (201 + Location)
  4. If found and incomplete concat (offset==0): retry concatenation
  5. If found and corrupted concat (0 < offset < size): return 500
  6. If found and non-concat: return existing upload info for PATCH resume
  7. If not found: create normally, store key->ID mapping

Opt-out: --disable-idempotency CLI flag.

Test plan

  • TestIdempotency/ConcatRetryComplete -- retried concat with already-complete upload returns existing
  • TestIdempotency/ConcatRetryIncomplete -- retried concat with offset==0 retries the concatenation
  • TestIdempotency/ConcatRetryCorrupted -- retried concat with partial data returns error
  • TestIdempotency/RegularUploadRetry -- retried regular upload returns existing with offset
  • TestIdempotency/NoStoreConfigured -- header ignored when no store configured
  • TestIdempotency/NoHeaderSent -- normal flow when no header
  • TestIdempotency/DeletedUploadFallsThrough -- stale mapping to deleted upload creates new upload
  • TestFileIdempotencyStore -- file-based store unit tests
  • All existing handler tests continue to pass
  • Full go test ./pkg/... passes

Introduce the IdempotencyKeyStore interface for persisting mappings from
client-provided Idempotency-Key header values to upload IDs, enabling
detection of retried upload creation requests.

Add the interface to StoreComposer following the same pattern as Locker,
with UseIdempotencyKeyStore() and Capabilities() support.

Ref: https://www.ietf.org/archive/id/draft-ietf-httpapi-idempotency-key-header-07.html
Persist idempotency key to upload ID mappings as JSON files on disk,
using {sha256(key)}.idempotency-key filenames. Stores the full original
key inside the file to guard against hash collisions.

Follows the filelocker pattern: New(path), UseIn(composer), and the
same directory can be shared with upload data.
In-memory store for cloud storage backends (S3, GCS, Azure) where no
local disk is available. Mappings are lost on process restart but still
protect against duplicate uploads within a single server lifetime.
When a client sends an Idempotency-Key header with a POST request and
an IdempotencyKeyStore is configured, the handler detects retried upload
creation requests:

- If the key maps to a completed concat upload: return it immediately
- If the key maps to an incomplete concat (offset==0): retry concat
- If the key maps to a corrupted concat (0 < offset < size): return error
- If the key maps to a non-concat upload: return it for PATCH resume
- If no mapping exists: create normally and store the key->ID mapping

Also adds Idempotency-Key to CORS AllowHeaders and updates CORS tests.
Configure the appropriate IdempotencyKeyStore for each storage backend:
- filestore: file-based store (persists to same directory as uploads)
- S3/GCS/Azure: in-memory store (lost on restart, but protects within
  a single server lifetime)

Add --disable-idempotency flag to opt out of Idempotency-Key support.
Cover the key scenarios:
- Concat retry when already complete (returns existing)
- Concat retry when incomplete at offset 0 (retries concat)
- Concat retry when corrupted (returns error)
- Regular upload retry (returns existing for PATCH resume)
- No idempotency store configured (header ignored)
- No header sent (normal flow)
- Deleted upload falls through to new creation
@Murderlon
Copy link
Copy Markdown

Review by Devin: https://app.devin.ai/review/tus/tusd/pull/1366

os.WriteFile uses O_TRUNC, so a crash between truncation and completed
write leaves an empty or partial JSON file. FindUploadID then returns a
raw json.Unmarshal error, which the handler treats as a hard 500 —
permanently breaking that idempotency key.

Fix both the cause and the symptom:
- StoreUploadID now writes to a .tmp file then renames atomically
- FindUploadID treats corrupted JSON as ErrNotFound (cache miss) so the
  handler falls through to create a new upload and overwrites the file
emitFinishEvents increments UploadsFinished metrics and sends to
CompleteUploads on every call. For an idempotent replay of an
already-completed concat upload, hooks and metrics should not fire
again — consistent with the non-concat replay path which correctly
skips them.

Only run PreFinishResponseCallback so callers can still inject
response headers on replay. Update the ConcatRetryComplete test to
no longer expect a spurious CompleteUploads event.
Add DirModePerm and FileModePerm fields to FileIdempotencyStore,
matching the filestore pattern. Fall back to 0775/0664 defaults when
zero. Wire the CLI flags through so .idempotency-key files are created
with the same permissions as upload data files.
@m7kvqbe1
Copy link
Copy Markdown
Author

Review by Devin: https://app.devin.ai/review/tus/tusd/pull/1366

Thanks @Murderlon.

I've added some fixups to this PR based on Devin's review:

@Murderlon
Copy link
Copy Markdown

cc @Acconut

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Concatenation operation is not idempotent, causing duplicate uploads on retried requests

2 participants