S3 storage — developer guide

This is the consolidated developer reference for S3 storage in Transferts. The inline comments in the code are the source of truth for individual mechanisms — this doc gives you the map and the rationale, then points back to the code.

Audience: developers maintaining the backend. If you only want to operate a deployment (IAM, environment variables, dashboards), see the Operational gotchas section.

Table of contents

  1. The model — who talks to whom
  2. Cleanup — what triggers it and how the API handles it
  3. Daily orphan sweep
  4. Operational gotchas
  5. Tests and fixtures

The model — who talks to whom

Two top-level entities own files:

  • TransferDraft — ephemeral upload session. No public-facing metadata (title, recipients…), just a basket of files-in-transit. Lives from the first file drop until the user clicks "Create link" (finalize) or walks away (abort / cleanup).
  • Transfer — the live, shareable transfer. Has metadata, a public token, recipients, expiry. Public download endpoints serve files from here.

TransferFile is attached to exactly one of draft or transfer — enforced by the DB-level transferfile_exactly_one_parent check constraint (core/models.py, class Meta).

        ┌──────────────────┐                 ┌──────────────────┐
        │  TransferDraft   │   finalize      │     Transfer     │
        │  (ephemeral)     │ ─────────────►  │  (live, public)  │
        └────────┬─────────┘                 └─────────┬────────┘
                 │ 1                                   │ 1
                 │                                     │
                 │ 0..N (files in flight)              │ 0..N (live files)
                 ▼                                     ▼
         ┌──────────────────────────────────────────────────────┐
         │  TransferFile  (exactly one parent — DB constraint)  │
         │   s3_key, upload_id, upload_completed_at             │
         └────────────────────┬─────────────────────────────────┘
                              │
                              ▼
                     ┌─────────────────┐
                     │   S3 bucket     │
                     │  (MPU + objects)│
                     └─────────────────┘

The promotion from draft to transfer is a single UPDATE in finalize: the draft's TransferFile rows are reparented (draft=None, transfer=…) and the draft is deleted. The S3 keys do not embed the parent id — only the file id — so they remain valid across the reparenting (viewsets/draft.py::add_file, near the s3_key = … line).
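The constraint and the reparenting step can be illustrated with a minimal sketch using the standard-library sqlite3 module. This is not the project's Django model: the table layout, the column names, the `files/<id>` key format and the `finalize` helper are assumptions made for the example. Only the constraint name and the "one UPDATE, keys untouched" behaviour come from the text above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE transfer_file (
        id          INTEGER PRIMARY KEY,
        s3_key      TEXT NOT NULL,
        draft_id    INTEGER,
        transfer_id INTEGER,
        -- Exactly one parent: one column is NULL and the other is not.
        CONSTRAINT transferfile_exactly_one_parent
            CHECK ((draft_id IS NULL) <> (transfer_id IS NULL))
    );
    """
)

# Two files uploaded into draft 7. The key depends on the file id only
# (hypothetical key format).
conn.executemany(
    "INSERT INTO transfer_file (id, s3_key, draft_id) VALUES (?, ?, ?)",
    [(1, "files/1", 7), (2, "files/2", 7)],
)


def finalize(conn, draft_id, transfer_id):
    """Reparent every file of the draft in one UPDATE; s3_key is not touched."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE transfer_file SET draft_id = NULL, transfer_id = ? "
            "WHERE draft_id = ?",
            (transfer_id, draft_id),
        )
    return cur.rowcount


def violates_constraint(conn, draft_id, transfer_id):
    """Return True if inserting a row with these parents is rejected by the DB."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO transfer_file (s3_key, draft_id, transfer_id) "
                "VALUES ('files/x', ?, ?)",
                (draft_id, transfer_id),
            )
    except sqlite3.IntegrityError:
        return True
    return False


moved = finalize(conn, draft_id=7, transfer_id=42)
rows = conn.execute(
    "SELECT s3_key, draft_id, transfer_id FROM transfer_file ORDER BY id"
).fetchall()
```

Because the keys never mention the parent, nothing has to be copied or renamed in S3 when the rows move; a row with zero or two parents is rejected by the database itself rather than by application code.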

The S3 lifecycle of a single file

  add_file ─────────► CreateMultipartUpload ──► (MPU open, no parts yet)
                                  │
        sign_part / upload_part_bytes (×N, one per chunk)
                                  │
                                  ▼
  complete_upload ──► CompleteMultipartUpload ──► (MPU closed, object materialized)
                                  │
                                  ▼
                      upload_completed_at = now()
                                  │
                                  ▼
                    finalize → reparented to Transfer
                                  │
                                  ▼
                      expiry / deactivation
                                  │
                                  ▼
              deactivate(reason) → PENDING_FILE_DELETION
            (sets pending_deletion_at; object stays in S3)
                                  │
                                  ▼
             delete_pending_transfer_files_task (deferred)
                     → DeleteObject → DEACTIVATED

Deletion is deferred: expiry and deactivation (manual or first-download auto-archive) all funnel through Transfer.deactivate, which only flips the status to PENDING_FILE_DELETION and stamps pending_deletion_at = now + TRANSFER_PURGE_DELAY_HOURS. The bytes stay in S3 until the scheduled delete_pending_transfer_files_task runs past that deadline — the grace window lets recipients' in-flight downloads finish before the object disappears.
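The deferred-deletion flow can be sketched in plain Python as follows. The `Transfer` dataclass, the `"ACTIVE"` status name, the 24-hour value of `TRANSFER_PURGE_DELAY_HOURS` and the `delete_pending_transfer_files` signature are assumptions for illustration; the real model and task live in the codebase.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable, List, Optional

TRANSFER_PURGE_DELAY_HOURS = 24  # assumed value, for illustration only


@dataclass
class Transfer:
    s3_keys: List[str]
    status: str = "ACTIVE"  # hypothetical name for the live state
    pending_deletion_at: Optional[datetime] = None
    deactivation_reason: Optional[str] = None

    def deactivate(self, reason: str, now: datetime) -> None:
        """Flip the status and stamp the deadline; no S3 call happens here."""
        self.status = "PENDING_FILE_DELETION"
        self.deactivation_reason = reason
        self.pending_deletion_at = now + timedelta(hours=TRANSFER_PURGE_DELAY_HOURS)


def delete_pending_transfer_files(
    transfers: List[Transfer],
    now: datetime,
    delete_object: Callable[[str], None],
) -> int:
    """Delete the objects of every transfer whose grace window has elapsed."""
    purged = 0
    for transfer in transfers:
        if transfer.status != "PENDING_FILE_DELETION":
            continue
        if transfer.pending_deletion_at > now:
            continue  # still inside the grace window: leave the bytes alone
        for key in transfer.s3_keys:
            delete_object(key)
        transfer.status = "DEACTIVATED"
        purged += 1
    return purged


start = datetime(2025, 1, 1, tzinfo=timezone.utc)
transfer = Transfer(s3_keys=["files/1"])
transfer.deactivate("expired", now=start)
```

Running the task one hour after `deactivate` leaves the transfer untouched; running it at or after the stamped deadline deletes the objects and moves the status to DEACTIVATED.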

Cleanup — what triggers it and how the API handles it

Drafts and partial uploads leave S3 state behind. Cleanup is triggered both by user actions (abort, remove file, walking away) and by errors (DB or S3 failures mid-flight). This section walks through the service API used to handle them and the call sites that wire causes to cleanups.

The two-tier service API

All S3 calls go through core/services/s3.py. The module exposes two tiers, and the choice between them depends on whether the caller can react to a ClientError or just wants to keep going:

Tier | Functions | Behaviour on ClientError | When to use
--- | --- | --- | ---
Bare | abort_multipart_upload, delete_object | Raises to caller | Caller surfaces or reacts to failure
Best-effort | best_effort_abort_multipart_uploads_from_files, best_effort_delete_objects_from_files | Logs and swallows per item | Sweeping over a list where one bad row mustn't stop the rest

The _from_files suffix signals that the helper iterates a queryset/list of TransferFile rows and pulls s3_key / upload_id itself — call sites don't have to map.
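The difference between the two tiers can be sketched as below. The function names match the ones listed above, but their signatures (an explicit `client` argument, a failure count as return value), the `FileRow` dataclass, the `BUCKET` constant and the `FlakyClient` stub are assumptions for the example, not the actual contents of core/services/s3.py. The `ClientError` fallback only exists so the sketch runs without botocore installed.

```python
import logging
from dataclasses import dataclass
from typing import Iterable, Optional

try:
    from botocore.exceptions import ClientError
except ImportError:
    class ClientError(Exception):
        """Stand-in so the sketch runs without botocore installed."""

logger = logging.getLogger("s3_sketch")
BUCKET = "example-bucket"  # hypothetical bucket name


@dataclass
class FileRow:
    """Minimal stand-in for a TransferFile row."""
    s3_key: str
    upload_id: Optional[str] = None


def abort_multipart_upload(client, s3_key: str, upload_id: str) -> None:
    """Bare tier: any ClientError propagates to the caller."""
    client.abort_multipart_upload(Bucket=BUCKET, Key=s3_key, UploadId=upload_id)


def best_effort_abort_multipart_uploads_from_files(client, files: Iterable[FileRow]) -> int:
    """Best-effort tier: log and swallow per item, keep going; return the failure count."""
    failures = 0
    for row in files:
        if row.upload_id is None:
            continue  # upload already completed or never opened
        try:
            abort_multipart_upload(client, row.s3_key, row.upload_id)
        except ClientError as exc:
            logger.warning("abort failed for %s: %s", row.s3_key, exc)
            failures += 1
    return failures


class FlakyClient:
    """Test double exposing the one boto3 client method used here."""

    def __init__(self, bad_key: str):
        self.bad_key = bad_key
        self.aborted = []

    def abort_multipart_upload(self, Bucket, Key, UploadId):
        if Key == self.bad_key:
            raise ClientError(
                {"Error": {"Code": "InternalError", "Message": "boom"}},
                "AbortMultipartUpload",
            )
        self.aborted.append((Key, UploadId))


files = [FileRow("files/1", "u1"), FileRow("files/2", "u2"),
         FileRow("files/3", "u3"), FileRow("files/4")]
client = FlakyClient(bad_key="files/2")
failures = best_effort_abort_multipart_uploads_from_files(client, files)
```

The failing row is logged and counted, and the rows after it are still processed. Calling the bare function on the same key raises instead, which is what a caller needs when it wants to react to the failure.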

Call sites

Six places exercise this API. Each row tells you what triggers the cleanup, which tier is picked, and what the cleanup actually does:

Site | Tier | Cause | Cleanup action
--- | --- | --- | ---
add_file rollback (viewsets/draft.py) | bare | DB save fails after CreateMultipartUpload | Aborts the just-opened MPU (logs if the abort itself fails); re-raises the original DB error to the client.
complete_upload cleanup (viewsets/draft.py) | best-effort | CompleteMultipartUpload rejected, or size mismatch | Aborts MPUs and deletes objects for every file of the draft; deletes the draft row; raises the S3/size error after the atomic block (see flag-then-raise).
remove_file (viewsets/draft.py) | best-effort | User removes a single file | Aborts the file's MPU and deletes its object; deletes the file row; logs any S3 error but does not surface it (the daily orphan sweep catches leftovers).
abort (viewsets/draft.py) | best-effort | User explicitly aborts the draft | Aborts MPUs and deletes objects for every file of the draft; deletes the draft row; returns 204.
cleanup_abandoned_drafts_task (tasks.py) | best-effort | Draft >24 h old with no finalize/abort | Per draft: aborts MPUs and deletes objects for each file; deletes the draft row.
import_drive_file_task cleanup (tasks.py) | bare (logged) | S3 connection drops mid-stream (Drive) | Aborts the file's MPU and deletes its object (idempotent); deletes the file row; the frontend poller notices the row disappeared and surfaces a generic error.
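As an illustration of the add_file rollback row, here is a minimal sketch of "open the MPU, try to save, abort on failure, re-raise the original error". The `add_file` signature, the `save_row` callback and the `RecordingClient` stub are hypothetical; only `create_multipart_upload` and `abort_multipart_upload` are real boto3 client methods.

```python
import logging

try:
    from botocore.exceptions import ClientError
except ImportError:
    class ClientError(Exception):
        """Stand-in so the sketch runs without botocore installed."""

logger = logging.getLogger("add_file_sketch")


def add_file(client, bucket, s3_key, save_row):
    """Open an MPU, then persist the row; undo the MPU if the save fails."""
    upload_id = client.create_multipart_upload(Bucket=bucket, Key=s3_key)["UploadId"]
    try:
        save_row(s3_key, upload_id)
    except Exception:
        try:
            client.abort_multipart_upload(Bucket=bucket, Key=s3_key, UploadId=upload_id)
        except ClientError:
            # The abort failed too: log it and leave the MPU for the daily sweep.
            logger.exception("could not abort MPU %s for %s", upload_id, s3_key)
        raise  # re-raises the original save error, not the abort error
    return upload_id


class RecordingClient:
    """Test double that tracks which multipart uploads are still open."""

    def __init__(self):
        self.open_uploads = set()
        self.counter = 0

    def create_multipart_upload(self, Bucket, Key):
        self.counter += 1
        upload_id = f"upload-{self.counter}"
        self.open_uploads.add((Key, upload_id))
        return {"UploadId": upload_id}

    def abort_multipart_upload(self, Bucket, Key, UploadId):
        self.open_uploads.discard((Key, UploadId))


def failing_save(s3_key, upload_id):
    raise RuntimeError("database is down")


client = RecordingClient()
try:
    add_file(client, "example-bucket", "files/1", failing_save)
    outcome = "no error"
except RuntimeError as exc:
    outcome = str(exc)
```

The bare `raise` sits after the inner try/except, so it re-raises the save failure even when the abort also fails; the caller always sees the error that actually caused the rollback.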

Successful completion

complete_upload stamps upload_completed_at and clears upload_id. finalize then reparents every file from the draft to a fresh Transfer. The draft row is deleted.
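A sketch of the happy path for one file, assuming a dict-shaped row and an in-memory stand-in for the S3 client. `upload_and_complete`, the row fields other than s3_key / upload_id / upload_completed_at, and `InMemoryS3` are hypothetical. The keyword arguments follow the boto3 client methods `upload_part`, `complete_multipart_upload` and `head_object`.

```python
from datetime import datetime, timezone


def upload_and_complete(client, bucket, row, chunks, now):
    """Upload every chunk as a part, close the MPU, verify the size, stamp the row."""
    parts = []
    for number, chunk in enumerate(chunks, start=1):  # part numbers start at 1
        response = client.upload_part(
            Bucket=bucket, Key=row["s3_key"], UploadId=row["upload_id"],
            PartNumber=number, Body=chunk,
        )
        parts.append({"PartNumber": number, "ETag": response["ETag"]})

    client.complete_multipart_upload(
        Bucket=bucket, Key=row["s3_key"], UploadId=row["upload_id"],
        MultipartUpload={"Parts": parts},
    )

    # Size-mismatch guard: compare what S3 stored with what was announced.
    stored = client.head_object(Bucket=bucket, Key=row["s3_key"])["ContentLength"]
    if stored != row["expected_size"]:
        raise ValueError(f"size mismatch: expected {row['expected_size']}, got {stored}")

    row["upload_completed_at"] = now
    row["upload_id"] = None  # the MPU no longer exists once completed
    return row


class InMemoryS3:
    """Tiny stand-in for the client. Real S3 also enforces a 5 MiB minimum on non-final parts."""

    def __init__(self):
        self.parts = {}    # (key, upload_id) -> {part_number: bytes}
        self.objects = {}  # key -> bytes

    def create_multipart_upload(self, Bucket, Key):
        upload_id = f"mpu-{len(self.parts) + 1}"
        self.parts[(Key, upload_id)] = {}
        return {"UploadId": upload_id}

    def upload_part(self, Bucket, Key, UploadId, PartNumber, Body):
        self.parts[(Key, UploadId)][PartNumber] = Body
        return {"ETag": f'"etag-{PartNumber}"'}

    def complete_multipart_upload(self, Bucket, Key, UploadId, MultipartUpload):
        stored = self.parts.pop((Key, UploadId))
        self.objects[Key] = b"".join(stored[p["PartNumber"]] for p in MultipartUpload["Parts"])
        return {"Key": Key}

    def head_object(self, Bucket, Key):
        return {"ContentLength": len(self.objects[Key])}


s3 = InMemoryS3()
row = {"s3_key": "files/1", "expected_size": 11, "upload_completed_at": None}
row["upload_id"] = s3.create_multipart_upload(Bucket="b", Key="files/1")["UploadId"]
stamp = datetime(2025, 1, 1, tzinfo=timezone.utc)
upload_and_complete(s3, "b", row, [b"hello ", b"world"], now=stamp)
```

After the call the row has a completion timestamp and no upload_id, which is the state finalize expects before reparenting.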

Flag-then-raise pattern

complete_upload does not raise from inside its with transaction.atomic() block. Instead it stashes the failure in a local error_detail and raises after the block exits. The reason is that the cleanup branch calls draft.delete() — raising inside the atomic block would roll that delete back, leaving the draft (and its in-flight MPUs) lingering for the next daily orphan sweep to catch. See the comment in viewsets/draft.py::complete_upload near error_detail = None.
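The effect can be reproduced without Django. In the sketch below, `atomic` is a toy context manager that mimics the rollback-on-exception behaviour of transaction.atomic(), and `FakeDB`, `UploadRejected` and both `complete_upload_*` functions are hypothetical stand-ins for the real view.

```python
from contextlib import contextmanager


class UploadRejected(Exception):
    """Hypothetical error surfaced to the client."""


class FakeDB:
    def __init__(self):
        self.drafts = {"d1"}


@contextmanager
def atomic(db):
    """Toy transaction: restore the snapshot if the block raises."""
    snapshot = set(db.drafts)
    try:
        yield
    except BaseException:
        db.drafts = snapshot  # roll back every change made inside the block
        raise


def complete_upload_raising_inside(db, draft_id):
    """Anti-pattern: the raise undoes the cleanup delete."""
    with atomic(db):
        db.drafts.discard(draft_id)  # cleanup: delete the draft
        raise UploadRejected("size mismatch")


def complete_upload_flag_then_raise(db, draft_id, s3_ok):
    """Pattern from the text: record the failure, let the block commit, then raise."""
    error_detail = None
    with atomic(db):
        if not s3_ok:
            db.drafts.discard(draft_id)  # cleanup: delete the draft
            error_detail = "size mismatch"
    if error_detail is not None:
        raise UploadRejected(error_detail)


db_inside = FakeDB()
try:
    complete_upload_raising_inside(db_inside, "d1")
except UploadRejected:
    pass

db_flag = FakeDB()
try:
    complete_upload_flag_then_raise(db_flag, "d1", s3_ok=False)
except UploadRejected:
    pass
```

With the first variant the draft is still present after the error; with the second the delete is committed and the client still receives the error.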

Daily orphan sweep

core/services/s3_sweep.py exposes run_orphan_sweep(...). It is invoked from two places:

  • clean_orphan_s3_objects management command — manual operator tool, dry-run by default (pass --apply to actually delete). Also supports --min-age and --prefix flags.
  • sweep_orphan_s3_storage_task Celery beat task — runs daily, calls run_orphan_sweep(apply=True, min_age_hours=24, ...).

Two passes

  1. Objects: list_objects_v2 paginated, cross-referenced against TransferFile.s3_key. Orphans are batched into delete_objects (max 1000 per S3 API call).
  2. MPUs: list_multipart_uploads paginated, keyed on (s3_key, upload_id) rather than upload_id alone — the S3 API allows multiple MPUs per key, and our cross-reference must be exact. Orphans are aborted one by one (no bulk MPU API).

Both passes apply the min-age cutoff (LastModified for objects, Initiated for MPUs). A non-zero count from the daily task logs a warning — that's the signal that one of the per-row cleanup paths leaked.
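The cross-referencing in both passes can be sketched as pure functions over the listing entries. The function names and the strict `<` comparison against the cutoff are assumptions; the entry fields (`Key`, `LastModified`, `UploadId`, `Initiated`) and the `Delete={"Objects": [...]}` payload shape follow the boto3 responses and request for `list_objects_v2`, `list_multipart_uploads` and `delete_objects`.

```python
from datetime import datetime, timedelta, timezone


def orphan_keys(objects, known_keys, cutoff):
    """Pass 1: objects with no matching TransferFile.s3_key, older than the cutoff."""
    return [
        entry["Key"]
        for entry in objects
        if entry["Key"] not in known_keys and entry["LastModified"] < cutoff
    ]


def orphan_uploads(uploads, known_pairs, cutoff):
    """Pass 2: MPUs matched on (s3_key, upload_id), since one key can have several MPUs."""
    return [
        (entry["Key"], entry["UploadId"])
        for entry in uploads
        if (entry["Key"], entry["UploadId"]) not in known_pairs
        and entry["Initiated"] < cutoff
    ]


def delete_batches(keys, size=1000):
    """Yield delete_objects payloads of at most `size` keys (the S3 limit is 1000)."""
    for start in range(0, len(keys), size):
        chunk = keys[start:start + size]
        yield {"Objects": [{"Key": key} for key in chunk], "Quiet": True}


now = datetime(2025, 6, 1, tzinfo=timezone.utc)
cutoff = now - timedelta(hours=24)
old = now - timedelta(days=3)
recent = now - timedelta(minutes=5)

objects = [
    {"Key": "files/1", "LastModified": old},     # known to the DB: kept
    {"Key": "files/2", "LastModified": old},     # unknown and old: orphan
    {"Key": "files/3", "LastModified": recent},  # unknown but recent: skipped
]
uploads = [
    {"Key": "files/1", "UploadId": "a", "Initiated": old},     # known pair: kept
    {"Key": "files/1", "UploadId": "b", "Initiated": old},     # same key, other MPU: orphan
    {"Key": "files/9", "UploadId": "c", "Initiated": recent},  # recent: skipped
]

found_keys = orphan_keys(objects, {"files/1"}, cutoff)
found_uploads = orphan_uploads(uploads, {("files/1", "a")}, cutoff)
```

The second upload on files/1 shows why matching on upload_id alone, or on the key alone, would not be exact: the key is known, but that particular MPU is not.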

Why --min-age=24h

add_file opens an MPU before saving the TransferFile row. There's a tiny window between CreateMultipartUpload returning and the row being saved where a sweep with min_age=0 would see the MPU as unknown and abort it. 24 h is well past any legitimate upload (the per-file timeout is much shorter) and clears actual orphans within a day.

Operational gotchas

This section is for anyone deploying or debugging in production.

IAM permissions on the bucket

The IAM policy on the bucket user must allow every action behind the boto3 calls the code makes:

Permission | Used by
--- | ---
s3:ListBucket | Sweep (object listing)
s3:GetObject | head_object_size, presigned downloads
s3:HeadObject | head_object_size (size-mismatch guard in complete_upload). On AWS there is no separate s3:HeadObject action; HeadObject is authorized by s3:GetObject.
s3:DeleteObject | Per-file cleanup
s3:DeleteObjects | Sweep (bulk delete batch). On AWS the bulk DeleteObjects call is authorized by s3:DeleteObject.
s3:PutObject | Implicitly required for direct PUT presigned URLs and upload_part
s3:AbortMultipartUpload | Cleanup paths and sweep
s3:ListMultipartUploadParts | Implicit on complete_multipart_upload
s3:ListBucketMultipartUploads | Sweep (MPU listing)

Scaleway, AWS, MinIO and Garage all expose these under the same names (modulo the standard AWS naming).

"Empty" MPUs exist

A multipart upload with zero parts is a valid entry in the list_multipart_uploads response: it doesn't bill, but it shows up, and it never appears in an object listing. That's why the sweep has a dedicated MPU pass instead of relying on the object pass. It's also why assert_bucket_empty (in tests) checks both.

delete_object is idempotent on missing keys

S3 returns 204 for a DeleteObject on a non-existent key. Cleanup paths can be (and are) called on rows whose object was never materialized — no special-casing needed.

Tests and fixtures

S3 tests use moto (in-process AWS mock).

  • tests/_s3_live.py — fixture and helpers. Notably assert_bucket_empty(bucket) checks both objects and MPUs (because of the empty-MPU caveat above).
  • tests/conftest.py: live_s3_bucket fixture, function-scoped, sets up an isolated bucket per test.
  • tests/test_s3_cleanup_drafts.py — abort / abandoned / remove paths.
  • tests/test_s3_cleanup_transfer.py — expiry and deactivation paths.
  • tests/test_s3_cleanup_tasks.py — the Celery tasks themselves.
  • tests/test_s3_cleanup_structural.py — invariants that hold across paths (e.g. "no path leaves a draft with files but no MPUs").
  • tests/test_clean_orphan_s3_command.py — the management command's thin-wrapper layer.
  • tests/test_s3_live_smoke.py — global smoke test against a live bucket, run end-to-end.

moto MPU date caveat

moto hard-codes the Initiated timestamp of MPUs to 2010-11-10 in its list_multipart_uploads response. To exercise the "skip recent MPUs" branch of the sweep we need the moto MPU to register as newer than the cutoff — so test_min_age_skips_recent_orphan_mpu passes --min-age 1000000 (~114 years), placing the cutoff in 1912 so that 2010 reads as "recent". If you write a new test for the cutoff comparison itself, copy that pattern.
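The arithmetic behind that trick can be checked in isolation. In this sketch `is_old_enough` is a hypothetical stand-in for the sweep's cutoff comparison, the `now` value is arbitrary, and the 2010-11-10 date is the one quoted above.

```python
from datetime import datetime, timedelta, timezone

# Initiated date reported by moto for every MPU, as described above.
MOTO_INITIATED = datetime(2010, 11, 10, tzinfo=timezone.utc)


def is_old_enough(initiated, now, min_age_hours):
    """True when the MPU predates the cutoff and is therefore eligible for the sweep."""
    cutoff = now - timedelta(hours=min_age_hours)
    return initiated < cutoff


now = datetime(2026, 1, 1, tzinfo=timezone.utc)  # arbitrary "today" for the example

# With the production setting, every moto MPU looks ancient and is always swept.
swept_with_24h = is_old_enough(MOTO_INITIATED, now, min_age_hours=24)

# One million hours is roughly 114 years, so the cutoff lands in the early
# 1900s and the 2010 MPU now counts as "recent" and is skipped.
swept_with_huge_min_age = is_old_enough(MOTO_INITIATED, now, min_age_hours=1_000_000)
```

A test that needs the "recent MPU is skipped" branch therefore has to move the cutoff rather than the MPU date, since the date is not under the test's control.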