Merged
Changes from 4 commits
142 changes: 142 additions & 0 deletions docs/architectural-decisions/2026-04-presigned-url-s3-operations.md
@@ -0,0 +1,142 @@
# ADR: Presigned URL Architecture for S3 Operations

**Status:** Accepted
**Date:** 2026-04-10

## Context

The Hyperspace console proxies all S3 operations through individual Lambda handlers. Each handler fetches Aurora S3 credentials from AWS SSM Parameter Store, executes the S3 operation against the Aurora S3-compatible endpoint, and returns the result to the frontend via API Gateway. These Lambdas run at 1024 MB with provisioned concurrency in production, making them expensive for what is essentially a pass-through.

Upload (`presign-upload.ts`) and download (`download-object.ts`) already use presigned URLs — the Lambda generates a time-limited signed URL and the browser talks to Aurora S3 directly. The remaining operations (ListObjects, HeadObject, GetObjectRetention, DeleteObject) still proxy through Lambda, adding latency and cost for no security benefit.

The current architecture also couples every S3 operation to Aurora-specific Lambda handlers. As the platform prepares to support arbitrary S3-compatible storage providers, the per-operation Lambda pattern becomes harder to maintain — each new provider would multiply the handler count.

### Current Request Path (Proxied)

```
Browser -> CloudFront -> API Gateway -> Lambda -> Aurora S3 -> Lambda -> API Gateway -> CloudFront -> Browser
```

### Desired Request Path (Presigned)

```
Browser -> Lambda (presign, ~50ms) -> Browser -> Aurora S3 (direct)
```
Comment on lines +14 to +24
Collaborator

I am fine with this decision, but want to flag the impact of different network routing on response times. (Latency numbers based on https://latency.bluegoat.net/).

For example, for customers in New Zealand, we are increasing the response times by ~200ms (from ~302ms to ~508ms).

Current request path - total ~302ms

  • ap-southeast-6 → us-east-2 (FilOne Console): ~212ms
  • us-east-2 → eu-west-1 (Aurora S3): ~90ms

New request path - total ~508ms

  • ap-southeast-6 → us-east-2: ~212ms
  • ap-southeast-6 → eu-west-1: ~296ms

Collaborator Author

Interesting numbers... I will add this to the ADR.

In my experience close to us-west-2, it feels faster. 🤷

Collaborator

Yeah, on second thought, I think my comment was missing some perspective. We are primarily targeting customers in Europe now, and they should see a small improvement in response times.

Current request path - total 181ms

  • eu-central-1 (Germany) → us-east-2: 102ms
  • us-east-2 → eu-west-2: 79ms

New request path - total 122ms

  • eu-central-1 → us-east-2: 102ms
  • eu-central-1 → eu-west-2: 20ms

I think improving the response times for European customers by ~60ms (30%) is meaningful.


## Options Considered

### Browser S3 Client with Temporary Credentials

A single Lambda vends short-lived S3 credentials (`accessKeyId`, `secretAccessKey`, `sessionToken`). The frontend creates an `S3Client` (from `@aws-sdk/client-s3`) in the browser and makes S3 calls directly. One credential fetch covers many operations. The AWS SDK handles XML parsing, error mapping, and pagination natively.

Aurora does not support STS-style session credentials. Its access keys support an `expiresAt` field, but with day-level granularity (YYYY-MM-DD format). These are real persistent keys stored in Aurora's key management system — creating one per browser session would clutter the access key list, require cleanup, and still expose long-lived credentials. Aurora's Token API (`POST /auth/v1/tenants/{tenantId}/tokens`) produces Portal API bearer tokens, not S3 Signature V4 credentials.

Without true short-lived credentials, this approach sends the tenant's long-lived S3 access key and secret key to the browser. Even over HTTPS, the blast radius is unacceptable: a leaked credential (XSS, browser extension, memory dump) grants full S3 access until the key is manually rotated. This option becomes viable if Aurora adds STS support in the future.

### Batch Presigned URL Endpoint

A single `POST /api/presign` Lambda accepts an array of S3 operation descriptors and returns presigned URLs for each. The frontend executes the presigned URLs directly against Aurora S3 and parses the responses. Credentials never leave the backend.

Each presigned URL is scoped to exactly one operation and one bucket/key, and expires in 5 minutes. A leaked URL grants a single read, write, or delete, not access to the tenant's entire S3 namespace. Batching (up to 10 operations per request) reduces round-trips for pages that need multiple S3 calls. For example, the object detail page requests HeadObject and, when the bucket has Object Lock enabled, GetObjectRetention in a single presign request.
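As a rough sketch of the object detail batching described above, the frontend could assemble its presign batch as follows. The descriptor fields `bucket` and `key` and the helper name `buildObjectDetailBatch` are assumptions for illustration; the ADR only fixes the `op` values.

```typescript
// Hypothetical descriptor shapes: only the `op` discriminator values come from the ADR.
type ObjectDetailOp =
  | { op: "headObject"; bucket: string; key: string }
  | { op: "getObjectRetention"; bucket: string; key: string };

// Build the presign batch for the object detail page.
// GetObjectRetention is only included when the bucket has Object Lock enabled.
function buildObjectDetailBatch(
  bucket: string,
  key: string,
  objectLockEnabled: boolean,
): ObjectDetailOp[] {
  const batch: ObjectDetailOp[] = [{ op: "headObject", bucket, key }];
  if (objectLockEnabled) {
    batch.push({ op: "getObjectRetention", bucket, key });
  }
  return batch;
}
```

Either way the page makes one presign request; only the number of descriptors in it changes.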

The main cost is that the frontend must parse S3 XML responses (ListObjects, GetObjectRetention) and HTTP headers (HeadObject). This is handled by a small frontend utility using the browser-native `DOMParser`.

## Decision

Use **batch presigned URLs** via a single `POST /api/presign` endpoint.

### Operations Moved to Presigned URLs

| Operation | HTTP Method | Notes |
| -------------------- | ----------- | ----------------------------------------------------- |
| ListObjectsV2 | GET | Frontend parses S3 XML response |
| HeadObject | HEAD | `fil-include-meta=1` signed into URL for Filecoin CID |
| GetObjectRetention | GET | Frontend parses retention XML |
| GetObject (download) | GET | Consolidates existing `download-object.ts` |
| PutObject (upload) | PUT | Consolidates existing `presign-upload.ts` |
| DeleteObject | DELETE | Presigned URL is the authorization; no CSRF needed |

### Operations Remaining on Lambda

| Operation | Reason |
| ------------ | ---------------------------------------------------------------------------------------- |
| ListBuckets | Aurora Portal REST API (API key auth, not S3 Sig V4) |
| GetBucket | Aurora Portal REST API (returns rich metadata including `objectLockEnabled`, used by the frontend to conditionally include GetObjectRetention in presign batches) |
| CreateBucket | Aurora Portal API mutation |
| DeleteBucket | Aurora Portal API; must verify bucket is empty server-side |

ListBuckets could switch from the Portal API to the S3 `ListBuckets` command (making it presignable), since the handler currently only uses `name`, `createdAt`, `region` (hardcoded), and `isPublic` (hardcoded false). However, the Portal API returns richer metadata that will matter as the UI matures. This can be revisited independently.
Comment thread
bajtos marked this conversation as resolved.

### Presign Endpoint Design

**Route:** `POST /api/presign`

**Middleware:** Auth (JWT cookie) + subscription guard. No CSRF — presigned URLs are themselves the authorization token. The handler inspects the batch to determine access level: if any operation is `putObject` or `deleteObject`, Write access is required; otherwise Read.
Comment thread
bajtos marked this conversation as resolved.
Outdated

**Request:** Array of 1–10 operation descriptors, each a discriminated union on the `op` field (`listObjects`, `headObject`, `getObjectRetention`, `getObject`, `putObject`, `deleteObject`).
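A minimal sketch of how the request shape and the access rule from the Middleware paragraph might be expressed. The `op` values are taken from this ADR; the other descriptor fields (`bucket`, `key`, `prefix`) and the helper names are assumptions, not the actual handler code.

```typescript
// Discriminated union on `op`. Fields other than `op` are illustrative assumptions.
type PresignOperation =
  | { op: "listObjects"; bucket: string; prefix?: string }
  | { op: "headObject"; bucket: string; key: string }
  | { op: "getObjectRetention"; bucket: string; key: string }
  | { op: "getObject"; bucket: string; key: string }
  | { op: "putObject"; bucket: string; key: string }
  | { op: "deleteObject"; bucket: string; key: string };

type AccessLevel = "Read" | "Write";

const MAX_BATCH_SIZE = 10;

// Reject batches outside the 1-10 range stated in the ADR.
function assertValidBatchSize(batch: readonly PresignOperation[]): void {
  if (batch.length < 1 || batch.length > MAX_BATCH_SIZE) {
    throw new RangeError(
      `Expected 1-${MAX_BATCH_SIZE} operations, got ${batch.length}`,
    );
  }
}

// Write access is required if any operation mutates data; otherwise Read suffices.
function requiredAccess(batch: readonly PresignOperation[]): AccessLevel {
  const mutates = batch.some(
    (o) => o.op === "putObject" || o.op === "deleteObject",
  );
  return mutates ? "Write" : "Read";
}
```

Deriving the access level from the whole batch means a single mutating descriptor raises the requirement for every URL in that request.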

**Response:** Array of `{ url, method, expiresAt }` items in the same order as the request, plus the S3 `endpoint` (supports multi-provider routing in the future).

**URL expiry:** 300 seconds, matching the existing `PRESIGN_EXPIRY_SECONDS`.
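The response assembly could look roughly like the sketch below. It assumes a wrapper object with an `items` array, an ISO 8601 `expiresAt` string, and an injected `sign` callback standing in for the real presigner; none of these details are confirmed by the ADR. The method mapping follows the operations table above.

```typescript
const PRESIGN_EXPIRY_SECONDS = 300; // constant name taken from the ADR

type HttpMethod = "GET" | "HEAD" | "PUT" | "DELETE";
type Op =
  | "listObjects"
  | "headObject"
  | "getObjectRetention"
  | "getObject"
  | "putObject"
  | "deleteObject";

// HTTP method per operation, as listed in "Operations Moved to Presigned URLs".
const METHOD_BY_OP: Record<Op, HttpMethod> = {
  listObjects: "GET",
  headObject: "HEAD",
  getObjectRetention: "GET",
  getObject: "GET",
  putObject: "PUT",
  deleteObject: "DELETE",
};

interface PresignedItem {
  url: string;
  method: HttpMethod;
  expiresAt: string; // assumed to be an ISO 8601 timestamp
}

interface PresignResponse {
  items: PresignedItem[]; // assumed wrapper field name
  endpoint: string;
}

// `sign` is a hypothetical stand-in for the real signing call.
function buildPresignResponse(
  ops: readonly Op[],
  endpoint: string,
  sign: (op: Op, expiresInSeconds: number) => string,
  now: Date = new Date(),
): PresignResponse {
  const expiresAt = new Date(
    now.getTime() + PRESIGN_EXPIRY_SECONDS * 1000,
  ).toISOString();
  // map() preserves request order, so items line up with the request array.
  const items = ops.map((op) => ({
    url: sign(op, PRESIGN_EXPIRY_SECONDS),
    method: METHOD_BY_OP[op],
    expiresAt,
  }));
  return { items, endpoint };
}
```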

### HeadObject with Aurora Filecoin Metadata

The current `headObject` handler injects `fil-include-meta=1` as a query parameter via S3 middleware and captures the `x-fil-cid` response header. For presigned URLs, the `fil-include-meta=1` parameter is included in the signing process by attaching the same middleware to the S3Client before calling `getSignedUrl`. The presigner runs the middleware stack, so the parameter becomes part of the signed URL. The frontend reads `x-fil-cid` from the response headers (requires Aurora CORS to expose it via `Access-Control-Expose-Headers`).

### Multi-Provider Architecture

The presign endpoint is designed to support arbitrary S3-compatible providers:

- The `endpoint` field in the response tells the frontend where to execute the URL
- The backend resolves provider and credentials per bucket (today all Aurora, tomorrow per-provider lookup)
- Presigned URLs are provider-agnostic from the frontend's perspective — an HTTP URL with a method
- The frontend S3 response parsers work with any S3-compatible XML format
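One way the per-bucket provider lookup could be sketched is shown below. The `ProviderConfig` shape, the override map, and the placeholder endpoint are all hypothetical; the ADR does not specify how the backend will store provider assignments.

```typescript
// Hypothetical provider description; the real backend may carry more fields.
interface ProviderConfig {
  name: string;
  endpoint: string;
}

// Placeholder endpoint for illustration only, not the real Aurora URL.
const AURORA: ProviderConfig = {
  name: "aurora",
  endpoint: "https://s3.aurora.example.test",
};

// Today every bucket resolves to Aurora. A per-bucket override map is one
// possible shape for the future per-provider lookup.
function resolveProvider(
  bucket: string,
  overrides: ReadonlyMap<string, ProviderConfig> = new Map(),
): ProviderConfig {
  return overrides.get(bucket) ?? AURORA;
}
```

The resolved `endpoint` is what the presign response would return to the frontend, so the browser never needs to know which provider backs a bucket.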

### Frontend S3 Response Parsing

A new `aurora-s3.ts` module provides browser-native parsers:

- `parseListObjectsResponse` — `DOMParser` on `<ListBucketResult>` XML
- `parseHeadObjectResponse` — reads HTTP response headers
- `parseGetObjectRetentionResponse` — parses `<Retention>` XML
- `parseS3ErrorResponse` — parses S3 error XML (expired URL, not found, access denied)
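As an illustration of the header-based parser, `parseHeadObjectResponse` might look something like this. The function name comes from the list above, but the signature and the `HeadObjectResult` shape are assumptions. It relies on the standard Fetch `Headers` interface, which lower-cases header names, and it only sees `x-fil-cid` and `x-amz-meta-*` if the endpoint exposes them via CORS.

```typescript
// Assumed result shape; the real module may expose different fields.
interface HeadObjectResult {
  contentLength: number | null;
  contentType: string | null;
  etag: string | null;
  lastModified: Date | null;
  filCid: string | null;
  metadata: Record<string, string>;
}

const META_PREFIX = "x-amz-meta-";

function parseHeadObjectResponse(headers: Headers): HeadObjectResult {
  // Collect user metadata: every header that starts with x-amz-meta-.
  const metadata: Record<string, string> = {};
  headers.forEach((value, name) => {
    if (name.startsWith(META_PREFIX)) {
      metadata[name.slice(META_PREFIX.length)] = value;
    }
  });

  const length = headers.get("content-length");
  const modified = headers.get("last-modified");
  return {
    contentLength: length === null ? null : Number(length),
    contentType: headers.get("content-type"),
    etag: headers.get("etag"),
    lastModified: modified === null ? null : new Date(modified),
    filCid: headers.get("x-fil-cid"), // Filecoin CID, present when fil-include-meta=1 was signed in
    metadata,
  };
}
```

A caller would pass `response.headers` from the `fetch` call that executed the presigned HEAD URL.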

### Lambda Consolidation

Five handlers are replaced by one:

| Removed | Memory | Provisioned |
| -------------------- | ------- | ----------- |
| `list-objects.ts` | 1024 MB | Yes |
| `head-object.ts` | 1024 MB | Yes |
| `download-object.ts` | default | Yes |
| `presign-upload.ts` | default | Yes |
| `delete-object.ts` | default | No |

| Added | Memory | Provisioned |
| ------------ | ------ | ----------- |
| `presign.ts` | 512 MB | Yes |

## Risks

### Aurora CORS Header Exposure

The Aurora S3 endpoint must expose `x-fil-cid` and `x-amz-meta-*` headers via `Access-Control-Expose-Headers` for HeadObject to work from the browser. Without this, the Filecoin CID and custom metadata are invisible to JavaScript. File upload (PUT) already works, confirming CORS is partially configured. GET, HEAD, and DELETE methods and the specific exposed headers must be verified before deploying the frontend changes. The presign handler can ship independently; only the frontend switch depends on CORS.
Comment thread
bajtos marked this conversation as resolved.
Outdated

### S3 XML Parsing in the Browser

The frontend takes on responsibility for parsing S3 XML responses. Edge cases (empty buckets, special characters in keys, truncated responses, error XML) must be tested. Mitigated by using the browser-native `DOMParser` and writing unit tests for each parser.

### Presigned URL Expiry During Slow Pages

A presigned URL that is reused after its 5-minute expiry returns 403, for example when a user idles on a page and React Query later refetches. Mitigated by making presign + execute a single `queryFn`, so every refetch generates fresh URLs instead of reusing expired ones.
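The mitigation can be sketched as a factory that combines both steps into one query function. `makeQueryFn`, `isExpired`, and the `Presigned` shape are hypothetical names for illustration; the `presign` and `execute` callbacks stand in for the real `POST /api/presign` call and the direct S3 request.

```typescript
interface Presigned {
  url: string;
  method: string;
  expiresAt: string; // assumed ISO 8601 timestamp
}

// Illustrative helper: true once the URL's expiry time has been reached.
function isExpired(item: Presigned, now: Date = new Date()): boolean {
  return now.getTime() >= new Date(item.expiresAt).getTime();
}

// Combine presign + execute into one function suitable for use as a queryFn.
// Because the URL is requested inside the function, each refetch signs a new
// URL and never reuses one from an earlier run.
function makeQueryFn<T>(
  presign: () => Promise<Presigned>,
  execute: (item: Presigned) => Promise<T>,
): () => Promise<T> {
  return async () => execute(await presign());
}
```

With this shape an explicit `isExpired` check is only needed if a presigned URL is ever held outside the query function, such as a download link rendered into the page.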

## Consequences

- S3 read and delete operations bypass Lambda entirely after the presign call. Latency improves by eliminating the proxy hop (~100–300ms for large payloads).
- Five Lambda handlers are consolidated into one lightweight 512 MB handler; two of the removed handlers ran at 1024 MB. Provisioned concurrency cost drops accordingly.
- API Gateway data transfer costs decrease — S3 response payloads no longer flow through API Gateway.
- The frontend owns S3 response parsing, adding ~200 lines of parser code that must be maintained.
- CSRF protection is no longer needed for DeleteObject — the presigned URL itself is the authorization token, scoped to one key and expiring in 5 minutes.
- The presign endpoint's `endpoint` response field and provider-agnostic URL execution position the frontend to support multiple S3-compatible providers without structural changes.
- If Aurora adds STS support in the future, the architecture can evolve to vend temporary credentials instead of presigned URLs, eliminating the per-operation presign overhead while keeping the same frontend execution model.
145 changes: 0 additions & 145 deletions packages/backend/src/handlers/delete-object.test.ts

This file was deleted.

91 changes: 0 additions & 91 deletions packages/backend/src/handlers/delete-object.ts

This file was deleted.
