Skip to content

feat(s3): add S3 image upload support for Excalidraw drawings#163

Open
OhYee wants to merge 16 commits into
ZimengXiong:mainfrom
OhYee:feat/s3-image-upload
Open

feat(s3): add S3 image upload support for Excalidraw drawings#163
OhYee wants to merge 16 commits into
ZimengXiong:mainfrom
OhYee:feat/s3-image-upload

Conversation

@OhYee
Copy link
Copy Markdown
Contributor

@OhYee OhYee commented May 6, 2026

Closes #145.

Why

Excalidraw images are stored as base64 dataURLs inside Drawing.files (a JSON column in SQLite). One uncompressed image is several MB; embedding many of them bloats the DB, slows collaborative sync, and makes backups/exports unwieldy. This PR adds opt-in S3-backed image storage. When S3_BUCKET is configured, image dataURLs go to S3 instead of SQLite.

Architecture: backend-driven uploads. The frontend sends the entire drawing payload (base64 dataURLs intact) to the standard create / update / import routes; the backend uploads any new dataURLs to S3 and rewrites them to S3 URLs before persisting. Single auth boundary, uniform coverage of paste / drag / drop / import paths, S3File rows written by the backend in the same handler as the drawing row.

What

Configuration — all optional, S3 stays disabled when unset:

  • S3_BUCKET, S3_REGION (default us-east-1)
  • S3_ENDPOINT (for MinIO / Cloudflare R2 / Alibaba OSS)
  • S3_PUBLIC_URL (CDN base; when unset, falls back to private-bucket redirect mode)
  • S3_FORCE_PATH_STYLE (MinIO needs true; OSS needs false)
  • S3_KEY_PREFIX (default excalidash)
  • AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY

Backend module (backend/src/s3.ts):

  • Typed S3Config, presigned URL helpers, uploadBuffer / listS3Objects / copyS3Object, shared FILE_KEY_PREFIX / buildS3Key / drawingS3Prefix so upload and cleanup paths can't drift.

Backend integration (backend/src/fileProcessing.ts + handler wiring):

  • processFilesForS3(files, userId, drawingId, prisma) scans the file record for base64 dataURLs, uploads them, upserts a row keyed (drawingId, fileId) in S3File, returns a rewritten file record (base64 → S3 URL).
  • Wired into POST /drawings, PUT /drawings/:id, and both /import/* routes.

New routes (backend/src/routes/files.ts):

  • GET /files/config — surfaces s3Enabled for frontend feature detection.
  • GET /files/:drawingId/:fileId — 302 redirects to a presigned GET URL for private-bucket deployments. Authorisation is drawing-scoped (uploader OR has DrawingPermission OR the drawing has an active link-share); anonymous viewers of a link-shared drawing can fetch the image. Closes a read-side info-leak bug where the original code treated fileId as an unguessable capability — but Excalidraw fileId is a SHA-1 of the bytes; anyone holding the original image can compute it.

Schema:

  • S3File table created with composite primary key (drawingId, fileId). Excalidraw fileIds are content hashes, so a global PK on fileId alone would silently overwrite when the same image is uploaded across two drawings. Migration ships the original (single-PK) shape that pre-existing deployments already applied, plus a follow-up 20260506130000_s3file_composite_pk that drops + recreates with the new PK on upgrade. Fresh deployments run both migrations and converge on the same final state.

Sanitiser (security.ts):

  • Validates fileId against ^[\w-]{1,200}$ before persistence so a request with files: { "../../etc/passwd": ... } cannot escape the per-user S3 prefix. Same regex used by the route param.
  • Accepts the new /api/files/:drawingId/:fileId shape as a safe dataURL.

Object key layout: {prefix}/{userId}/{drawingId}/{fileId}.{ext}, Cache-Control: public, max-age=31536000, immutable since fileIds are content hashes.

When S3 is enabled there's no silent base64 fallback: upload failures surface to the user instead of pretending the image was persisted.

Risk

  • Schema is additive (one new table) plus a self-contained upgrade migration; no destructive changes to existing tables. Safe to apply on a populated DB.
  • All new env vars optional; behaviour identical to today when unset.
  • New routes don't change existing API contracts.

Verification

  • npm run build, npm test (all tests pass)
  • Tested against Alibaba Cloud OSS (S3-compatible) end-to-end: paste, drop, save, reload, duplicate, delete all roundtrip correctly with images served from the public bucket URL.

Commit narrative

Each subsequent fix commit is review feedback addressed in-place (kept separate so reviewers can audit each safety/correctness fix independently):

  • fix(s3): validate fileId before using as S3 key segment — defence-in-depth path-traversal guard at both the sanitiser and the upload path
  • fix(files): authorise GET /files/:fileId via drawing access — closes the file-download privilege escalation
  • fix(s3): forward S3_FORCE_PATH_STYLE config to initS3() — env var was parsed but never reached the SDK
  • fix(files): allow link-share viewers to fetch private-bucket images — link-shared drawings can serve their images to anonymous viewers
  • fix(s3): per-(drawing, fileId) S3File rows; share-aware key + URL — schema + composite primary key
  • build(tsc): exclude test files from production tsc — Dockerfile build pipeline
  • fix(s3): add upgrade migration for composite (drawingId, fileId) PK — handles deployments that applied the original migration before the composite-PK shape landed
  • fix(drawings): version pre-check, non-owner gate, DELETE-time S3 cleanup — three coupled safety fixes on the drawings handlers
  • fix(import): pre-resolve drawing id, bound parallel S3 uploads — keeps imported drawings consistent with their uploaded objects, batches concurrency
  • fix(drawings): copy S3 storage on duplicate so the original can be deleted — server-side CopyObjectCommand so duplicates don't share storage with the original
  • fix(drawings): rewrite preview SVG so it doesn't inline base64 after S3 upload — preview field stops carrying megabyte-scale base64 once S3 hosts the image

Copilot AI review requested due to automatic review settings May 6, 2026 09:47
@OhYee OhYee changed the title feat(s3): 为 Excalidraw 图片新增 S3 存储支持 feat(s3): add S3 image upload support for Excalidraw drawings May 6, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

该 PR 为 ExcaliDash/Excalidraw 图片引入可选的 S3(及兼容存储)后端与前端直传能力,在配置 S3_BUCKET 等环境变量时把图片从 SQLite 的 Drawing.files JSON/base64 迁移为对象存储 URL,以降低 DB 体积并改善协作同步性能。(Closes #145

Changes:

  • 后端新增 S3 Client/预签名 URL 工具与文件读取重定向路由,并在启动时按环境变量初始化 S3。
  • Prisma 新增 S3File 表并通过迁移调整为 (drawingId, fileId) 复合主键。
  • 前端 Editor 增加粘贴/拖拽图片的异步上传与保存前等待逻辑;安全清洗器允许存储 S3 URL / /api/files/:drawingId/:fileId 形式。

Reviewed changes

Copilot reviewed 13 out of 14 changed files in this pull request and generated 11 comments.

Show a summary per file
File Description
frontend/src/pages/Editor.tsx 监听/拦截图片新增路径,触发直传并在保存前替换为 S3 URL,减少协作广播 base64。
frontend/src/api/index.ts 增加 /files/config 探测、/files/upload-url 获取预签名、以及 PUT 直传封装。
backend/src/s3.ts 新增 S3 初始化与预签名 PUT/GET、public URL 构建、删除对象等工具函数。
backend/src/routes/files.ts 新增 /files/config 与私有桶下载重定向 /files/:drawingId/:fileId 路由。
backend/src/security.ts 加强 files key(fileId)与 dataURL(S3 URL / 私有重定向路径)白名单校验。
backend/src/config.ts 解析 S3 相关环境变量并注入到运行时配置。
backend/src/index.ts 启动时初始化 S3,并注册 files 路由。
backend/prisma/schema.prisma 增加 S3File 模型(复合主键 + 索引)。
backend/prisma/migrations/20260408060000_add_s3_files/migration.sql 初始引入 S3File 表(旧单列主键形态)。
backend/prisma/migrations/20260506130000_s3file_composite_pk/migration.sql 迁移到复合主键形态(drop & recreate)。
backend/tsconfig.json 构建时排除测试文件。
backend/package.json 引入 AWS SDK v3 S3 依赖。
backend/package-lock.json 锁文件随依赖变更更新。
backend/.env.example 增加 S3 配置示例说明。

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread backend/src/routes/files.ts Outdated
Comment on lines +49 to +52
// GET /files/:fileId
// Issues a presigned GET URL and redirects the browser to S3.
// Used only in private-bucket deployments where S3_PUBLIC_URL is not
// set and the dataURL stored in the drawing is "/api/files/:fileId".
: Object.values(filesInput || {});
originalAddFiles(normalizedFiles);

if (isSyncing.current) return;
Comment on lines +1 to +5
/**
* S3 file routes:
* GET /files/config – report whether S3 is configured
* GET /files/:drawingId/:fileId – redirect to a presigned S3 GET URL
* (private-bucket mode)
Comment on lines +55 to +59
"/files/:drawingId/:fileId",
optionalAuth,
asyncHandler(async (req, res) => {
if (!isS3Enabled()) {
return res.status(501).json({ error: "S3 storage is not configured" });
Comment thread backend/src/security.ts Outdated
Comment on lines +539 to +540
} else if (/^https:\/\//i.test(value)) {
// S3 / CDN public URL — validate format, no HTML injection risk.
Comment thread frontend/src/pages/Editor.tsx Outdated
Comment on lines +858 to +862
})().catch((err) => {
console.error("[Editor] S3 upload failed", err);
toast.error(`Image upload failed: ${err instanceof Error ? err.message : "unknown error"}`);
throw err;
});
Comment thread frontend/src/pages/Editor.tsx Outdated
Comment on lines +1718 to +1722
})().catch((err) => {
console.error("[Editor] S3 upload failed", err);
toast.error(`Image upload failed: ${err instanceof Error ? err.message : "unknown error"}`);
throw err;
});
Comment thread frontend/src/pages/Editor.tsx Outdated
Comment on lines +1074 to +1078
if (result.status === "fulfilled") {
const { fid, url } = result.value;
updated[fid] = { ...updated[fid], dataURL: url };
delete pendingS3UploadsRef.current[fid];
}
Comment thread frontend/src/api/index.ts
Comment on lines +689 to +704
/** Cache the result of the /files/config probe so we only call it once. */
let s3EnabledCache: boolean | null = null;

/**
* Returns true when the backend has S3 configured.
* The result is cached for the lifetime of the page.
*/
export const isS3Enabled = async (): Promise<boolean> => {
if (s3EnabledCache !== null) return s3EnabledCache;
try {
const response = await api.get<{ s3Enabled: boolean }>("/files/config");
s3EnabledCache = response.data.s3Enabled === true;
} catch {
s3EnabledCache = false;
}
return s3EnabledCache;
Comment thread frontend/src/api/index.ts Outdated
Comment on lines +709 to +713
* Returns { uploadUrl, accessUrl }:
* uploadUrl – browser should PUT the raw file bytes here (directly to S3)
* accessUrl – URL to store as dataURL in the drawing; either a public S3/CDN
* URL or an "/api/files/:fileId" path (private bucket).
*/
OhYee and others added 12 commits May 6, 2026 18:18
Excalidraw images are stored as base64 dataURLs inside Drawing.files
(a JSON column in SQLite). One uncompressed image is several MB;
embedding many of them bloats the DB, slows collaborative sync, and
makes backups/exports unwieldy. This commit adds opt-in S3-backed
image storage. When S3_BUCKET is configured, image dataURLs go to
S3 instead of SQLite.

Architecture: backend-driven uploads. The frontend sends the entire
drawing payload (base64 dataURLs intact) to the standard create /
update / import routes; the backend uploads any new dataURLs to S3
and rewrites them to S3 URLs before persisting. Single auth boundary,
uniform coverage of paste / drag / drop / import paths, S3File rows
written by the backend in the same handler as the drawing row.

Configuration (all optional, S3 stays disabled when unset):
  S3_BUCKET, S3_REGION (default us-east-1)
  S3_ENDPOINT (MinIO / Cloudflare R2 / Alibaba OSS)
  S3_PUBLIC_URL (CDN base; unset → private-bucket redirect mode)
  S3_FORCE_PATH_STYLE (MinIO needs true; OSS needs false)
  S3_KEY_PREFIX (default "excalidash")
  AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY

Backend module (backend/src/s3.ts):
- Typed S3Config, presigned URL helpers, uploadBuffer / listS3Objects
  utilities, shared FILE_KEY_PREFIX / buildS3Key / drawingS3Prefix so
  upload and cleanup paths can't drift onto different prefixes.

Backend integration:
- backend/src/fileProcessing.ts: processFilesForS3(files, userId,
  drawingId, prisma) scans the file record for base64 dataURLs,
  uploads them, upserts a row keyed (drawingId, fileId) in S3File,
  returns a rewritten file record (base64 → S3 URL).
- Wired into POST /drawings, PUT /drawings/:id, and both /import/*
  routes so every code path that introduces a new image is covered.

New routes (backend/src/routes/files.ts):
- GET /files/config: surfaces s3Enabled for frontend feature
  detection.
- GET /files/:fileId: 302 redirects to a presigned GET URL for
  private-bucket deployments where the dataURL stored on the drawing
  is /api/files/:fileId.

Schema (backend/prisma/schema.prisma + initial migration):
- New S3File table mapping fileId → s3Key + userId + mimeType so the
  redirect route can resolve the bucket location. Additive only; no
  destructive changes to existing tables.

Sanitiser (backend/src/security.ts):
- Permits the new "https://..." (CDN) and "/api/files/..."
  (redirect) shapes as safe dataURL values without otherwise
  loosening the existing image-dataURL validation.

Object key layout: {prefix}/{userId}/{drawingId}/{fileId}.{ext},
served with Cache-Control: public, max-age=31536000, immutable since
fileIds are content hashes and the objects are never mutated in
place.

When S3 is enabled there is no silent base64 fallback: upload
failures surface to the user instead of pretending the image was
persisted.

Change-Id: I189e27b1f5459a1782e4a69d1f8c4494dee3921e
Co-developed-by: Claude <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reject file ids that don't match /^[\w-]{1,200}$/ in two layers, so a
request body like
  files: { "../../etc/passwd": { dataURL: "data:image/png;base64,..." } }
cannot escape the per-user S3 prefix and write outside it:

- security.ts sanitizeDrawingData drops malformed keys before
  persistence, so the drawing.files JSON cannot store entries that
  would later be uploaded under attacker-controlled paths.
- fileProcessing.processFilesForS3 also drops them defensively, since
  this utility is reachable from import paths that bypass the schema.

Same regex used by the GET /files/:fileId route param. Adds a
regression test covering the path-traversal case end-to-end.

Change-Id: I6c8427b4ab67956a57beb262c182a539a73a14f1
Co-developed-by: Claude <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…uessability

Excalidraw fileIds are SHA-1 hashes of the file bytes — anyone in
possession of the original image can compute the id, so the previous
"fileId is unguessable" comment was incorrect and any authenticated
user could fetch images uploaded by anyone else.

Now the route authorises by drawing access:
- the uploader (S3File.userId === req.user.id) can always read; OR
- the caller has access to a drawing that references this fileId, either
  one they own or one explicitly shared with them via DrawingPermission.

A single substring query on the JSON columns is enough because fileIds
are >= 16 random chars; the wrapping quotes prevent prefix collisions
(`abc1` won't match `abc12`).

Also adds a 401 fallback in case requireAuth is ever removed.

Change-Id: I4c9c0640e39d8d6a791009db289dc3a29e6658df
Co-developed-by: Claude <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: OhYee <oyohyee@oyohyee.com>
S3_FORCE_PATH_STYLE was parsed in config.ts and surfaced on the typed
S3Config but the property was never threaded through to the initS3()
call site. MinIO/Alibaba OSS users who set the env var saw no effect —
the S3 client always picked the default addressing style.

Change-Id: If16d6d8326612d4fb48501a54fea35c2b1e62d1b
Co-developed-by: Claude <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: OhYee <oyohyee@oyohyee.com>
GET /files/:fileId previously required a logged-in user via requireAuth,
which broke private-bucket deployments whenever a drawing was shared
with an active link-share: the anonymous viewer could open the
drawing scene but every <img src="/api/files/..."> returned 401.

Replace requireAuth with optionalAuth and decide access by drawing
membership instead:

- The uploader (S3File.userId === caller) always has access.
- Authenticated callers also pass if they own a drawing referencing
  the fileId, or were granted access via DrawingPermission.
- Anyone (anonymous included) passes if a drawing referencing the
  fileId has an active DrawingLinkShare row (revokedAt IS NULL and
  expiresAt either NULL or in the future).
- Otherwise return 404 so the response doesn't leak whether a fileId
  exists for someone who doesn't have access.

Change-Id: Id8e6f427d9bd392a0adbefd931a1b219c7b35513
Co-developed-by: Claude <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
S3File previously used the global Excalidraw fileId as its sole
primary key. Excalidraw fileIds are content hashes that legitimately
repeat across drawings, so two drawings — possibly two different
users — uploading the same image silently overwrote each other's
row. /files/:fileId then couldn't disambiguate which (drawing, owner)
the caller meant.

Move S3File to a composite primary key (drawingId, fileId), with
matching changes throughout the upload + access paths:

- prisma schema + initial migration: PK is now @@id([drawingId,
  fileId]) with @@index([drawingId]).
- s3.ts: shared FILE_KEY_PREFIX + buildS3Key() + drawingS3Prefix()
  helpers so upload, list, and cleanup paths cannot drift onto
  different prefixes.
- fileProcessing.ts: composite-key upsert; access URL is
  /api/files/:drawingId/:fileId.
- routes/files.ts: route signature is /files/:drawingId/:fileId; the
  lookup uses { drawingId_fileId } and the access decision is
  drawing-scoped (owner / explicit grantee / active link-share).
- security.ts: the safe-dataURL pattern accepts the new path shape.
- Test mock + assertions updated.

The DELETE-time S3 cleanup in routes/dashboard/drawings.ts still
needs a follow-up to switch from prefix-startswith to drawingId
filter; that change rides with the next commit so the upload-side
schema change can be reviewed independently.

Change-Id: Idccf506bf81b448d29dc75cb24bb3e49481e3279
Co-developed-by: Claude <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Dockerfile production stage runs `npx tsc` to emit dist/. The same
tsc invocation was also compiling the __tests__/ tree, which made the
build fail whenever a test mock used a partial type (e.g. a Zod schema
mock with only safeParse). vitest has its own type-checking pass, so
the production build has no business compiling test files.

Exclude *.test.ts, *.integration.ts, and the __tests__/ folder. The
production tsc now succeeds even when test files have type
shortcomings; vitest still catches real test errors at run time.

Change-Id: Ica2d5f91c05131b3d9b6674df61e34fcbda60884
Co-developed-by: Claude <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Prior versions of this branch shipped the new S3File composite primary
key by editing the original `20260408060000_add_s3_files` migration in
place. That works for fresh deployments but is invisible to any
deployment that already applied the original `id`-only PK migration —
Prisma decides what to apply by migration name, so the new shape never
reaches the database and runtime upserts crash with
"The column `drawingId` does not exist in the current database."

Restore the original migration to its pre-bundle shape and add a
follow-up migration that drops + recreates S3File with the new shape.
Both fresh and upgrade paths converge on the composite PK. Existing
S3 objects are untouched; public-bucket deployments are unaffected.
Private-bucket deployments will see /api/files/:drawingId/:fileId 404
for legacy rows until those drawings are re-saved (the save flow
upserts on every base64 dataURL); operators with valuable legacy rows
can rebuild manually before deploying.

Change-Id: I9c06c7a46247050521a1d5013ec37eeebeb86247
Co-developed-by: Claude <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PUT /drawings/:id (scene update path):

- Version conflict detection now happens BEFORE processFilesForS3, not
  after the updateMany. Previously, on a stale-version save the upload
  had already PUT every base64 file to S3 and we returned 409 — the
  objects became permanent orphans with no DB row. The new check uses
  a cheap findFirst before the upload; the existing version-filtered
  updateMany still catches the residual race window.
- Non-owner editors (link-share + edit, share-with-user) can no longer
  introduce new fileIds. Reading the existing files JSON gives us the
  set of allowed ids; any unknown id in the payload returns 403. Owners
  retain full control. This prevents a shared editor from filling the
  owner's S3 bucket via 50MB uploads.

DELETE /drawings/:id:

- Best-effort cleanup of S3 objects and S3File rows under the per-
  drawing prefix. Failures log and continue; the drawing row is
  already gone, so the cleanup is opportunistic. Without this, deleting
  a drawing leaked every uploaded image permanently.

Change-Id: Ia684b408c1eb0c474c04bf8c04eb339f9a497e47
Co-developed-by: Claude <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: OhYee <oyohyee@oyohyee.com>
Two import flows (excalidash backup + legacy SQLite) both pre-uploaded
files to S3 using the imported drawing id, then on a foreign-owner id
collision wrote the row under a fresh uuidv4() — leaving every uploaded
object as a permanent S3 orphan keyed under an id that no DB row knew
about.

Now both routes pre-resolve the final drawing id before uploading by
peeking at any pre-existing drawing of the same id; if it belongs to
another user, we generate the UUID up front and feed it to both the
upload and the in-transaction create. The transaction still re-checks
authoritatively, but always writes under the same finalId.

Also replaces the per-drawing serial await with bounded Promise.all
batches of 8. A 5000-drawing import previously serialised every S3 PUT
inside the request, easily hitting reverse-proxy timeouts; the bound
keeps memory and socket pressure flat without saturating the bucket.

Change-Id: I7ae77f642f097c295c71771d59ecea0d045661d3
Co-developed-by: Claude <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: OhYee <oyohyee@oyohyee.com>
…leted

POST /drawings/:id/duplicate previously copied elements/appState/files
verbatim, including the dataURLs that point at the *original*
drawing's S3 prefix (`/api/files/origId/fileId` or
`https://.../excalidash/owner/origId/fileId.ext`). The duplicate's
images therefore pointed at storage owned by the original. Deleting
the original then ran the new prefix-scoped S3 cleanup and silently
broke every image in the duplicate.

Now the duplicate handler does a server-side CopyObject for every
S3File row of the original under the new drawingId path, writes a
fresh (newDrawingId, fileId) S3File row, and rewrites the dataURLs
in the new drawing's files JSON to match the new drawing id. Each
drawing — original and duplicate — owns its own storage from the
moment of duplication, so DELETE /drawings/:origId can no longer
strand the duplicate.

Adds a bucket-internal copyS3Object() helper to s3.ts (uses S3
CopyObject; no download/re-upload).

Change-Id: If57d006fce8964fe04e702680e96866a3ec3b6f0
Co-developed-by: Claude <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…S3 upload

Symptom: every save produced a megabyte-scale `Drawing.preview`
column. The SVG had `<image href="data:image/png;base64,...">` for
each image element instead of the S3 URL.

Root cause: the frontend generates the preview SVG from the canvas
state at save time, before the round-trip uploads files to S3. The
SVG embeds whatever dataURL the file currently has, which at save
time is still base64. processFilesForS3 only rewrites the dataURL
inside `Drawing.files`, never the parallel preview SVG, so the
inlined base64 stays in the preview forever.

Add a rewritePreviewForS3 helper in fileProcessing.ts that diffs the
original-vs-processed files map and applies the resulting URL
substitutions to the preview string. Wire it into the POST and PUT
handlers so the preview saved alongside the upload reflects the same
S3 URLs as Drawing.files.

Best-effort string substitution: it works because the same dataURL
string is character-identical in both `files[fileId].dataURL` and the
preview SVG's href. If frontend encoding ever diverges, the worst
case is the preview is left as-is — never crashes.

Change-Id: I427edabcccc1453bf0f69f5dde33c9648cfcb370
Co-developed-by: Claude <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@OhYee OhYee force-pushed the feat/s3-image-upload branch from 92b08bb to 3e48f5d Compare May 6, 2026 10:21
OhYee and others added 4 commits May 6, 2026 18:32
The duplicate handler used to "log and continue" when copyS3Object
threw — the duplicated `Drawing.files` JSON kept the original
drawing's dataURL for that fileId, the new drawing landed pointing at
the original's S3 prefix, and the next deletion of the original
silently broke the duplicate's images.

Restructure the flow so the new drawing row is only created after
every S3 copy succeeds:

1. Pre-allocate the new drawing id with crypto.randomUUID().
2. Copy every S3 object to the new prefix; track the destination
   keys in a list as we go.
3. If any copy throws, delete every key we managed to copy so the
   bucket isn't left with orphan objects, and return 500. The
   original drawing is never touched.
4. Only after all copies succeed, open a single Prisma transaction
   that creates the drawing row and the S3File rows together — so a
   DB failure can't leave S3 objects with no row, and the new
   drawing is always observable in a fully-consistent state.

Change-Id: Ic930d74a3831ce325aaf843f3dd667d3ddeafbe9
Co-developed-by: Claude <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…flict

The earlier fix pre-resolved a final drawing id before uploading to
S3 so the upload key and the eventual DB row would match. That works
for the common case, but if a drawing with the same id is created
between the pre-check and the in-transaction findUnique (race), the
in-tx tx.drawing.create({ id: finalId }) hits a unique-constraint
error and aborts the whole import — so a stray race kills the whole
batch.

In the conflict branch (existing.userId !== req.user.id), check
whether the pre-resolved finalId still equals the imported id: if so,
the pre-upload check missed the race and we'd be writing under a
guaranteed conflict. Generate a fresh UUID for the row instead. The
S3 objects we already PUT under the imported-id path become orphans
that the storage-management tools can sweep up; that's vastly better
than failing the whole batch.

Applied to both /import/excalidash and /import/sqlite/legacy.

Change-Id: Ie1997da1511b736cee33b03e19ae6e06662df1f4
Co-developed-by: Claude <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous shape of 20260506130000_s3file_composite_pk did
DROP TABLE + CREATE TABLE, which silently lost every existing
(fileId → s3Key, mimeType, userId) mapping. For private-bucket
deployments that turns every /api/files/:drawingId/:fileId into a
404 until each affected drawing is re-saved.

Replace the drop with a copy: extract the drawingId from each
s3Key (which has shape `{prefix}/{userId}/{drawingId}/{fileId}.{ext}`)
by chaining INSTR + SUBSTR three times — SQLite has no native
split() and the bundled engine doesn't have REVERSE either, so we
walk the string from the start. Rows whose s3Key doesn't have at
least three '/' are skipped (they can only come from a custom
S3_KEY_PREFIX containing '/', which would otherwise misattribute the
drawingId; operators in that case still need to re-save).

INSERT OR IGNORE protects against duplicate (drawingId, fileId)
pairs theoretically allowed under the old schema; the survivor wins.

Change-Id: I69a27a5deabdcc43b7bfdcc0b1445d7dcb9fd4db
Co-developed-by: Claude <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…dy docs

Round-up of review feedback that doesn't change behaviour for the
common case but plugs operability/correctness gaps:

- security.ts sanitiser:
  * accept `http://` in addition to `https://` for S3-style URLs, so
    a plain-HTTP local MinIO `S3_PUBLIC_URL` doesn't get blanked on
    save and break image loading
  * snapshot the file-id list with `Object.keys()` before deleting
    invalid ids — `for...in` while mutating can skip entries on some
    engines, leaving an unsafe key in the sanitised output
- fileProcessing.processFilesForS3: bound parallel S3 PUTs to a
  concurrency of 8. Without this, a paste of N images fired N
  parallel uploads, which can saturate S3 connection pools and
  produce inconsistent partial-failure states on flaky networks.
  Imports were already bounded; this brings the save path in line.
- routes/files.ts: header comment block now reflects the actual
  /files/:drawingId/:fileId route shape and the
  /api/files/:drawingId/:fileId redirect path.
- backend/.env.example: rewrites the S3 section to describe the
  current backend-driven upload flow (no more browser-direct PUT,
  no more CORS rules), and documents S3_FORCE_PATH_STYLE,
  S3_KEY_PREFIX, plus the http://-publicUrl allowance for local
  MinIO.
- frontend Editor.tsx: drop `initialFileIdsRef`. It was only read by
  the Excalidraw 0.18 paste-detection block that the backend-direct
  refactor removed, so the ref now just leaks state.

Change-Id: I95b8cd0480845aaceeea37aa5ede3e76bd8c479b
Co-developed-by: Claude <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@OhYee
Copy link
Copy Markdown
Contributor Author

OhYee commented May 6, 2026

Thanks @copilot. Addressed each item below; new commits keep the per-suggestion separation so each fix can be reviewed independently:

Critical

  • fix(drawings): roll back duplicate when an S3 copy fails — duplicate handler now pre-allocates the new id, copies all S3 objects first, deletes any partial copies and returns 500 on failure, and only then opens a single transaction that creates the row + S3File entries together. The original drawing is never touched.
  • fix(import): regenerate drawing id inside the transaction on race conflict — both import routes detect the pre-upload→tx race and generate a fresh id rather than letting tx.drawing.create hit a unique-constraint and abort the whole batch. The straggler S3 objects under the imported-id path become orphans the storage tools can sweep up.
  • fix(s3): preserve S3File rows during composite-PK migration — replaces the unconditional DROP with a copy: extracts drawingId from each existing s3Key by chaining INSTR/SUBSTR (SQLite has no split() and the bundled engine has no REVERSE), so private-bucket deployments keep their lookups. Rows whose s3Key doesn't have at least three '/' (i.e. a custom S3_KEY_PREFIX with internal '/') are skipped — those operators still need to re-save.

High / cleanup

  • fix(s3): http:// publicUrl, bounded uploads, snapshot-then-mutate; tidy docs
    • sanitiser: accept http:// (local MinIO S3_PUBLIC_URL) in addition to https://
    • sanitiser: snapshot file-id keys with Object.keys() before deleting invalid ones — for...in while mutating can skip entries on some engines
    • processFilesForS3: bound the parallel S3 PUTs to concurrency 8 (paste-of-N-images was firing N parallel uploads and saturating the connection pool)
    • routes/files.ts: header comment block now matches the actual /files/:drawingId/:fileId route
    • .env.example: rewrote the S3 section to describe the backend-direct flow (no presigned PUT URLs, no CORS rules), documented S3_FORCE_PATH_STYLE and S3_KEY_PREFIX, and noted that http:// is allowed for local MinIO
    • Editor.tsx: dropped initialFileIdsRef (only read by the Excalidraw 0.18 paste-detection block that the backend-direct refactor removed)

Skipped: dedicated integration tests for GET /files/:drawingId/:fileId access matrix (owner / explicit grantee / link-share / 404). That's a meaningful test-writing effort and I'd rather queue it as a follow-up than balloon this PR.

RawSalmon69 added a commit to RawSalmon69/HomeLab-Terraform that referenced this pull request May 25, 2026
… zstd

ExcaliDash was painfully slow over the home's asymmetric uplink
(0.5–2 Mbps after r8169 SG/TSO/GSO fix). Three issues found:

1. `/api/drawings` list returned 12.6 MB of inline SVG previews (issue #59).
2. Drawing files column stored 13.9 MB of base64 images in SQLite — every
   open/save shipped 2–18 MB over the tunnel (issue #145).
3. No response compression in Caddy.

Changes:

- caddy/Caddyfile: add `encode zstd gzip` to (common) → 4× smaller JSON.
- excalidash-backend: switch to custom image `excalidash-backend:s3`
  built from PR ZimengXiong/ExcaliDash#163 (S3 image upload). Wired to
  Cloudflare R2 bucket `excalidraw` via `excalidraw-bucket.phanthawas.dev`
  custom domain so images load from CF edge, bypassing the home uplink.
- excalidash/patches/drawings.js: bind-mounted override that drops
  `preview` from `summarySelect` in both `/drawings` list endpoints.
  Frontend lazy-fetches preview per card. List response 12.6 MB → ~2 KB.
- .env.example: document R2/S3 env var schema (values live in box .env).
- .gitignore / glance / hermes / amp / sync / conflux: bundled pre-existing
  local edits made during the same investigation.

Existing 26.6 MB of base64 in SQLite migrated to R2 via one-shot script
(reused backend's processFilesForS3 + rewritePreviewForS3). VACUUM after:
DB 39 MB → 848 KB.

Also fixed on box (config not in repo):
- /etc/systemd/system/nic-offload.service: persistent ethtool -K enp3s0
  sg/tso/gso on (r8169 ships these off, killing TCP uplink → 4× boost).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

S3 Object Storage for Images

2 participants