Skip to content

Commit 74be559

Browse files
DaviRain-Suclaude
andcommitted
Fix audit backlog: upvotes, library sort, job double-run, 简繁, perf split, tests
User-visible bugs: - Upvotes always 0: feed + comments now merge live votes-table COUNT(*) (functions/routes/social.ts), not the never-incremented up column. - Dead library sort: listBooks computes readsN (distinct readers from progress) and liners (highlights), sorts in SQL, and exposes createdAt (functions/lib/catalog.ts). - Platform job double-run: runPlatformJob claims atomically (WHERE status!='running' OR started_at<stale); runDuePlatformJobs also sweeps stale 'running' jobs so a crashed run is recovered, not lost (functions/lib/platform.ts). - 简→繁 wrong chars: ambiguous chars default to the common reading (后→後, 里→裡, 余→餘) with exception phrases (皇后/公里/头发), applied via a placeholder pass so phrase output survives the char map (src/lib/zh-convert.js). Perf: - @mysten/sui + @simplewebauthn dynamic-imported in the sign-in/subscribe handlers → first-paint main chunk ~91KB → ~57KB gzip (onboarding/cli-auth/profile). Tests: - Extract money/auth checks to functions/lib/verify.mjs (Stripe HMAC, Sui payment match, constant-time compare, nonce binding); behavioral tests in test/verify.test.mjs — previously zero executed-code coverage. docs/AUDIT.md records the full audit + remaining backlog. Adversarially reviewed; the review's one finding (crash-recovery regression in the job claim) is fixed above. typecheck + build + tests green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
1 parent 387c5b8 commit 74be559

13 files changed

Lines changed: 378 additions & 91 deletions

File tree

docs/AUDIT.md

Lines changed: 92 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,92 @@
1+
# Liber — Code Audit (2026-05-31)
2+
3+
A multi-dimension audit (security, correctness, architecture, data/migrations,
4+
tests/CI, ops, product/UX/a11y, performance) of the Cloudflare Pages app. Each
5+
finding was grounded in real code and adversarially re-verified. Severities are
6+
the post-verification ratings.
7+
8+
## Fixed
9+
10+
### Security & cost (commits 1792e2c, 0e380f0, 387c5b8)
11+
- **CORS info-disclosure**`functions/api/[[route]].ts` reflected an arbitrary
12+
`Origin` with `credentials:true`, letting any third-party site read a logged-in
13+
user's private JSON (`/api/auth/me`, `/api/ai/conversations`). Now an explicit
14+
allowlist (liber-99x.pages.dev + preview subdomains + localhost + `ALLOWED_ORIGINS`).
15+
- **Privilege escalation** — any wallet user could self-mint a CLI token
16+
(`/auth/cli/token`) that `platformAuth`/`adminAuth` treated as `ADMIN_TOKEN`.
17+
Platform/graph (infra/cost) endpoints now require `ADMIN_TOKEN` (constant-time)
18+
or a CLI token from an `ADMIN_WALLETS`-listed wallet. Book *publishing* still
19+
accepts any CLI token (its intended use).
20+
- **SSRF**`/books/ingest` fetched an unvalidated `sourceUrl`. Now restricted to
21+
an https public-domain host allowlist (`INGEST_HOSTS` to extend); private/loopback
22+
hosts rejected.
23+
- **Wallet-login replay**`/auth/verify` did not bind the signed message to the
24+
nonce. It now requires the message to equal the exact server template
25+
(`loginMessage`), and the nonce is single-use.
26+
- **AI cost-abuse** — cookieless `/ai/chat` + `/platform/search` ran Workers AI
27+
unmetered. Now per-IP rate-limited (`AI_RATE_PER_MIN`, default 20/min) via a D1
28+
atomic counter (migration `0011_rate_limits`). NOTE: Cloudflare KV is **not**
29+
usable for this (eventually-consistent reads); Cloudflare's native Rate Limiting
30+
binding is **not** supported in Pages config. For hard burst limits, enable AI
31+
Gateway rate limiting (`AI_GATEWAY_ID`).
32+
- **Translation cache poisoning** — an empty (non-throwing) AI response was cached
33+
as a real translation forever. `functions/lib/ai.ts` now flags it `error:true` so
34+
the route skips the cache write.
35+
36+
### User-visible bugs & quality (this batch)
37+
- **Upvotes always 0** — the feed hardcoded shares to `up:0` and the comments list
38+
returned the stale `up` column, despite votes existing in the `votes` table. Both
39+
now merge live `COUNT(*)` (`functions/routes/social.ts`).
40+
- **Dead library sort** — real books had `readsN`/`liners` hardcoded to 0 and
41+
`created_at` dropped, so "最多人读 / 划线最多 / 最近上链" did nothing. `listBooks`
42+
now computes the metrics (distinct readers from `progress`, highlights from
43+
`highlights`) and sorts in SQL; books carry `createdAt` (`functions/lib/catalog.ts`).
44+
- **Platform job double-run**`runPlatformJob` had no claim guard, so the queue
45+
consumer and `/jobs/drain` could execute the same job twice. It now claims
46+
atomically (`UPDATE ... WHERE id=? AND status!='running'`, check `meta.changes`)
47+
(`functions/lib/platform.ts`).
48+
- **Simplified→traditional wrong chars** — S2T was a lossy reverse of a many-to-one
49+
map. Ambiguous chars now default to their most-common traditional form (后→後,
50+
里→裡, 余→餘) with exception phrases for the rest (皇后→皇后, 公里→公里, 头发→頭髮),
51+
applied via a placeholder pass so phrase output survives the char map
52+
(`src/lib/zh-convert.js`).
53+
54+
### Performance & tests (this batch)
55+
- **First-paint bundle**`@mysten/sui` (~32 kB gzip) and `@simplewebauthn` were
56+
in the main chunk. Now dynamic-imported in the sign-in/subscribe handlers
57+
(`product-onboarding.jsx`, `cli-auth.jsx`, `product-profile.jsx`) → main chunk
58+
dropped ~91 kB → ~57 kB gzip; the Sui SDK loads only on sign-in.
59+
- **Tests** — extracted the money/auth verification into `functions/lib/verify.mjs`
60+
(Stripe HMAC, Sui payment matching, constant-time compare, nonce binding) with
61+
behavioral tests in `test/verify.test.mjs` — previously zero executed-code coverage.
62+
63+
## Backlog (confirmed, not yet fixed)
64+
65+
- **MED (arch)**`Reader()` is a ~927-line god component (45 useState / 23
66+
useEffect, `src/components/product-reader.jsx`). Biggest future-velocity tax.
67+
- **MED (tests)** — auth/session, AI parsing, graph echoes, and platform job state
68+
still have no runtime tests; many `functions/` tests only string-grep source.
69+
No vitest/miniflare harness.
70+
- **MED (perf)** — no route-level code-splitting (~20 screens eager); Google Fonts
71+
render-blocking `@import`; JSON GETs (`/books`, `/charts`) lack `Cache-Control`;
72+
`getChapters` does serial R2 reads (N+1).
73+
- **MED (ops)** — no error tracking/alerting (≈4 `console.*` total); the platform
74+
queue has no dead-letter queue; AI Gateway off by default; `db:migrate` re-runs
75+
all migrations by hand (safe only while every migration is `IF NOT EXISTS`).
76+
- **MED (a11y)** — book-grid cards and AppBar nav are `div`/`a`+`onClick` with no
77+
role/tabIndex/keyboard support.
78+
- **LOW** — dual catalog source-of-truth (`window.BOOKS` vs `catalog.js`);
79+
duplicated `profileRef`/seed-fallback helpers; dead `Placeholder` export;
80+
embeddings ledger records the first vector's dim for all sids; "最近上链" frontend
81+
sort branch still missing (backend now provides `createdAt`).
82+
83+
Verifier **refuted** one finding: the knowledge-graph pipeline is not
84+
"misconfigured/inert" — it is a deliberate, consistently flag-gated
85+
(`GRAPH_ENABLED`) pre-launch feature.
86+
87+
## Methodology
88+
89+
8 parallel auditors read real code and reported findings with `file:line`; every
90+
critical/high finding was re-verified by an independent adversarial agent against
91+
the cited code (which down-rated several and refuted one). Deploys are local
92+
(`npm run deploy`); migrations are applied by hand (`npm run db:migrate`).

functions/lib/auth.ts

Lines changed: 4 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -9,6 +9,7 @@ import { getCookie } from "hono/cookie";
99
import type { Env, Variables } from "./types";
1010
import { first, run, id, now } from "./db";
1111
import { chain } from "./chains";
12+
import { timingSafeEqual } from "./verify.mjs";
1213

1314
type Ctx = Context<{ Bindings: Env; Variables: Variables }>;
1415

@@ -80,13 +81,9 @@ export async function createCliPublishToken(env: Env, user: UserRow): Promise<{
8081
return { token, expiresIn: CLI_TOKEN_TTL };
8182
}
8283

83-
// Constant-time string compare — avoids leaking ADMIN_TOKEN via response timing.
84-
export function constantTimeEqual(a?: string | null, b?: string | null): boolean {
85-
if (typeof a !== "string" || typeof b !== "string" || a.length !== b.length) return false;
86-
let diff = 0;
87-
for (let i = 0; i < a.length; i++) diff |= a.charCodeAt(i) ^ b.charCodeAt(i);
88-
return diff === 0;
89-
}
84+
// Constant-time string compare (shared, unit-tested impl in verify.mjs) —
85+
// avoids leaking ADMIN_TOKEN via response timing.
86+
export const constantTimeEqual = timingSafeEqual;
9087

9188
// Extract a Bearer token from the Authorization header, or null.
9289
export function bearerToken(c: Ctx): string | null {

functions/lib/catalog.ts

Lines changed: 26 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -409,6 +409,7 @@ export function normalizeBook(row: any) {
409409
long: row.description ?? row.long ?? "",
410410
featured: !!row.featured,
411411
dynamic: row.dynamic ?? true,
412+
createdAt: Number(row.created_at ?? row.createdAt ?? 0) || 0,
412413
license: row.license ?? "CC0-1.0",
413414
sourceUrl: row.source_url ?? row.sourceUrl ?? "",
414415
hasEpub: Boolean(
@@ -445,23 +446,33 @@ export async function countBooks(env: Env, opts: { cat?: string } = {}) {
445446
export async function listBooks(env: Env, opts: { cat?: string; sort?: string; limit?: number } = {}) {
446447
const cat = scopedCategory(opts.cat);
447448
const limit = Math.max(1, Math.min(Number(opts.limit) || 1000, 2000));
448-
const sql = `SELECT id, title, subtitle, author, category, lang, year, pages, words AS words_count,
449-
cover_class, seal, blurb, description, featured, walrus, arweave, sui_index,
450-
license, source_url, created_at
451-
FROM library_books
452-
${cat ? "WHERE category = ?" : ""}
453-
ORDER BY created_at DESC
449+
// Sort in SQL on real metrics so the ranking is correct even with a LIMIT.
450+
// readsN = distinct readers (progress), liners = total highlights.
451+
const sort = opts.sort || "reads";
452+
const order =
453+
sort === "lines" || sort === "划线最多" ? "liners DESC, lb.created_at DESC"
454+
: sort === "recent" || sort === "new" || sort === "最近上链" ? "lb.created_at DESC"
455+
: "readsN DESC, lb.created_at DESC";
456+
const sql = `SELECT lb.id, lb.title, lb.subtitle, lb.author, lb.category, lb.lang, lb.year, lb.pages,
457+
lb.words AS words_count, lb.cover_class, lb.seal, lb.blurb, lb.description,
458+
lb.featured, lb.walrus, lb.arweave, lb.sui_index, lb.license, lb.source_url,
459+
lb.created_at,
460+
COALESCE(p.readers, 0) AS readsN,
461+
COALESCE(h.lines_n, 0) AS liners
462+
FROM library_books lb
463+
LEFT JOIN (SELECT book_id, COUNT(DISTINCT user_id) AS readers FROM progress GROUP BY book_id) p ON p.book_id = lb.id
464+
LEFT JOIN (SELECT book_id, COUNT(*) AS lines_n FROM highlights GROUP BY book_id) h ON h.book_id = lb.id
465+
${cat ? "WHERE lb.category = ?" : ""}
466+
ORDER BY ${order}
454467
LIMIT ?`;
455-
const rows = await all<any>(
456-
env.DB,
457-
sql,
458-
...(cat ? [cat, limit] : [limit]),
459-
);
468+
const rows = await all<any>(env.DB, sql, ...(cat ? [cat, limit] : [limit]));
460469
const dynamic = rows.map(normalizeBook);
461-
let list = dynamic.length ? dynamic : S.BOOKS.map((b) => ({ ...b, dynamic: false }));
462-
if (cat && !dynamic.length) list = list.filter((b) => b.cat === cat);
463-
if (opts.sort === "lines") list = [...list].sort((a, b) => b.liners - a.liners);
464-
else list = [...list].sort((a, b) => b.readsN - a.readsN);
470+
if (dynamic.length) return dynamic;
471+
// seed fallback (no live catalog yet)
472+
let list = S.BOOKS.map((b) => ({ ...b, dynamic: false }));
473+
if (cat) list = list.filter((b) => b.cat === cat);
474+
if (sort === "lines" || sort === "划线最多") list = [...list].sort((a, b) => (b.liners || 0) - (a.liners || 0));
475+
else if (!(sort === "recent" || sort === "new" || sort === "最近上链")) list = [...list].sort((a, b) => (b.readsN || 0) - (a.readsN || 0));
465476
return list;
466477
}
467478

functions/lib/platform.ts

Lines changed: 21 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -378,12 +378,27 @@ export async function renderShareCard(env: Env, payload: Record<string, unknown>
378378
return { id: assetId, key, contentType: "image/png", width: 1200, height: 630 };
379379
}
380380

381+
// A job 'running' longer than this is treated as crashed (its isolate was
382+
// evicted / timed out before the failure branch ran) and becomes re-claimable.
383+
// Far longer than any real platform job, so a live job is never reclaimed.
384+
const PLATFORM_JOB_STALE_MS = 10 * 60 * 1000;
385+
381386
export async function runPlatformJob(env: Env, input: string | PlatformQueueMessage) {
382387
const jobId = typeof input === "string" ? input : input.id;
383388
const row = await first<any>(env.DB, `SELECT * FROM platform_jobs WHERE id = ?`, jobId);
384389
if (!row) throw new Error("未找到任务");
385390
const payload = parsePayload(row.payload);
386-
await run(env.DB, `UPDATE platform_jobs SET status = 'running', attempts = attempts + 1, started_at = ?, updated_at = ? WHERE id = ?`, now(), now(), jobId);
391+
// Atomically claim the job so concurrent callers (queue consumer vs /jobs/drain
392+
// vs /jobs/:id/run) cannot both execute it (double share-card render / double
393+
// Vectorize upsert). The claim also reclaims a job stuck 'running' past
394+
// PLATFORM_JOB_STALE_MS — a crashed prior run — which would otherwise never
395+
// complete (the queue silently acks the skipped retry; there is no DLQ).
396+
const claim = await run(
397+
env.DB,
398+
`UPDATE platform_jobs SET status = 'running', attempts = attempts + 1, started_at = ?, updated_at = ? WHERE id = ? AND (status != 'running' OR started_at < ?)`,
399+
now(), now(), jobId, now() - PLATFORM_JOB_STALE_MS,
400+
);
401+
if (!claim.meta?.changes) return { id: jobId, status: "running", skipped: true, reason: "任务已在执行中" };
387402
try {
388403
const result = row.type === "index-book"
389404
? await indexBookSemantics(env, String(row.target_id || payload.bookId || ""))
@@ -414,12 +429,16 @@ export async function runPlatformJob(env: Env, input: string | PlatformQueueMess
414429
}
415430

416431
export async function runDuePlatformJobs(env: Env, limit = 5) {
432+
// Pick up due queued jobs, plus jobs stuck 'running' past the stale window
433+
// (crashed prior runs) so they get re-claimed and completed rather than lost.
417434
const rows = await all<any>(
418435
env.DB,
419436
`SELECT id FROM platform_jobs
420-
WHERE status = 'queued' AND (run_after IS NULL OR run_after <= ?)
437+
WHERE (status = 'queued' AND (run_after IS NULL OR run_after <= ?))
438+
OR (status = 'running' AND started_at IS NOT NULL AND started_at < ?)
421439
ORDER BY priority DESC, created_at ASC LIMIT ?`,
422440
now(),
441+
now() - PLATFORM_JOB_STALE_MS,
423442
Math.max(1, Math.min(limit, 20)),
424443
);
425444
const results: any[] = [];

functions/lib/verify.mjs

Lines changed: 70 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,70 @@
1+
// Pure security/verification helpers, shared by the routes and unit-tested in
2+
// test/verify.test.mjs. Written as plain ESM JS (not .ts) so `node --test` on
3+
// CI (node 20, no type-stripping) can import it directly; the .ts routes import
4+
// it via allowJs. These are the app's highest-value money/auth checks, so they
5+
// live in one tested place rather than inline in route handlers.
6+
import { normalizeSuiAddress } from "@mysten/sui/utils";
7+
8+
// Constant-time string compare — avoids leaking a secret (ADMIN_TOKEN, an HMAC)
9+
// via response timing. Returns false for non-strings or length mismatch.
10+
export function timingSafeEqual(a, b) {
11+
if (typeof a !== "string" || typeof b !== "string" || a.length !== b.length) return false;
12+
let out = 0;
13+
for (let i = 0; i < a.length; i++) out |= a.charCodeAt(i) ^ b.charCodeAt(i);
14+
return out === 0;
15+
}
16+
17+
export async function hmacHex(secret, payload) {
18+
const enc = new TextEncoder();
19+
const key = await crypto.subtle.importKey("raw", enc.encode(secret), { name: "HMAC", hash: "SHA-256" }, false, ["sign"]);
20+
const sig = await crypto.subtle.sign("HMAC", key, enc.encode(payload));
21+
return [...new Uint8Array(sig)].map((b) => b.toString(16).padStart(2, "0")).join("");
22+
}
23+
24+
// Stripe webhook signature: HMAC-SHA256 over `${t}.${body}`, compared in
25+
// constant time against any v1 signature in the stripe-signature header.
26+
export async function validStripeSignature(secret, header, body) {
27+
if (!secret || !header) return false;
28+
const parts = header.split(",").map((p) => p.split("="));
29+
const timestamp = parts.find(([k]) => k === "t")?.[1];
30+
const sigs = parts.filter(([k]) => k === "v1").map(([, v]) => v);
31+
if (!timestamp || !sigs.length) return false;
32+
const expected = await hmacHex(secret, `${timestamp}.${body}`);
33+
return sigs.some((sig) => timingSafeEqual(sig, expected));
34+
}
35+
36+
export function ownerAddress(owner) {
37+
if (typeof owner === "string") return owner;
38+
if (owner?.AddressOwner) return owner.AddressOwner;
39+
if (owner?.addressOwner) return owner.addressOwner;
40+
return null;
41+
}
42+
43+
export function sameSuiAddress(a, b) {
44+
if (!a || !b) return false;
45+
try {
46+
return normalizeSuiAddress(a) === normalizeSuiAddress(b);
47+
} catch {
48+
return a.toLowerCase() === b.toLowerCase();
49+
}
50+
}
51+
52+
// True when `balanceChanges` shows at least `amount` (atomic units) of
53+
// `coinType` credited to `treasury`. Mirrors the /crypto/confirm payment check
54+
// so a forged/short/wrong-coin transfer cannot unlock Pro.
55+
export function paymentReceived(balanceChanges, { coinType, treasury, amount }) {
56+
let need;
57+
try { need = BigInt(amount); } catch { return false; }
58+
return (balanceChanges || []).some((bc) => {
59+
if (bc.coinType !== coinType) return false;
60+
if (!sameSuiAddress(ownerAddress(bc.owner), treasury)) return false;
61+
try { return BigInt(bc.amount) >= need; } catch { return false; }
62+
});
63+
}
64+
65+
// The exact message a wallet must sign to log in. Issued by /auth/nonce and
66+
// re-derived by /auth/verify, which rejects a signature whose message does not
67+
// equal this — binding the signature to the single-use nonce (anti-replay).
68+
export function loginMessage(nonce) {
69+
return `Liber 登录\nnonce: ${nonce}`;
70+
}

functions/routes/auth.ts

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -10,6 +10,7 @@ import {
1010
passkeyRegisterOptions, passkeyRegisterVerify, passkeyLoginOptions, passkeyLoginVerify,
1111
} from "../lib/passkey";
1212
import { readingStats } from "../lib/reading-summary";
13+
import { loginMessage } from "../lib/verify.mjs";
1314
import { run } from "../lib/db";
1415

1516
const auth = new Hono<{ Bindings: Env; Variables: Variables }>();
@@ -19,7 +20,7 @@ const cookieOpts = { httpOnly: true, secure: true, sameSite: "Lax" as const, max
1920
// 1) request a nonce + the message the wallet should sign
2021
auth.post("/nonce", async (c) => {
2122
const nonce = await issueNonce(c.env);
22-
return c.json({ nonce, message: `Liber 登录\nnonce: ${nonce}` });
23+
return c.json({ nonce, message: loginMessage(nonce) });
2324
});
2425

2526
// 2) verify a wallet signature via the active chain adapter, then mint a session.
@@ -29,7 +30,7 @@ auth.post("/verify", async (c) => {
2930
// server-issued message template (see /nonce). Without this, message/nonce are
3031
// independent fields and a captured (message, signature) pair could be replayed
3132
// with a freshly requested nonce. The nonce itself is single-use (consumed).
32-
if (typeof message !== "string" || message !== `Liber 登录\nnonce: ${nonce}`) {
33+
if (typeof message !== "string" || message !== loginMessage(nonce)) {
3334
return c.json({ error: "签名消息与 nonce 不匹配" }, 400);
3435
}
3536
if (!(await consumeNonce(c.env, nonce))) return c.json({ error: "nonce 无效或已过期" }, 400);

0 commit comments

Comments
 (0)