Design Decisions

eugnmueller-87 edited this page Jun 24, 2026 · 1 revision

This page records the decisions that shaped Aushang, with the reasoning taken straight from the code and docs. It is honest about trade-offs and about the line between demo and production.

Redact the text, not the image

Decision: Run PII redaction on the OCR'd text before the LLM call; store the original (unblurred) photo as the member image.
Why: A notice board is public, and the same photo hangs on the wall, so its text isn't sensitive to the org's own members. The real privacy risks are (1) shipping raw PII to an external LLM and (2) uncontrolled access to the raw original. Both are handled directly: the first by redacting before the call, the second by locking down access to the original.
Earlier plan: The Phase 1/2 docs described blurring redacted regions in the image (word boxes are still captured in ocr.py for this). Real-world testing showed that blurred Rückblick (reflection) photos were too poor to be useful, so the blur was dropped and access was locked down instead.
Trade-off: A member with consent can see real children's faces on a Rückblick, so reflection originals are deleted at publish and the clear-photo path is force-blocked for them.
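The ordering in the decision above can be sketched as a small function with injected stages. This is a minimal illustration, not the worker's code: `extract_from_photo` and its three callables (`ocr`, `redact`, `call_llm`) are hypothetical names chosen for the example.

```python
from typing import Callable


def extract_from_photo(
    photo: bytes,
    ocr: Callable[[bytes], str],
    redact: Callable[[str], str],
    call_llm: Callable[[str], dict],
) -> dict:
    """Run OCR, then redaction, then the LLM call, in that order.

    All names here are hypothetical. The point is structural: the only
    string handed to call_llm is the output of redact, so raw OCR text
    has no path to the external provider.
    """
    raw_text = ocr(photo)          # raw text, may contain PII
    safe_text = redact(raw_text)   # PII masked before anything leaves
    return call_llm(safe_text)     # the provider only ever sees safe_text
```

Passing the stages in as callables makes the ordering easy to verify with stubs: a fake `call_llm` can record what it received and a test can assert that no raw value reached it.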

Fail-closed redaction, tuned not to mangle

Decision: Mask anything at or above the threshold; over-mask rather than under-mask. But raise per-entity floors and exclude LOCATION.
Why: "Over-masking costs one tap in review; under-masking leaks PII." But a board is full of dates, town names, and festival headings, and naive ML masking mangled real notices. So deterministic high-signal PII (phone/email/IBAN/birthdate) is caught by a regex pack at confidence 1.0, while fuzzy spaCy guesses are held to higher bars (PERSON ≥0.6, ML PHONE_NUMBER ≥0.85) and LOCATION is dropped entirely.
Backstop: The admin reviews every draft, so the human is the final redaction check.
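The thresholding described above might look roughly like the sketch below. It is an assumption-laden illustration: the `Span` type, the function names, the single email pattern standing in for the regex pack, and the `DEFAULT_FLOOR` value are all hypothetical; only the PERSON and PHONE_NUMBER floors and the LOCATION exclusion come from the text.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    start: int
    end: int
    entity: str
    score: float


DEFAULT_FLOOR = 0.5  # assumption: the base threshold is not stated in the source
ENTITY_FLOORS = {"PERSON": 0.6, "PHONE_NUMBER": 0.85}  # floors quoted above
EXCLUDED_ENTITIES = {"LOCATION"}  # dropped entirely

# Stand-in for the regex pack: one simplified email pattern, confidence 1.0.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def regex_pack(text: str) -> list[Span]:
    return [Span(m.start(), m.end(), "EMAIL", 1.0) for m in EMAIL_RE.finditer(text)]


def should_mask(span: Span) -> bool:
    if span.entity in EXCLUDED_ENTITIES:
        return False
    return span.score >= ENTITY_FLOORS.get(span.entity, DEFAULT_FLOOR)


def redact(text: str, ml_spans: list[Span]) -> str:
    kept = sorted(
        (s for s in regex_pack(text) + ml_spans if should_mask(s)),
        key=lambda s: s.start,
    )
    # Merge overlapping spans so the union is masked (fail-closed).
    merged: list[Span] = []
    for s in kept:
        if merged and s.start < merged[-1].end:
            last = merged[-1]
            merged[-1] = Span(last.start, max(last.end, s.end), last.entity, max(last.score, s.score))
        else:
            merged.append(s)
    # Replace right to left so earlier offsets stay valid.
    for s in reversed(merged):
        text = text[: s.start] + f"<{s.entity}>" + text[s.end :]
    return text
```

In this sketch a high-scoring LOCATION guess is ignored and a PERSON guess below 0.6 is left alone, while a regex hit is always masked because its score of 1.0 clears every floor.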

LLM advises, deterministic code decides

Decision: Keep two columns, content_type_suggested (LLM) and content_type (admin-confirmed). Routing reads only the confirmed value, which is nullable with no default.
Why: A wrong LLM guess must never auto-route or auto-create a calendar event. NULL ("unconfirmed") is deliberately distinct from info (a real fallback). Nothing reaches a member without an explicit admin tap.
Bonus: Raw LLM output stays immutable in posts.extraction; admin edits live in post_details, so edits survive without re-running the LLM.
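A minimal sketch of the "LLM advises, code decides" rule, assuming a hypothetical `route` function and hypothetical destination names (`awaiting_admin`, `calendar`, `feed`). The `event` type value is also an assumption; the source only names `info`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Post:
    content_type_suggested: Optional[str]  # LLM guess, advisory only
    content_type: Optional[str] = None     # admin-confirmed; None mirrors SQL NULL


def route(post: Post) -> str:
    """Decide where a post goes using only the confirmed column."""
    if post.content_type is None:
        # Unconfirmed: nothing is routed, whatever the LLM suggested.
        return "awaiting_admin"
    if post.content_type == "event":  # hypothetical type value
        return "calendar"
    return "feed"
```

Because `route` never reads `content_type_suggested`, a wrong suggestion cannot create a calendar entry on its own; it only becomes effective once an admin copies or corrects it into `content_type`.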

Multi-provider LLM, Claude by default

Decision: The worker can call Anthropic / Mistral / OpenAI / Gemini, selected by LLM_PROVIDER; all return the same validated ExtractionEnvelope. The maintainer's deployment uses Claude (claude-haiku-4-5).
Why: Provider choice doesn't change the privacy model (only redacted text is ever sent), but it does change data residency: Anthropic/OpenAI are US; Mistral (La Plateforme) is EU. Self-hosters pick a provider and bring their own key. Set LLM_PROVIDER=mistral for strict EU residency; nothing else in the pipeline changes.
Honest note: The default (Claude, US) is disclosed on /datenschutz; moving extraction into the EU is a tracked follow-up.
Engineering: Each provider's structured-output mechanism differs: Anthropic uses output_config/json_schema, OpenAI uses json_schema in strict mode, Gemini uses responseSchema (strips additionalProperties + array types), and Mistral uses json_object with the schema embedded in the prompt. Strict mode forbids oneOf, so all five typed sub-payloads ride as nullable siblings under details; extract() collapses the matching branch. (Anthropic caps schemas at 16 nullable params, so only genuinely date-semantic fields stay nullable.)
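The "nullable siblings" shape and the collapse step can be illustrated as follows. This is a sketch under assumptions, not the project's `extract()`: the function name `collapse_details`, the `content_type` key in the raw output, and the sibling names used in the example are hypothetical.

```python
from typing import Any, Optional


def collapse_details(raw: dict[str, Any]) -> dict[str, Any]:
    """Collapse nullable sibling payloads into the single matching branch.

    Without oneOf, the schema carries one nullable object per content type
    under "details". Only the sibling whose key matches the declared type
    is kept; every other sibling is discarded.
    """
    kind = raw["content_type"]            # hypothetical field name
    siblings = raw.get("details") or {}
    payload: Optional[dict[str, Any]] = siblings.get(kind)
    return {"content_type": kind, "details": payload}


# Example raw output with hypothetical sibling names.
raw_output = {
    "content_type": "event",
    "details": {
        "event": {"title": "Sommerfest", "date": "2026-07-04"},
        "info": None,
        "reflection": None,
    },
}
collapsed = collapse_details(raw_output)
```

After the collapse, downstream code sees a single `details` object (or `None`) instead of a bag of mostly-null siblings, which keeps the provider-specific schema workaround out of the rest of the pipeline.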

Operator-provisioned three-role model

Decision: Three roles, superadmin / admin / member; no public signup, no self-service join. (Supersedes the brief's original two-role, self-service design.)
Why: The trust model is cleaner: the operator creates orgs and the first admin; admins manage their own members. It also removed an entire class of onboarding-link attacks (the Phase 1 review found that editable magic-link intent was a privilege-escalation vector).
Cost: The old self-service tables (invites/join_requests/pending_onboarding) were deleted in 0005; delete_user_account (which could orphan an org) was dropped in 0007.

Security at four independent layers

The four layers are middleware, route guards, security-definer RPCs, and RLS/column grants. A failure in any single layer doesn't breach the system. This is documented as "the security model is the architecture": touching one layer requires calling it out in the PR.

Column-REVOKE for PII (not just a view)

Decision: REVOKE SELECT on posts from authenticated, then re-GRANT only the non-PII columns.
Why: RLS gates rows, not columns; a member could read PII columns from the base table, bypassing the posts_public view (the Phase 1 critical finding). REVOKE makes admin PII access server-only by construction: even an admin's browser client can't read PII.
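For illustration, the revoke-then-regrant pair can be generated from an allowlist of non-PII columns. The helper `column_lockdown_sql` and the column names below are hypothetical; the emitted statements use standard PostgreSQL column-level GRANT syntax, and the identifiers are assumed to be trusted (no quoting or escaping is done here).

```python
def column_lockdown_sql(table: str, role: str, public_columns: list[str]) -> list[str]:
    """Build the REVOKE / column-level GRANT pair for one table and role.

    Assumes trusted identifiers. An empty allowlist is rejected because
    'GRANT SELECT () ...' is not valid SQL.
    """
    if not public_columns:
        raise ValueError("public_columns must not be empty")
    cols = ", ".join(public_columns)
    return [
        f"REVOKE SELECT ON {table} FROM {role};",
        f"GRANT SELECT ({cols}) ON {table} TO {role};",
    ]


# Hypothetical non-PII columns, for the example only.
statements = column_lockdown_sql("posts", "authenticated", ["id", "org_id", "title", "content_type"])
```

Keeping the allowlist explicit means a newly added column is unreadable by the `authenticated` role until someone deliberately grants it, which matches the fail-closed stance elsewhere in the design.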

Double-gated, default-off photo consent

Seeing the clear photo requires member opt-in AND an admin per-post release, both evaluated server-side, and the photo is delivered only via a short-TTL signed URL. Both flags default to false, so existing rows are safe without a backfill. Per-viewer "see the real photo" was deliberately rejected for reflections (the "multiple children" problem; see docs/COVER_IMAGES_SPEC.md).
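The double gate reduces to a small server-side predicate. The sketch below uses hypothetical names throughout (`MemberConsent`, `PostFlags`, `may_see_clear_photo`, and the field names) and leaves out the signed-URL step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemberConsent:
    photo_opt_in: bool = False    # member-side gate, default off


@dataclass(frozen=True)
class PostFlags:
    photo_released: bool = False  # admin per-post gate, default off
    is_reflection: bool = False   # reflections are always blocked


def may_see_clear_photo(member: MemberConsent, post: PostFlags) -> bool:
    """Return True only if both gates are open and the post is not a reflection."""
    if post.is_reflection:
        return False  # force-blocked regardless of consent or release
    return member.photo_opt_in and post.photo_released
```

Because both dataclass defaults are `False`, a row that predates the feature evaluates to "blocked" without any migration touching it.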

Native Android via Capacitor remote-URL shell

Decision: A remote-URL native shell (loads the server-rendered app) plus native camera (@capacitor/camera) feeding the same redaction pipeline.
Why: The server-rendered app keeps its full security model unchanged; the native layer adds only camera, launcher icons, and a cloud AAB build. iOS can follow later from the same project (needs a Mac).

AI cover illustrations — built but dormant, fail-open

Decorative covers are generated from the redacted extraction: no PII, no people, and object-only scenes keyed to content_type rather than the notice's specifics, so a cheerful image can't land next to an illness notice. The image call sits behind the same privacy boundary as the text call. The feature is inert until an EU FLUX.1 [schnell] endpoint is configured, and it is fail-open: a missing cover never fails a post.
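The "dormant until configured, fail-open on error" behaviour can be sketched like this. The function name, the `COVER_ENDPOINT_URL` environment variable, and the `generate` callable are assumptions made for the example; the source does not name them.

```python
import logging
import os
from typing import Callable, Optional

logger = logging.getLogger("covers")


def try_generate_cover(
    content_type: str,
    generate: Callable[[str, str], str],
    endpoint: Optional[str] = None,
) -> Optional[str]:
    """Return a cover reference, or None if covers are dormant or generation fails.

    COVER_ENDPOINT_URL is a hypothetical variable name. The caller publishes
    the post either way, so a None result is not an error.
    """
    if endpoint is None:
        endpoint = os.environ.get("COVER_ENDPOINT_URL")
    if not endpoint:
        return None  # dormant: no endpoint configured
    try:
        # Only the content type is passed, never the notice's specifics.
        return generate(endpoint, content_type)
    except Exception:
        logger.warning("cover generation failed; publishing without a cover", exc_info=True)
        return None
```

Catching a broad `Exception` is deliberate in this sketch: a decorative image is optional, so any failure degrades to "no cover" instead of blocking publication.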

Honest demo-vs-production lines

| Area | Status |
| --- | --- |
| Worker transport | Reachable over VPS HTTP today; TLS front (Caddy/Traefik + worker. subdomain) is a follow-up. |
| DB types | database.types.ts is a hand-authored stub, not generated. |
| CSP | Report-only, not enforcing. |
| Rate limiting | Login/recovery relies on Supabase built-ins; app-level token bucket is a hardening item (QR apply has its own per-code DB limit). |
| npm audit postcss | A transitive advisory inside Next.js's tree; the "fix" downgrades Next to 9.x and is intentionally not applied. |
| LLM residency | Default extraction is US (Claude); EU move is tracked, disclosed on /datenschutz. |
| Covers | Built, dormant. |