Design Decisions
The decisions that shaped Aushang, with the reasoning straight from the code and docs. Honest about trade-offs and the demo-vs-production line.
| Decision | Run PII redaction on the OCR'd text before the LLM call; store the original (unblurred) photo as the member image. |
| Why | A notice board is public — the same photo hangs on the wall. Its text isn't sensitive to the org's own members. The real privacy risks are (1) shipping raw PII to an external LLM and (2) uncontrolled access to the raw original. Both are handled directly. |
| Earlier plan | Phase 1/2 docs described blurring redacted regions in the image (word boxes are still captured in ocr.py for this). Real-world testing showed blurred Rückblick photos were too poor to be useful → dropped the blur, locked down access instead. |
| Trade-off | With unblurred originals, a member with consent could see real children's faces in a Rückblick (reflection) photo, so reflection originals are deleted at publish and the clear-photo path is force-blocked for them. |
| Decision | Mask anything ≥ threshold; over-mask rather than under-mask. But raise per-entity floors and exclude LOCATION. |
| Why | "Over-masking costs one tap in review; under-masking leaks PII." But a board is full of dates, town names, and festival headings; naive ML masking mangled real notices. So deterministic high-signal PII (phone/email/IBAN/birthdate) is caught by a regex pack at confidence 1.0, while fuzzy spaCy guesses (PERSON ≥0.6, ML PHONE_NUMBER ≥0.85) are held to higher bars and LOCATION is dropped entirely. |
| Backstop | The admin reviews every draft — the human is the final redaction check. |
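
The per-entity floors above can be sketched as a small filter. This is a minimal illustration, not the project's code: `Detection`, `should_mask`, and `redact` are hypothetical names, the `DEFAULT_THRESHOLD` value is a placeholder (the source does not state the general threshold), and overlapping detections are not handled.

```python
from dataclasses import dataclass

# Floors quoted in the table above; everything else here is illustrative.
ENTITY_FLOORS = {"PERSON": 0.6, "PHONE_NUMBER": 0.85}
EXCLUDED_ENTITIES = {"LOCATION"}
DEFAULT_THRESHOLD = 0.5  # assumption: placeholder, not the project's real value


@dataclass(frozen=True)
class Detection:
    entity_type: str
    score: float  # regex-pack hits would carry 1.0
    start: int
    end: int


def should_mask(d: Detection) -> bool:
    """Mask anything at or above its floor; excluded entity types never mask."""
    if d.entity_type in EXCLUDED_ENTITIES:
        return False
    return d.score >= ENTITY_FLOORS.get(d.entity_type, DEFAULT_THRESHOLD)


def redact(text: str, detections: list[Detection]) -> str:
    """Replace each maskable span with a <TYPE> placeholder."""
    kept = [d for d in detections if should_mask(d)]
    # Work right to left so earlier offsets stay valid after each replacement.
    for d in sorted(kept, key=lambda d: d.start, reverse=True):
        text = text[:d.start] + f"<{d.entity_type}>" + text[d.end:]
    return text
```

A regex-detected phone number (score 1.0) clears the 0.85 floor, while a 0.55 PERSON guess and any LOCATION hit are left in place for the admin to judge in review.
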
| Decision | Two columns: content_type_suggested (LLM) and content_type (admin-confirmed). Routing reads only the confirmed value, which is nullable with no default. |
| Why | A wrong LLM guess must never auto-route or auto-create a calendar event. NULL ("unconfirmed") is deliberately distinct from info (a real fallback). Nothing reaches a member without an explicit admin tap. |
| Bonus | Raw LLM output stays immutable in posts.extraction; admin edits live in post_details, so edits survive without re-running the LLM. |
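
A minimal sketch of the two-column rule, assuming a post is modelled as a small record. The `Post` class and the `route` and `confirm` helpers are hypothetical, and `"event"` is an assumed type name; only `info`, `content_type_suggested`, and `content_type` come from the table above.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Post:
    content_type_suggested: Optional[str]  # LLM guess, advisory only
    content_type: Optional[str] = None     # admin-confirmed; None means "unconfirmed"


def route(post: Post) -> Optional[str]:
    """Return the type to route on, or None if the post must stay in review.

    The suggestion is deliberately never consulted here.
    """
    return post.content_type


def confirm(post: Post, admin_choice: str) -> Post:
    """Record the admin's explicit choice; the suggestion stays untouched."""
    return replace(post, content_type=admin_choice)
```

Because `None` and `"info"` are different values, an unreviewed post cannot be mistaken for one that an admin deliberately filed under the fallback type.
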
| Decision | The worker can call Anthropic / Mistral / OpenAI / Gemini, selected by LLM_PROVIDER; all return the same validated ExtractionEnvelope. Maintainer's deployment uses Claude (claude-haiku-4-5). |
| Why | Provider choice doesn't change the privacy model (only redacted text is ever sent) but it does change data residency: Anthropic/OpenAI are US; Mistral (La Plateforme) is EU. Self-hosters pick a provider + bring their own key. Set LLM_PROVIDER=mistral for strict EU residency — nothing else in the pipeline changes. |
| Honest note | The default (Claude, US) is disclosed on /datenschutz; moving extraction into the EU is a tracked follow-up. |
| Engineering | Each provider's structured-output mechanism differs: Anthropic output_config/json_schema, OpenAI json_schema strict, Gemini responseSchema (strips additionalProperties + array types), Mistral json_object with the schema embedded in the prompt. Strict mode forbids oneOf, so all five typed sub-payloads ride as nullable siblings under details; extract() collapses the matching branch. (Anthropic caps schemas at 16 nullable params, so only genuinely date-semantic fields stay nullable.) |
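
The provider switch and the "nullable siblings" shape can be sketched as follows. This is an illustration under stated assumptions: `select_provider` and `collapse_details` are hypothetical helpers (not the project's `extract()`), the provider spellings other than `mistral` are guesses, the default mirrors the disclosed Claude default, and the branch names in the usage example are invented.

```python
from typing import Any, Mapping, Optional

# Assumed spellings; only "mistral" appears verbatim in the text above.
SUPPORTED_PROVIDERS = ("anthropic", "mistral", "openai", "gemini")


def select_provider(env: Mapping[str, str]) -> str:
    """Read LLM_PROVIDER from an environment-like mapping and validate it."""
    provider = env.get("LLM_PROVIDER", "anthropic").strip().lower()
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"unsupported LLM_PROVIDER: {provider!r}")
    return provider


def collapse_details(
    content_type: str,
    details: Mapping[str, Optional[dict[str, Any]]],
) -> Optional[dict[str, Any]]:
    """Pick the sibling under `details` that matches the content type.

    Every typed sub-payload is a nullable sibling, so the schema needs no
    oneOf; siblings that do not match are simply ignored.
    """
    return details.get(content_type)
```

For example, `collapse_details("event", {"event": {"starts_on": "2025-06-01"}, "closure": None})` returns the event payload, and a type with no populated sibling yields `None`.
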
| Decision | superadmin / admin / member; no public signup, no self-service join. (Supersedes the brief's original two-role, self-service design.) |
| Why | Cleaner trust model: the operator creates orgs and the first admin; admins manage their own members. Removed an entire class of onboarding-link attacks (the Phase 1 review found editable magic-link intent was a privilege-escalation vector). |
| Cost | The old self-service tables (invites/join_requests/pending_onboarding) were deleted in 0005; delete_user_account (which could orphan an org) was dropped in 0007. |
Middleware + route guards + security-definer RPCs + RLS/column grants. A failure in any single layer doesn't breach the system. Documented as "the security model is the architecture": touching one layer requires calling it out in the PR.
| Decision | REVOKE SELECT on posts from authenticated, re-GRANT only non-PII columns. |
| Why | RLS gates rows, not columns; a member could read PII columns from the base table, bypassing the posts_public view (the Phase 1 critical finding). REVOKE makes admin PII access server-only by construction — even an admin's browser client can't read PII. |
Member opt-in AND admin per-post release, evaluated server-side, delivered only via a short-TTL signed URL. Both flags default to false, so existing rows are safe without a backfill. Per-viewer "see the real photo" was deliberately rejected for reflections (the "multiple children" problem; see docs/COVER_IMAGES_SPEC.md).
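
The two-key check can be sketched as a pure function. This is a minimal illustration, assuming the three inputs are available server-side; `ClearPhotoRequest` and `may_view_clear_photo` are hypothetical names, and signing the short-TTL URL is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClearPhotoRequest:
    member_opted_in: bool = False      # member-level consent, defaults to off
    admin_released_post: bool = False  # per-post release by an admin, defaults to off
    is_reflection: bool = False        # reflection posts never expose the original


def may_view_clear_photo(req: ClearPhotoRequest) -> bool:
    """Both keys must be set, and reflections are blocked regardless."""
    if req.is_reflection:
        return False
    return req.member_opted_in and req.admin_released_post
```

With both defaults false, a request built from a row that predates the feature evaluates to `False` without any data migration.
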
| Decision | A remote-URL native shell (loads the server-rendered app) + native camera (@capacitor/camera) feeding the same redaction pipeline. |
| Why | The server-rendered app keeps its full security model unchanged; the native layer adds only camera + launcher icons + a cloud AAB build. iOS later from the same project (needs a Mac). |
Decorative covers are generated from the redacted extraction: no PII, no people, object-only scenes keyed to content_type rather than the notice's specifics, so a cheerful image can't land next to an illness notice. Same privacy boundary as the text call. Inert until an EU FLUX.1 [schnell] endpoint is configured, and fail-open: a missing cover never fails a post.
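
The inert-until-configured and fail-open behaviour can be sketched like this. It is an illustration only: `attach_cover`, the `cover_url` key, and the `generate` callable are hypothetical, and the real endpoint call is not shown.

```python
from typing import Callable, Optional


def attach_cover(post: dict, generate: Optional[Callable[[str], str]]) -> dict:
    """Return a copy of the post with a cover URL, or None when unavailable.

    `generate` stands in for the image endpoint and receives only the
    content type, never the notice text.
    """
    if generate is None:
        # No endpoint configured: the feature stays inert.
        return {**post, "cover_url": None}
    try:
        url: Optional[str] = generate(post["content_type"])
    except Exception:
        # Fail-open: a cover error must never fail the post itself.
        url = None
    return {**post, "cover_url": url}
```

Passing only `content_type` to the generator is what keeps the cover generic: the scene cannot reflect details of an individual notice.
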
| Area | Status |
|---|---|
| Worker transport | Reachable over VPS HTTP today; TLS front (Caddy/Traefik + worker. subdomain) is a follow-up. |
| DB types | database.types.ts is a hand-authored stub, not generated. |
| CSP | Report-only, not enforcing. |
| Rate limiting | Login/recovery relies on Supabase built-ins; app-level token bucket is a hardening item (QR apply has its own per-code DB limit). |
| npm audit postcss | A transitive advisory inside Next.js's tree; the "fix" downgrades Next to 9.x and is intentionally not applied. |
| LLM residency | Default extraction is US (Claude); EU move is tracked, disclosed on /datenschutz. |
| Covers | Built, dormant. |
Aushang — Privacy-by-construction notice-board digitization · Repository · Built by Eugen Müller