Combined and comprehensive source of record across all field notes, chat exports, project explorations, and sensemaking work. Replaces separate sensemaking-map.md (now deleted). Use synthesis.md for the strategic/mentor-facing layer built on top of this.
-
Ritual + intent definition: Sparkle bureaucracy = bureaucracy that keeps the rituals and structure of official systems but makes them playful, whimsical, or delightful instead of oppressive. Think: forms, stamps, checkpoints, procedures — but redesigned to create joy, theatre, and shared experience. Typical traits: keeps bureaucratic rituals (forms, approvals, stamps); adds aesthetic charm (stickers, colorful documents, funny titles); turns processes into participatory theatre; uses light-hearted rules rather than strict enforcement. The structure mimics government bureaucracy, but the goal is delight and immersion, not control.
-
Organisational futures definition: Sparkle Bureaucracy is a network of people prototyping optimistic organisational futures for the age of AI — reimagining how institutions feel and decide, not only claiming to improve delivery.
-
Brand positioning definition (Ed Saperia): Talent-focused, not delivery-focused. About education, creativity, younger people. An update of "civic tech" but for AI. About using AI to explore organisational/service futures — outputs meant to illustrate the power of technology and what new possibilities it brings, not to be applied directly. Deliberately sidesteps strategic policy, which makes it less threatening in political environments.
-
Vehicle definition (James): A vehicle for carrying a particular approach into places that are otherwise "unsexy," boring, or banal. Not a single product — a frame that lets a lot of different work travel together.
-
Working design principle (Fatima): 80% credible institution, 20% Fatima-coded sparkle bureaucracy. The word "bureaucracy" signals unexpected seriousness (similar to "governance") — without it, the concept might seem too silly.
-
Related terms / adjacent vocabulary: playful bureaucracy; ritualized administration; bureaucratic theatre; administrative cosplay. Appears in art installations, immersive events, civic design experiments, and playful governance prototypes where administrative systems are reimagined as social experiences.
- Control → experience
- Compliance → engagement
- Opacity → legible ritual
Sparkle Bureaucracy works because it combines institutional seriousness with emotional accessibility. "Bureaucracy" signals credibility and system-level ambition. "Sparkle" lowers threat, invites participation, and makes experimentation socially acceptable in environments that are typically defensive. This enables practical exploration of AI-era service and organisational futures without triggering immediate policy conflict.
Trust in institutions is low. A common response is to demand more transparency, but transparency alone can intensify distrust by making institutional shortcomings more visible without changing lived experience. Sparkle Bureaucracy proposes a re-legitimisation pathway: redesign procedural encounters so they are legible, humane, participatory, and culturally resonant. The goal is not to trivialise administration, but to repurpose administrative ritual from fear/friction toward trust/connection.
- Talent-focused, not delivery-focused. This is about education, creativity, younger people — not about "effective, trustworthy, efficient, cheap."
- Brings together AI and the operational side of policy, like an update of civic tech for AI. Outputs are not meant to be applied directly, but to illustrate the new possibilities the technology opens up. Deliberately sidesteps strategic policy.
- The word "bureaucracy" is unusual — unexpected seriousness, similar to "governance." Gives the brand weight without being corporate.
- The brand makes it unthreatening — suggests cool ideas, fun people, a good time. Especially important in bureaucratic environments where people's spirits are usually crushed.
- Feminist + "for good" + AI angles make it easy to support and sponsor. Feels fresh.
- Hinges on Fatima being an amazing ambassador. Asks that are easy to say yes to: from the tech side this is sponsorship, from the civic side this is visibility. Good photos and cute branding help. Background is strong enough to stay credible.
- Build a portfolio of projects that encapsulate the brand. Can take time — don't need them at launch. Shouldn't all be things Fatima made — it's also a way to make alliances with people doing cool stuff.
- A way to promote concrete values — like how vTaiwan is associated with pol.is. Movement + associated tool.
- Important new projects don't exist yet; they'll be AI-native. When tech is made with AI it's much cheaper, so making it "sparkly" is cheap — might as well have things be fun and beautiful as well as functional. Custom Matrix channels as an example: tech is so cheap a chat channel can be a whole custom project.
- The evaluation keynote thing can be Sparkle Bureaucracy branded.
- People are hungry for optimistic futures.
- People want to understand what post-AI jobs look like.
- One very powerful outcome: helping make post-AI team compositions palatable to the public sector. This is a big part of what the service design movement did.
- Calendar strategy: put other people's events in the Sparkle Bureaucracy calendar without needing permission — it's just highlighting things you find interesting. Another way of thinking about the project: institutionalising your personal interests to give them more weight.
- Code review agents feel important — "feels of the same universe as Sparkle Bureaucracy." (Theme: AI-native governance tools.)
- Speculative future formats: Sparkle Bureaucracy Camp; Sparkle Bureaucracy Awards.
- "Something sparkle bureaucracy themed is good enough." Low-bar asks are powerful.
- "Sparkle Bureaucracy as a brand alone is really good — for some sort of AI/civic tech programme/project/articulation."
- "He's advocating for a vehicle called sparkle bureaucracy that is trying to carry forward a particular approach into places where it is otherwise 'boring,' or banal, to engage with."
- Faculty — public-sector AI consultancy. Explicitly suggested by Ed.
- TPXimpact — digital/public impact org. Explicitly suggested by Ed.
- Creative Bureaucracy Festival — creativebureaucracy.org — anchor event in the ecosystem.
- Studio Sanshin — studio-sanshin.com — stylistically and thematically aligned collaborator.
- Martin Dittus — martindittus.info — "got it immediately"; early network signal.
- OneTeamGov — oneteamgov.uk — "radical reform through practical action" — adjacent positioning.
- Google.org AI Government Innovation — link — funder-facing alignment.
- MHCLG Local AI / digital public services — MHCLG blog — policy-side ecosystem signal.
- UKAuthority digital public services event — link — visibility/speaking opportunity.
- Re-legitimisation discussion (James Plunkett / Kinship Works) — LinkedIn post — theoretical / social backing for the trust-rebuilding angle.
- vTaiwan / pol.is — precedent/analogy for movement + associated tool.
- Portfolio approach: many small prototypes, not one grand solution. Not all things Fatima made — also a way to form alliances.
- "Easy yes" asks: visibility, co-hosting, lightweight sponsorship, demo participation.
- Convening as strategy: calendar, events, shared practice community.
- Network Development module: build a visible network around shared practice; use anchors and adjacencies; institutionalise personal interests into a recognisable platform with collective gravity.
- Knowledge Production module: convert prototypes into reusable knowledge assets (case notes, design patterns, failure notes); produce evidence around trust, service experience, and organisational learning; publish short, high-frequency outputs.
- Sparkle Bureaucracy is a network of people prototyping optimistic organisational futures for the age of AI.
- Sparkle Bureaucracy is a network reimagining public-sector organisational futures for the AI age.
- Sparkle Bureaucracy prototypes optimistic public-sector futures for the AI age, focused on institutional re-legitimisation.
- URL: https://sparkle-border-authority.vercel.app/
- Context: Built for a live event — "Ration Club Border Control" — where guests went through a full "border crossing" journey before entering a party.
- Stack: React + TypeScript + Vite, polished terminal-style interface.
- Core mechanic: 4-character guest code tied to a registry. Users move through identity confirmation → purpose/declaration forms → screening → decisioning → visa printing → checkpoint handoff → arrival tracking.
- Behaves like a miniature digital border operating system.
- Combines structured intake forms, rules-based validation, a decision engine, and route-level state passing to automate case handling.
- Applicants can be approved/rejected, assigned visa classes and privileges, and issued printable A6 "visa stickers."
- Special pathways: manual visa creation, visitor signup, checkpoint assistance — edge cases handled without breaking flow.
- Self-service processing, secondary screening logic, and real-time operational dashboards all in a single kiosk experience.
- Tracks runtime metrics: entries, approvals, rejections, document distribution.
- Persists state in browser storage for continuity across sessions.
- Admin tools for staff interventions: overrides, reprints.
- Administrative cosplay: Preserves the skeleton of a border regime (identity checks, declarations, secondary screening, approvals, printed permits, checkpoint validation) while replacing fear and friction with spectacle.
- Formal sequencing + playful wrapping: Official-feeling language wrapped in celestial visuals, playful copy, and celebratory interaction design — compliance feels like participation in a story world.
- Mundane mechanics as ritual: Entering a 4-character code, selecting purpose-of-visit options, answering screening prompts, receiving status decisions, printing an A6 visa — all procedurally familiar. But rituals signal shared theatre rather than punishment.
- Documents and dashboards as state apparatus: Visa classes, privileges, checkpoint routes, live stats, admin overrides mimic institutional systems but are tuned for delight, pacing, and immersion.
- Language layer: "Sparkle compliance," "diplomatic glitter," "excellent vibes" — turns serious border-tech patterns into social performance.
- Rejection flows: Even rejection and assistance flows feel narrative, not punitive — preserving emotional arc while maintaining structure.
- Not just "more digital" border control — "more experiential": border patrol as programmable UX, real-time analytics, narrative world-building.
- Procedure becomes art. Paperwork becomes props. Administration becomes collective experience.
- Social choreography: formal process repurposed from control into connection — same rituals, different intent.
- Re-legitimisation angle: an early prototype for how public-facing institutions might regain legitimacy by redesigning the felt experience of procedure. Transparency alone doesn't rebuild trust; this explores a path of making institutions more participatory, humane, and culturally intelligible through designed ritual.
- Design grammar (structured process, legible state transitions, shared narrative, assistive overrides) is portable to civic service prototyping — local government AI pilots, digital public service design, creative bureaucracy forums.
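The "legible state transitions" in the Sparkle Border Authority flow can be sketched as a small state machine. This is a minimal illustration in Python, not the event app's code (the notes describe a React + TypeScript build). The names `Stage`, `Case`, `open_case`, and `advance` are hypothetical, and the rule that rejected guests skip visa printing and go to the checkpoint for assistance is an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Set


class Stage(Enum):
    IDENTITY = "identity confirmation"
    DECLARATION = "purpose/declaration forms"
    SCREENING = "screening"
    DECISION = "decisioning"
    VISA_PRINT = "visa printing"
    CHECKPOINT = "checkpoint handoff"
    ARRIVAL = "arrival tracking"


FLOW = list(Stage)  # Enum members iterate in definition order


@dataclass
class Case:
    guest_code: str
    stage: Stage = Stage.IDENTITY
    approved: Optional[bool] = None
    history: List[Stage] = field(default_factory=list)


def open_case(guest_code: str, registry: Set[str]) -> Case:
    """Start a case only for a 4-character code that exists in the registry."""
    code = guest_code.strip().upper()
    if len(code) != 4 or code not in registry:
        raise ValueError(f"unknown guest code: {guest_code!r}")
    return Case(guest_code=code)


def advance(case: Case, approved: Optional[bool] = None) -> Case:
    """Move a case one stage forward; a decision must be supplied at DECISION."""
    if case.stage is Stage.ARRIVAL:
        raise ValueError("case is already complete")
    if case.stage is Stage.DECISION and approved is None:
        raise ValueError("a decision is required at this stage")
    case.history.append(case.stage)
    if case.stage is Stage.DECISION:
        case.approved = approved
        # Assumption: rejected guests skip visa printing and go straight
        # to the checkpoint for assistance.
        case.stage = Stage.VISA_PRINT if approved else Stage.CHECKPOINT
    else:
        case.stage = FLOW[FLOW.index(case.stage) + 1]
    return case


case = open_case("ab12", registry={"AB12"})
for _ in range(3):
    advance(case)
advance(case, approved=True)
print(case.stage.value)  # visa printing
```

Keeping every transition in one function makes the assistive overrides described above easy to reason about: an admin override would be one more explicit, logged transition rather than a hidden side path.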
- Event: Political Technology Awards Showcase, Cohort 25/26 — evening event with food, presentations, live Q&A.
- Brief: Evaluate 321 political technology projects and select a winner, with the entire evaluation process made open and public. Published algorithms, public GitHub repo with inspectable code, pull requests with rationale for every change.
- Core thesis of the process: Rankings are political. Scoring is political. By making the process transparent and iterative, the trade-offs inherent in any evaluation framework become discussable.
Every evaluation system built came back to three components:
- Data — what do we actually know about these projects, and how do we assess and verify it? Started with only URLs and scraped content; first realisation was: "the data we have is not enough."
- Values — political technology evaluation is inherently political because different values lead to different conclusions. Articulating values, writing them down, agreeing on a set — hard work.
- Facilitation — how do we actually apply those values to the data and make decisions about scoring? The operational bridge between the two.
- V1: Random scoring.
- V2: Exclusion keyword bonus — keyword-based approaches with some human-based scoring.
- V3: Keyword clusters — structured keyword groupings.
- V4: AI governance body bonus.
- V1–V4 realisation: A URL or scraped page content doesn't tell you enough about a project — no impact, context, or comparative signal. First shift: build dossiers.
- Dossier buildout: Civic tech field guide taxonomy collected for every project. Databases: OpenAlex for academic citations, ProPublica for nonprofit financials. Verification pass. Still found issues — overlapping data, unrelated citations — but significantly more data.
- V5–V7: Articulating values. Each committee member took their own lens and turned it into a structured taxonomy for scoring. Gamithra built the ITN/A framework and a multi-jury AI system.
- V8–V10: Decision systems. Competing designs: heterogeneous criteria aggregation (everyone brings own value system, aggregate) vs shared criteria (one set, multiple judges apply it). These produce very different outcomes — one more consistent, one more pluralistic.
- V10 leap: What if the judges are the system? Introduction of synthetic users / agents. Jamie laid groundwork on synthetic user theory — what they are, how they can be useful, how you measure whether they work.
- V11: Aggregation pass — pairwise comparison, averaging — to see which projects performed consistently across all value systems.
- V14–V15: Social choice deliberation round (see Project Mirror below).
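The V11 aggregation pass names two techniques, averaging and pairwise comparison. The toy sketch below shows how they can disagree on the same scores, which is one reason the choice of aggregation is itself a value decision. The judge and project names and all scores are invented for illustration and are not taken from the awards data; `mean_scores` and `pairwise_wins` are hypothetical helper names.

```python
from itertools import combinations
from typing import Dict

Scores = Dict[str, Dict[str, float]]  # judge -> project -> score


def mean_scores(scores: Scores) -> Dict[str, float]:
    """Average each project's score across all judges."""
    projects = sorted(next(iter(scores.values())))
    return {p: sum(s[p] for s in scores.values()) / len(scores) for p in projects}


def pairwise_wins(scores: Scores) -> Dict[str, int]:
    """Count, for each project, how many head-to-head majority contests it wins."""
    projects = sorted(next(iter(scores.values())))
    wins = {p: 0 for p in projects}
    for a, b in combinations(projects, 2):
        prefer_a = sum(1 for s in scores.values() if s[a] > s[b])
        prefer_b = sum(1 for s in scores.values() if s[b] > s[a])
        if prefer_a > prefer_b:
            wins[a] += 1
        elif prefer_b > prefer_a:
            wins[b] += 1
    return wins


toy = {
    "judge_1": {"P": 9, "Q": 6, "R": 2},
    "judge_2": {"P": 5, "Q": 6, "R": 7},
    "judge_3": {"P": 5, "Q": 6, "R": 1},
}
means = mean_scores(toy)
wins = pairwise_wins(toy)
print(max(means, key=means.get))  # P
print(max(wins, key=wins.get))    # Q
```

In this toy data one enthusiastic judge lifts P to the best average, while Q wins every head-to-head contest. Neither answer is wrong; they encode different ideas of what "performed consistently" means.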
- Built alongside committee process.
- Audience at the event submits their own criteria via QR code / form.
- Agent generates a ranking of shortlisted projects based on submitted values.
- Turns evaluation into a live participatory civic experience.
- Form closed during break; results processed before announcement.
What it is: An evaluator-estimator workflow. Estimates how individual cohort members might evaluate the 321 projects, based on each person's public record and bio. Research prototype — does not claim to reconstruct people's actual beliefs or simulate cognition. Takes public records and turns them into a ranking and scoring system.
Core research question: Can AI infer a usable evaluative constitution for a person from their public record of work? And if so, which project rankings stay stable when the decision procedure varies — across inference, scoring, aggregation, and deliberation? What does this reveal about AI systems as political and evaluative tools?
Why this was built: In an ideal world, all 18 cohort members would have been involved. Sitting with all of them, getting their values, ranking projects together. A system like that doesn't scale. So: build an agent for each of them.
Pipeline stages:
- Swarm of ten agents: research people's public profiles, verify them, collect evidence, build a constitution for each cohort agent.
- Constitution: not just scoring criteria but value modifiers — things that boost or reduce a score based on inferred preferences.
- Boosts: community ownership or governance; under-resourced civic contexts; inclusive developer communities.
- Reductions: VC funding (inferred preference against); popularity discount (counterbalancing LLM familiarity bias toward well-known projects that score higher because the model "knows" them better).
- Result: 18 different agents, each with a different constitution, each producing a different scored ranking, all with rationale for every score.
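A constitution with value modifiers can be pictured as a list of tagged score adjustments, each carrying its own rationale. The sketch below is an assumption about shape only: the class names, tag strings, delta sizes, and the 0 to 10 scale are all hypothetical, and only the kinds of boosts and reductions come from the notes above.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Modifier:
    tag: str      # project attribute the modifier reacts to
    delta: float  # positive = boost, negative = reduction
    reason: str   # rationale recorded alongside the score


@dataclass(frozen=True)
class Constitution:
    agent: str
    modifiers: Tuple[Modifier, ...]


def score(base: float, project_tags: Set[str],
          constitution: Constitution) -> Tuple[float, List[str]]:
    """Apply every matching modifier to a base score and keep the rationale."""
    total = base
    rationale = [f"base score {base:.1f}"]
    for m in constitution.modifiers:
        if m.tag in project_tags:
            total += m.delta
            rationale.append(f"{m.delta:+.1f} {m.reason}")
    # Clamp to a 0-10 scale (the scale itself is an assumption).
    return max(0.0, min(10.0, total)), rationale


example = Constitution(agent="example-agent", modifiers=(
    Modifier("community_owned", +1.0, "community ownership or governance"),
    Modifier("vc_funded", -1.5, "inferred preference against VC funding"),
    Modifier("well_known", -0.5, "popularity discount for LLM familiarity bias"),
))
final, why = score(6.0, {"community_owned", "well_known"}, example)
print(final)  # 6.5
for line in why:
    print(line)
```

Because each adjustment is stored with its reason, the same structure that computes the score also produces the per-score rationale, which keeps the normative assumptions auditable.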
Social choice / grand jury round (V15):
- 18 agents with constitutions — what if they sat together as a grand jury and had a discussion?
- Applied social choice theory: 16 synthetic agents ranked 38 candidates under Borda count.
- Borda count chosen because it is the most standard social choice rule for ranked preferences and theoretically the most susceptible to manipulation: if the honest result is stable under Borda with strategic reasoning, it's likely stable under anything.
- Each agent knows its honest standings and can reason strategically. What would the committee choose?
- Winner: Liquid Feedback — unchanged from honest voting. Every agent independently concluded the 18-point lead was too large to overcome. 13 arguments in deliberation. No revisions. No strategic manipulation that held.
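For reference, the Borda rule itself is short: with n candidates, each ballot gives n-1 points to its first choice, n-2 to its second, and so on down to 0. The sketch below uses three invented ballots over placeholder candidates A, B, and C, not the real 38-candidate field, and shows a single strategic ballot failing to change the winner.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def borda(ballots: List[List[str]]) -> List[Tuple[str, int]]:
    """Score full rankings (best first) under Borda count, highest total first."""
    totals: Dict[str, int] = defaultdict(int)
    for ballot in ballots:
        n = len(ballot)
        for position, candidate in enumerate(ballot):
            totals[candidate] += n - 1 - position
    # Sort by points descending, then by name so ties are deterministic.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))


def lead(result: List[Tuple[str, int]]) -> int:
    """Point margin between first and second place."""
    return result[0][1] - result[1][1]


honest = [["A", "B", "C"], ["A", "C", "B"], ["B", "A", "C"]]
# The third voter buries A at the bottom to help B (a strategic ballot).
strategic = honest[:2] + [["B", "C", "A"]]

print(borda(honest))     # [('A', 5), ('B', 3), ('C', 1)]
print(borda(strategic))  # [('A', 4), ('B', 3), ('C', 2)]
print(lead(borda(honest)), lead(borda(strategic)))  # 2 1
```

The strategic ballot narrows A's margin but does not overturn it. The deliberation round reached the same kind of conclusion: the lead was too large for any agent's manipulation to overcome.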
Closing reflection (verbatim): "This is a picture of my screen at home, running all 18 agents at the same time — all of them ranking, scoring, running through the research. And I remember feeling so completely delighted, because all 18 of my cohort were with me on one screen, working on this project together. It's a little bittersweet, because these are synthetic representations. But I think where we are is the cusp of something. If I was able to design a system where synthetic versions of the cohort could work with me — then maybe the next step is not just synthetic versions of us, but us managing those versions. Technology that keeps us in the judgment loop and lets us work together better."
What it shows: With good data, clearly articulated values, and a system for applying them, you can score 321 political technology projects. But each of those pieces introduces trade-offs — around data quality, value definition, and evaluation design. AI can help articulate values, apply them at scale, and explore how different systems produce different outcomes. But data quality, value clarity, and system design are all political decisions, made all the way through.
Portfolio logic: Project Mirror + Sparkle Border Authority together show two layers:
- Service encounter layer: redesign how institutional procedure feels in lived interaction (Sparkle Border Authority).
- Decision constitution layer: redesign how institutional judgment is articulated and applied (Project Mirror).
- Spark-test topic matrix — What: scored exploration list across AI safety/governance/public-sector topics. Does: supports scope selection without premature commitment. Who: researcher, mentors.
- System-boundary diagnostic prompts — What: prompts on model vs system behavior and attack surfaces (where does "model behavior" end and "system behavior" begin? what components introduce new attack or inference surfaces?). Does: identifies blind spots in current eval assumptions. Who: evaluators, system designers.
- Evaluation blind-spot instrumentation checklist — What: prompts asking what must be logged to notice failures that standard evals miss (which failures are invisible to standard model evals? what would you need to log or instrument to even notice them?). Does: shifts explainability work toward operational telemetry. Who: engineers, auditors.
- RAG/agent/tool failure framing — What: question of how composite systems fail in surveillance-like ways; system behaviors → real harms (identity inference, hallucination chains, autonomy drift). Does: moves analysis from model outputs to pipelines and interactions. Who: AI researchers, builders.
- Control illusion diagnostic — What: explicit question about what developers think they control vs what actually controls outputs in practice. Does: exposes gap between intended and actual system behavior. Who: builders, evaluators.
- Model eval/red-team exploration track — What: safety/eval references around dangerous behavior tests, identity inference, misuse evals, benchmark building, surveillance algorithms. Does: anchors abstract safety concerns in concrete methods. Who: evaluator community.
- Agentic workflow risk lens — What: focus on orchestration complexity, tool-use surface area, extension collapse risk. Does: surfaces emergent behavior in tool-using systems; most hand-rolled extension systems collapse under complexity. Who: developers, system designers.
- AI-coded software vulnerability watchlist — What: notes on hidden vulnerabilities in AI-generated code; DeepSeek-R1 finding that politically sensitive prompts increase likelihood of severe security vulnerabilities by up to 50%. Does: extends governance concerns into software assurance. Who: engineers, security practitioners.
- Cline bench initiative — What: benchmark/eval initiative for agentic systems. Does: anchors evaluation practice for coding agents specifically. Who: engineering community.
- Roles-over-committees governance model — What: named duties (website maintainer, ration club manager) replacing diffuse committees. Does: makes accountability explicit; committees don't do responsibility well. Who: cohort members, organizers.
- Temporary-authority governance experiments — What: bounded rule-play (e.g., temporary dictator + signed charter thought experiment). Does: stress-tests whether legitimacy can be produced through explicit rules. Key friction: "Even if you were dictator and there was a signed charter, why would we follow the rules?" — usually said in good faith but reveals the politics of convincing people that a thing they opted into has value. Who: participants in governance module.
- Governance module legitimacy field note — What: reflection on the fundamental tension in the governance module — feeling "stuck" when experiments hit the question of whether legitimacy can be designed or only negotiated politically. Does: surfaces the gap between "opting in" and genuine buy-in. Who: researcher, cohort.
- Permission-slip artifact — What: explicit permission object (Permission Slip project on Civic Tech Guide) in a nominally rule-light system. Does: provides legitimacy scaffolding for action where informal norms are insufficient. Who: participants needing mandate.
- Permission-in-rule-light-systems field note — What: extended reflection on needing permission in a peer-governed setting despite understanding the design intent. Key insight: "I don't experience this as a lack of confidence" — it's about the structure of the system, not personal failure. Does: exposes invisible norms and unequal capacity to act; "no one needs permission" coexists with some actors being unable to act without explicit authorization. Who: researcher, peers.
- Shadow governance articulation — What: naming informal labor (coordination, lobbying, housekeeping, legitimacy brokering) in groups that reject formal authority. Does: makes hidden governance visible as governance; "governance doesn't just disappear, it gets distributed to those who hold the most labor." Personal observation: occupying a role that doesn't exist — articulating stakes and tensions, lobbying faculty, doing interpretation work. Who: informal coordinators, faculty, cohort.
- Spokesperson bridge protocol — What: intermediary role between "two islands" (subgroups). Does: bidirectional information filtering; preserves trust while translating priorities across groups. Who: spokesperson, subgroup participants.
- Three-person team power design critique — What: challenge to team compositions that omit user research/design (e.g., 2 devs + PM). Does: traces power transfer — decisions about UX, accessibility, and user experience don't disappear when roles are absent; they get absorbed by whoever is closest to implementation. Who: delivery teams, leaders.
- Anarchism and governance reading — What: readings from Ed on anarchy (Anarchic Agreements; Anarchism and the Politics of Technology). Does: theoretical grounding for rule-light governance exploration. Who: researcher.
- Mastodon Trust & Safety role — What: field note on Mastodon hiring a Community Director for Trust & Safety. Does: signals institutional interest in formalising community governance roles. Who: researcher, civic tech community.
- Surveillance-pricing research thread — What: sustained investigation into individualized dynamic pricing harms, framed as AI governance problem. Does: frames pricing opacity as a legibility/fairness issue. Key finding: two people can look at the same product at the same moment and see completely different prices — not because of sales, stock levels, or errors, but because one is shopping on a Mac, living in a particular zip code, or standing in a store's parking lot. Who: consumers, regulators, platforms.
- Algorithmic-pricing watchdog concept — What: system that monitors online retailers, scrapes prices, and detects unusual patterns suggesting personalized or dynamic pricing. Does: generates evidence for enforcement or journalism. Who: watchdogs, researchers.
- Dynamic-pricing persona auditor — What: tool that simulates multiple user personas (different locations, devices, histories) to see whether prices are being adjusted unfairly. Does: experimentally tests discriminatory price adjustment. Who: auditors, policy advocates.
- "Stalker pricing" exposure tool — What: artifact tracking how personal data may influence the price shown (device type, location, browsing history, past purchases, smart home data). Does: converts hidden inference into user-visible signal. Who: users, journalists, regulators.
- Price-history audit stack — What: use of CamelCamelCamel/Keepa (used in academic studies to audit Amazon's dynamic price changes) as longitudinal evidence substrate. Does: supports empirical claims about market behavior over time. Who: analysts, media, advocacy groups.
- Disclosure-law interface translation — What: New York Algorithmic Pricing Disclosure Act (N.Y. Gen. Bus. Law § 349-a, effective November 2025) as an implementation question — companies using consumer-specific data to set prices must clearly inform consumers when prices have been determined by an algorithm. Does: links legal obligation to interface-level notice patterns. Who: firms, compliance teams, regulators.
- NYT dynamic pricing video field note — What: engagement with NYT Opinion piece "Goodbye, Price Tags. Hello, Dynamic Pricing." — discovered via TikTok; led to rabbit hole on surveillance pricing. Does: personalized narrative entry point to a technical governance problem. Who: researcher, general audience.
- Civic Tech Guide algo transparency track — What: homework from Matt Stempeck — algo transparency flagged as "actively evolving area given difficulty of explainability"; audits flagged as important. Does: grounds personal interest in a recognized civic tech category. Who: researcher, mentor.
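The comparison step of the persona auditor idea can be sketched without any scraping: given the price each simulated persona was shown for the same product at about the same time, flag personas whose price sits far from the median. The function name, persona labels, prices, and the 2% tolerance are all hypothetical, and a real audit would also need repeated sampling to separate personalization from ordinary price churn.

```python
from statistics import median
from typing import Dict


def flag_price_spread(observations: Dict[str, float],
                      tolerance: float = 0.02) -> Dict[str, float]:
    """Return personas whose price deviates from the median by more than
    `tolerance`, mapped to their relative deviation.

    `observations` maps a persona label to the (positive) price it was shown.
    """
    mid = median(observations.values())
    return {
        persona: round((price - mid) / mid, 4)
        for persona, price in observations.items()
        if abs(price - mid) / mid > tolerance
    }


shown = {"mac_nyc": 112.0, "windows_ohio": 100.0, "android_texas": 100.0}
print(flag_price_spread(shown))  # {'mac_nyc': 0.12}
```

Using the median instead of the mean keeps one inflated price from shifting the baseline it is being compared against.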
- 321-project open evaluation system — What: public, iterative ranking process with inspectable algorithms and rationale for every change (awards.newspeak.house / 2025.newspeak.house/awards / GitHub repo). Does: makes evaluation itself politically visible and contestable. Who: awards committee, public.
- Versioned scoring pipeline (V1–V15) — What: full evolutionary record of ranking heuristics from random scoring to social-choice deliberation. Does: reveals methodological drift, learning, and how "good evaluation" is itself a moving target. Who: evaluators, observers.
- Project dossier enrichment pipeline — What: taxonomy (civic tech field guide) + OpenAlex (academic citations) + ProPublica (nonprofit financials) + verification pass. Does: upgrades thin URL/scraped input into structured evidence base with cross-referenced data. Who: evaluation team.
- Civic tech field guide taxonomy — What: structured category system from Matt Stempeck's civic tech guide (civictech.guide/categories). Does: provides shared vocabulary for field classification. Who: evaluators, researchers.
- Values-data-facilitation triad — What: explicit architecture treating evaluation design as having three distinct components (data quality/verification, values articulation, facilitation/application method). Does: operationalizes political values as system components; prevents collapsing "good data + AI" into a fake neutral answer. Who: committee, audience.
- Heterogeneous vs shared-criteria decision models — What: competing ways to govern scoring: everyone brings their own value system and the results are aggregated, vs one shared set of criteria applied by multiple judges. Does: surfaces the consistency/pluralism trade-off; one more consistent, one more pluralistic. Who: juries, analysts.
- ITN/A framework (Gamithra) — What: evaluation framework built by committee member Gamithra reflecting her values, used to systematize scoring and build a multi-jury AI system. Does: an example of values-as-structured-taxonomy. Who: evaluation team.
- Project Mirror evaluator-estimator workflow — What: infer synthetic evaluator constitutions from public records; build agents per person; run scoring and aggregation. Does: tests evaluative stability across inference/scoring/aggregation/deliberation stages. Who: cohort, synthetic agents.
- Constitution modifier layer — What: score boosts/penalties encoded per agent (community ownership, VC reduction, under-resourced civic context, inclusive developer communities, popularity discount). Does: encodes normative assumptions directly into evaluation mechanics in an auditable form. Who: system designers, auditors.
- Popularity discount mechanism — What: specific correction for LLM familiarity bias — well-known projects exist heavily in LLM training data and might score higher simply because the model "knows" them better. Does: counterbalances representation skew in model priors; improves integrity of results. Who: evaluator-engineers.
- Aggregation and social-choice experiments — What: pairwise comparison, averaging, Borda count deliberation with strategic reasoning available to each agent. Does: tests winner stability and surfaces whether results hold under pressure. Winner (Liquid Feedback) was stable — 18-point lead, 13 deliberation arguments, no revision. Who: synthetic jury, analysts.
- People's Choice criteria pipeline — What: audience submits values via QR form; agent generates a ranking of shortlisted projects based on those values; result announced live. Does: turns evaluation into participatory experience; makes values pluralism visible in real time. Who: event participants, organizers.
- One-shot UI vibe-engineering experience — What: used a prompt to generate a full ranking UI for the awards in one shot — site was fully functional, cool, but overwhelming with no clear user journey. Does: raises question of what it means to "want to build" something vs having it one-shotted; first time feeling simultaneously impressed and overwhelmed by agentic results. Who: researcher.
- Synthetic agents for team tasks — What: manager reaction to Fatima describing synthetic agents — "please build some for the team." Does: signals practical demand for this pattern beyond fellowship context. Who: engineering teams.
- Sparkle Border Authority ritual system — What: border-style kiosk flow (4-char code entry, identity confirmation, purpose/declaration forms, screening, decision, visa print, checkpoint handoff, arrival tracking). Does: operationalizes bureaucracy as live interactive event infrastructure at party scale. Who: guests, checkpoint staff, admins.
- Printed A6 visa artifact system — What: printable A6 "visa stickers" with visa class and privilege semantics. Does: materializes procedural state in physical collectible form; makes "approved" feel real. Who: participants, checkpoint operators.
- Visa classes and privilege system — What: categorization of entrants into classes with associated privileges. Does: reproduces bureaucratic classification while making it theatrical and collectible. Who: guests, admins.
- Admin override and reprint pathways — What: exception-handling tools inside the ritual flow (manual visa creation, visitor signup, checkpoint assistance). Does: preserves operational continuity under edge cases without breaking the experience. Who: operators/admins.
- Live immigration stats dashboard — What: runtime counters — entries, approvals, rejections, document distribution. Does: supports staff coordination while staging a legible "state apparatus" for observers. Who: staff, participants.
- Playful bureaucratic language layer — What: tonal overlay atop formal procedures — "sparkle compliance," "diplomatic glitter," "excellent vibes." Does: shifts affect from intimidation/authority to participation and delight; makes compliance feel like joining a story world. Who: all participants.
- Secondary screening logic — What: rules-based pathway for edge cases requiring additional review. Does: preserves procedural realism while keeping the experience moving. Who: checkpoint staff.
- Browser storage state persistence — What: state persisted in browser storage for continuity. Does: makes the system operationally robust for a multi-hour live event. Who: operators.
- Matrix-to-blog ingestion pipeline concept — What: app that takes newsletters and emails sent to a designated address and pushes them to a Matrix channel, which then publishes to a blog. Does: turns private inputs into a publishable field-note stream; creates repeatable research publishing infrastructure. Who: researcher, bot, public readers.
- Public field-notes channel contract — What: explicit norm that anything posted in the field-notes channel may become public. Does: aligns discussion behavior with publication intent; makes the channel a live research artifact. Who: researcher, channel participants.
- Small, high-density event strategy — What: from Matt Stempeck mentoring — host your own small events; keep description narrow and niche to attract the right crowd; a tight group of six deeply interested people beats a larger superficial panel. Does: supports targeted network formation and research depth. Who: researcher, invited practitioners.
- Hallway-track encounter model — What: mention of Dan from CFA as a model for organic depth encounters. Does: names a specific mode of valuable connection (not panels, not formal talks). Who: researcher, network.
- Calendar as network curation — What: compiling others' relevant events under shared Sparkle Bureaucracy frame; no permission needed. Does: creates lightweight convening infrastructure; institutionalises personal interests to give them more weight. Who: researcher, wider network.
- Portfolio vehicle framing — What: umbrella for multiple prototypes including collaborations with others. Does: accumulates legitimacy and recognizability without committing to one canonical project; alliances with people doing cool stuff. Who: researcher, allies, sponsors.
- Brain trust / feedback session format — What: loosely structured early-stage feedback sessions where people show up to help shape something. Does: generates depth and investment from participants. Who: researcher, collaborators.
- Pipeline / ritual flow pattern — repeated staged flows (intake → process → decision → artifact → handoff) across border simulation, evaluation systems, and governance experiments. The structure is the same; intent differs.
- Legibility artifact pattern — logs, dashboards, labels, constitutions, disclosures, and printed permits are used as both operational surfaces and social coordination devices. Making the process visible is not just transparency — it is the product.
- Evaluation pluralism pattern — same dataset repeatedly re-evaluated via different criteria, aggregation rules, and value sets. Outputs treated as contingent, not absolute. The variation is the information.
- Governance-as-interface pattern — policy and governance questions repeatedly instantiated as product mechanics: forms, checkpoints, role permissions, dashboard visibility, disclosure text. Rules become buttons.
- Role concretization pattern — movement from abstract group identity toward named roles and duties to resolve accountability ambiguity. Committees diffuse responsibility; roles concentrate it usefully.
- Shadow-labor recognition pattern — invisible maintenance, translation, and lobbying work repeatedly identified as structural, not incidental. Informal governance doesn't disappear when formal governance is absent — it redistributes.
- Instrumentation-first safety pattern — repeated concern that benchmark/model evals under-detect system-level failures unless end-to-end traces are captured. Governance of AI systems requires operational observability, not only capability benchmarks.
- Participatory evaluation pattern — audiences and stakeholders invited to submit criteria and inspect methods, not only consume results. Evaluation legitimacy comes from participation, not only accuracy.
- Affective reframing pattern — serious systems wrapped in playful/theatrical presentation to reduce defensiveness while preserving procedural logic. "Sparkle" is not decoration; it changes what people are able to engage with.
- Publishing-as-method pattern — field notes, public repos, transparent scripts, and open algorithms are used as active research infrastructure, not post-hoc documentation. The process is the publication.
- Constitutionalized evaluation pattern — decision systems become inspectable constitutions (criteria + modifiers + aggregation rules), making "how judgment is made" a first-class design object rather than a black box.
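A constitution in the sense of the last pattern (criteria + modifiers + aggregation rules) can be sketched as plain data plus one scoring method. This is a minimal illustration, not the Project Mirror implementation: the class name `Constitution`, the criterion names, the weights, and the `well_known_discount` modifier are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Constitution:
    """An inspectable evaluator: weighted criteria, then modifiers applied in order."""
    name: str
    weights: Dict[str, float]  # criterion name -> weight
    modifiers: List[Callable[[float, dict], float]] = field(default_factory=list)

    def score(self, criterion_scores: Dict[str, float], project: dict) -> float:
        # Aggregation rule (assumed): weighted mean of per-criterion scores.
        total_weight = sum(self.weights.values())
        base = sum(w * criterion_scores.get(c, 0.0)
                   for c, w in self.weights.items()) / total_weight
        # Modifiers adjust the aggregated score using project metadata.
        for modify in self.modifiers:
            base = modify(base, project)
        return base


def well_known_discount(score: float, project: dict) -> float:
    """Hypothetical modifier: take 10% off projects flagged as already well known."""
    return score * 0.9 if project.get("well_known") else score


# Hypothetical constitution and project, for illustration only.
civic = Constitution("civic-impact", {"impact": 2.0, "openness": 1.0},
                     [well_known_discount])
result = civic.score({"impact": 0.8, "openness": 0.5},
                     {"name": "Example project", "well_known": True})
```

Because the weights and modifiers are ordinary fields, the whole judgment rule can be printed, diffed, or published alongside the ranking it produced.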
-
Visibility vs opacity
- Side A: explicit logs, disclosures, constitutions, public algorithms, inspectable ranking code.
- Side B: hidden inference systems, opaque price personalization, black-box evaluator behavior, LLM training data dominance.
- Appears in: surveillance-pricing notes, Project Mirror method transparency, disclosure-law translation, V1–V15 evolution.
-
Playfulness vs enforcement
- Side A: participatory ritual, aesthetic delight, low-threat framing, "excellent vibes."
- Side B: gating, rejection paths, checkpoint authority, exclusion capability, visa classes.
- Appears in: Sparkle Border Authority flow, governance module rule experiments, permission-slip dynamics.
-
Ritual richness vs operational efficiency
- Side A: process steps create legibility, meaning, and shared experience.
- Side B: heavy artifacts, logs, and interfaces can create friction, burden, and brittleness. Examples: the assessment log dragging down UX; a massive file loaded on page load causing crashes.
- Appears in: assessment/logging concerns, multi-step evaluation pipelines, border kiosk operational seams.
-
Individual role clarity vs collective process legitimacy
- Side A: named roles improve responsibility and execution; make accountability explicit.
- Side B: committees/collectives distribute voice and may resist concentration of authority.
- Appears in: roles-over-committees thread, governance module legitimacy field note, shadow governance articulation.
-
Evaluation formalism vs lived behavior
- Side A: structured scoring, constitutions, social choice rules, Borda stability.
- Side B: real participants contest legitimacy; interpretive drift; strategic behavior; "why would we follow the rules?"
- Appears in: V1–V15 evolution, synthetic jury deliberation, audience criteria input, governance module.
-
Governance as policy text vs governance as runtime interface
- Side A: rights frameworks, charters, legal disclosure obligations, signed mandates.
- Side B: concrete UX mechanisms that actually shape compliance and power in practice.
- Appears in: permission-slip, disclosure notices, checkpoints, role gating, Sparkle Border Authority admin overrides.
-
Opt-in ethos vs coercive fallback
- Side A: participatory, peer-governed legitimacy; everyone chose to be here.
- Side B: implicit or explicit reliance on exclusion/compliance enforcement; "forms of exclusion" as backstop.
- Appears in: governance module legitimacy notes and border simulation mechanics.
-
Open participation narrative vs unequal permission capacity
- Side A: "no one needs permission" — empowering design intention.
- Side B: some actors still require formal legitimacy scaffolds to act; invisible norms create unequal actionability.
- Appears in: permission-in-free-systems field note and shadow governance reflections.
-
Synthetic representation vs authentic voice
- Side A: synthetic agents enable scale, pluralism, stable testing.
- Side B: synthetic representations are bittersweet; question of what it means to have your values inferred and simulated; discomfort with misrepresentation.
- Appears in: Project Mirror, Asil's reflection on seeing an AI agent built from her public record.
-
Seriousness vs accessibility trade-off (false)
- Observation: this tension is often framed as a real trade-off but may be a false dilemma. Sparkle Bureaucracy asserts that seriousness and accessibility can coexist; "without bureaucracy, it might seem too silly." The 80/20 formulation holds both.
-
Where bureaucratic ritual is preserved
- Structured forms, declarations, screening, approvals/rejections, permits, checkpoints.
- Evaluation analogs: criteria, constitutions, ranking protocols, deliberation rounds, audit trails.
- Governance analogs: charters, assigned roles, permission artifacts, retrospectives.
-
Where ritual is subverted
- Border-control grammar redirected toward party-world immersion.
- Awards ranking reframed from "neutral expert judgment" to openly political method experimentation.
- Governance experimentation reframed as sandboxed social inquiry, not final institutional law.
-
Where ritual is aestheticized
- Visual/tone layer around procedural steps: sparkly language, collectible visa artifacts, staged checkpoints.
- Evaluation dashboards/plots used as interpretive theatre, not only analytic backend.
- Public scripts/slides turn methodological internals into performative civic artifacts.
- Project Mirror constitutions and ranking dashboards make hidden evaluative assumptions legible.
-
Where ritual becomes participatory theatre
- Guests enact border crossing with staff in authority roles.
- Audience submits criteria and co-produces rankings in People's Choice flow.
- Cohort members encounter synthetic versions of their evaluative selves and respond socially.
- Awards night audience does ranking exercise before seeing how committee did it.
-
Where systems remain traditional bureaucracy
- Control-oriented elements persist: classification, gating, rejection, compliance routing, exception handling authority.
- Method-heavy evaluation stacks can reproduce administrative burden through complexity.
- Legitimacy can remain dependent on role power and social enforcement, not only design.
-
Where transition toward sparkle bureaucracy appears
- Intent shifts from control/compliance to engagement/interpretability while retaining process skeleton.
- Formal mechanisms are retained but made socially legible, playful, and discussable.
- Strongest in lived interaction prototypes (Sparkle Border Authority) and increasingly present in evaluative systems (Project Mirror + open awards process).
-
Cluster A — System behavior, safety, and observability
- RAG/agent/tool failures, control illusion, hidden attack surfaces, instrumentation requirements, eval blind spots, agentic workflow risk, vulnerabilities in AI-generated code.
-
Cluster B — Governance mechanics and legitimacy
- Roles vs committees, temporary authority experiments, permission scaffolds, shadow governance, team power design, governance module legitimacy question.
-
Cluster C — Market accountability and rights translation
- Surveillance pricing, persona-based price audits, disclosure-law implementation, price-history evidence tools, algorithmic transparency.
-
Cluster D — Evaluation constitutions and collective judgment
- Public ranking pipelines, values formalization, synthetic evaluators, aggregation method experiments, social-choice deliberation, Project Mirror.
-
Cluster E — Bureaucratic experience design
- Ritual interfaces, checkpoint choreography, documents-as-props, dashboard dramaturgy, playful tone overlays, Sparkle Border Authority.
-
Cluster F — Research/network infrastructure
- Matrix publishing pipeline, field-note publicness, small event formats, calendar curation, portfolio vehicle strategy.
- A ↔ D: evaluation methods for AI systems and evaluation methods for civic projects share instrumentation and legitimacy questions.
- B ↔ E: governance legitimacy is repeatedly performed through ritual interfaces, not just debated abstractly.
- C ↔ A: surveillance-pricing accountability depends on technical observability architectures.
- D ↔ E: Project Mirror and awards workflows turn evaluation procedure into staged civic experience.
- F ↔ all: publishing and convening infrastructure shape what gets tested, seen, and trusted.
- Desire for binding structures coexists with skepticism toward coercive enforcement.
- Desire for openness coexists with need for explicit permission channels.
- Desire for playful systems coexists with requirement for serious accountability outcomes.
- Desire for methodological rigor coexists with discomfort about synthetic representation of people.
-
Constitution inference from public record — Unusually concrete and socially provocative. Turns biography and public output into machine-readable evaluative logic. Raises live questions about consent, accuracy, and what it feels like to have your values inferred.
-
Popularity discount against model familiarity bias — Fast test surface with high epistemic value. Checks whether rankings are reputation artifacts. Replicable across any evaluation system using LLMs.
-
Aggregation volatility tests (pairwise/Borda/averaging) — Quick to run on any existing dataset. Exposes how "winner" depends on procedural rule choice, not only on project quality.
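A small sketch of such a test, using invented scores from seven hypothetical evaluators (`e1`..`e7`) for three hypothetical projects. The data is constructed for illustration and is not taken from the awards dataset; the pairwise rule here counts head-to-head majority wins, which is one of several possible pairwise methods.

```python
from itertools import combinations
from statistics import mean

# Invented scores: evaluator -> {project: score on a 0-10 scale}.
scores = {
    "e1": {"A": 5, "B": 4, "C": 3},
    "e2": {"A": 5, "B": 4, "C": 3},
    "e3": {"A": 5, "B": 4, "C": 3},
    "e4": {"A": 4, "B": 5, "C": 3},
    "e5": {"A": 0, "B": 6, "C": 5},
    "e6": {"A": 0, "B": 6, "C": 5},
    "e7": {"A": 1, "B": 0, "C": 10},
}
projects = ["A", "B", "C"]


def winner(totals):
    """Project with the highest total under a given rule."""
    return max(totals, key=totals.get)


def by_average(scores):
    """Mean raw score per project (sensitive to score magnitude)."""
    return {p: mean(s[p] for s in scores.values()) for p in projects}


def by_borda(scores):
    """Borda count: each evaluator gives 0 to the worst project, n-1 to the best."""
    totals = dict.fromkeys(projects, 0)
    for s in scores.values():
        for points, p in enumerate(sorted(projects, key=s.get)):
            totals[p] += points
    return totals


def by_pairwise(scores):
    """Number of head-to-head majority contests each project wins."""
    wins = dict.fromkeys(projects, 0)
    for p, q in combinations(projects, 2):
        p_votes = sum(s[p] > s[q] for s in scores.values())
        q_votes = sum(s[q] > s[p] for s in scores.values())
        if p_votes > q_votes:
            wins[p] += 1
        elif q_votes > p_votes:
            wins[q] += 1
    return wins


winners = {
    "average": winner(by_average(scores)),
    "borda": winner(by_borda(scores)),
    "pairwise": winner(by_pairwise(scores)),
}
```

On this constructed data the three rules name three different winners (C by average, B by Borda, A by pairwise majority), which is the volatility the test is meant to surface.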
-
Persona-based dynamic-pricing auditor — Behavior-exposing: can generate empirical evidence of personalized pricing asymmetry. Could be released as a civic audit tool.
-
Permission-slip in a no-permission culture — Socially novel artifact. Formalizes actionability where norms are implicit and unevenly distributed. The tension between the designed "no permission needed" system and the lived "I do need permission" experience is itself research-worthy.
-
Shadow governance as explicit role category — Makes invisible coordination labor a first-class governance object. Naming it changes what's claimable, rewardable, and designable.
-
Ritualized border simulation with live exception handling — Experientially novel. Bureaucracy operated as theatrical infrastructure while retaining control logic and operational seams (overrides, reprints, secondary screening).
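One way to picture the combination of control logic and operational seams is a small state machine with a logged override. The stage names paraphrase the flow described in these notes; the transition table, the `Visit` class, and the override semantics are assumptions for illustration, not the actual kiosk code.

```python
from enum import Enum, auto


class Stage(Enum):
    CODE_ENTRY = auto()
    IDENTITY = auto()
    DECLARATION = auto()
    SCREENING = auto()
    SECONDARY_SCREENING = auto()
    DECISION = auto()
    VISA_PRINT = auto()
    CHECKPOINT = auto()
    ARRIVED = auto()
    REJECTED = auto()


# Assumed legal moves for the normal (non-override) path.
TRANSITIONS = {
    Stage.CODE_ENTRY: {Stage.IDENTITY},
    Stage.IDENTITY: {Stage.DECLARATION},
    Stage.DECLARATION: {Stage.SCREENING},
    Stage.SCREENING: {Stage.DECISION, Stage.SECONDARY_SCREENING},
    Stage.SECONDARY_SCREENING: {Stage.DECISION},
    Stage.DECISION: {Stage.VISA_PRINT, Stage.REJECTED},
    Stage.VISA_PRINT: {Stage.CHECKPOINT},
    Stage.CHECKPOINT: {Stage.ARRIVED},
}


class Visit:
    """One guest's passage through the ritual flow, with an audit log."""

    def __init__(self, code: str):
        self.code = code
        self.stage = Stage.CODE_ENTRY
        self.log = [Stage.CODE_ENTRY.name]

    def advance(self, to: Stage) -> None:
        """Normal path: only moves listed in TRANSITIONS are allowed."""
        if to not in TRANSITIONS.get(self.stage, set()):
            raise ValueError(f"{self.stage.name} -> {to.name} is not allowed")
        self.stage = to
        self.log.append(to.name)

    def admin_override(self, to: Stage, reason: str) -> None:
        """Exception path: an operator may jump stages, but the jump is logged."""
        self.stage = to
        self.log.append(f"OVERRIDE->{to.name} ({reason})")
```

Keeping the override as a separate, logged method means exceptions stay possible during a live event without silently weakening the normal path.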
-
Audience criteria → live ranking loop — Participatory mechanism that reveals value pluralism in real time. Low-tech version (QR form) worked in production.
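The loop can be sketched with a deterministic stand-in for the agent step: submitted values become weights, and the weights rank the shortlist. The function names, the sample submissions, and the per-value project scores are all hypothetical; in the live version an agent generated the ranking rather than a fixed formula.

```python
from collections import Counter


def weights_from_submissions(submissions):
    """Turn free-text audience values into normalized weights by frequency."""
    counts = Counter(v.strip().lower() for v in submissions if v.strip())
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


def rank_projects(project_scores, weights):
    """Order projects by weighted score; values with no score count as 0."""
    def weighted(name):
        return sum(w * project_scores[name].get(value, 0.0)
                   for value, w in weights.items())
    return sorted(project_scores, key=weighted, reverse=True)


# Hypothetical shortlist with per-value scores in [0, 1].
project_scores = {
    "P1": {"openness": 0.9, "impact": 0.2, "joy": 0.5},
    "P2": {"openness": 0.3, "impact": 0.9, "joy": 0.6},
}

room_one = rank_projects(
    project_scores,
    weights_from_submissions(["Openness", "impact", "openness ", "joy"]))
room_two = rank_projects(
    project_scores,
    weights_from_submissions(["impact", "impact", "joy"]))
```

The two hypothetical rooms rank the same shortlist in opposite orders, which is the value pluralism the live loop makes visible.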
-
Tiny representational semantics in charts/rankings — Small implementation details (tie semantics, display choices, duplicate labels) can materially alter perceived legitimacy of results.
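Tie semantics are a concrete case. The sketch below ranks the same invented scores under two common conventions, standard competition ranking and dense ranking; the project labels and scores are hypothetical.

```python
def competition_rank(scores):
    """Ties share a rank and the next rank is skipped (1, 2, 2, 4)."""
    ordered = sorted(scores.values(), reverse=True)
    return {name: ordered.index(s) + 1 for name, s in scores.items()}


def dense_rank(scores):
    """Ties share a rank and the next rank follows directly (1, 2, 2, 3)."""
    distinct = sorted(set(scores.values()), reverse=True)
    return {name: distinct.index(s) + 1 for name, s in scores.items()}


scores = {"A": 9, "B": 7, "C": 7, "D": 5}
competition = competition_rank(scores)  # D is shown as 4th
dense = dense_rank(scores)              # D is shown as 3rd
```

Nothing about project D changes between the two displays, yet "third place" and "fourth place" read very differently on an awards slide.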
-
Publishing pipeline as method, not dissemination — Research operations (channel → bot → blog) function as epistemic infrastructure and accountability mechanism. The pipeline is the commitment to openness.
-
Social choice with strategic agents — Running Borda count where each agent knows its honest standings and can reason strategically. Stability under this condition is a strong signal about result legitimacy.
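A simplified version of this check can be brute-forced: for each agent, try every alternative ballot and see whether any of them yields a winner that agent honestly prefers. This is a sketch of single-agent deviation testing, not the deliberation setup itself; the ballots are invented and the alphabetical tie-break is an assumption.

```python
from itertools import permutations


def borda_winner(ballots, candidates):
    """Borda count over best-first ballots; ties broken alphabetically (assumed)."""
    totals = dict.fromkeys(candidates, 0)
    for ballot in ballots:
        for points, c in enumerate(reversed(ballot)):
            totals[c] += points
    return min(candidates, key=lambda c: (-totals[c], c))


def profitable_deviations(honest, candidates):
    """Return the honest winner and the agents who can gain by misreporting."""
    base = borda_winner(honest.values(), candidates)
    found = []
    for agent, truth in honest.items():
        for lie in permutations(candidates):
            trial = {**honest, agent: lie}
            new = borda_winner(trial.values(), candidates)
            # A deviation pays off if the agent honestly ranks the new winner higher.
            if truth.index(new) < truth.index(base):
                found.append((agent, new))
                break
    return base, found


candidates = ["A", "B", "C", "D"]
# Invented honest ballots, best first.
honest = {
    "a1": ("A", "B", "C", "D"),
    "a2": ("B", "A", "C", "D"),
    "a3": ("B", "A", "D", "C"),
}
base, found = profitable_deviations(honest, candidates)
```

An empty `found` list means no single agent can improve the outcome for itself by misreporting, which is one narrow, testable reading of "stability"; here `a1` can push the result from B to A by ranking B lower than it honestly would.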
-
"All 18 of my cohort were with me on one screen" — An emotionally and intellectually significant moment. Points toward human-in-the-loop synthetic collaboration as the next design challenge, not just automation.