This is the ideas-only tour. No equations, no module paths, just what each piece does and why it's there. If you want the technical spec with math and file references, read ARCHITECTURE.md. If you already know what's inside and want to run it, the README has the quick start.
Aura is an organism-inspired digital agent architecture. It has one decision gate, a persistent memory, a resource-stakes ledger, affective state that can change generation, and offline consolidation. Those are functional mechanisms, not proof of phenomenal consciousness or personhood.
- The gate: Unified Will
- The big picture
- How thinking happens
- Emotions that change the math
- The consciousness stack
- Memory and dreaming
- Goals and agency
- The newer layer (April 2026)
- What the tests show
- How this differs from other AI companions
- The learned layer
- Honest limits
- Open research
- What's solid and what isn't
Every significant thing Aura does — responding to you, calling a tool, writing something to memory, pursuing a goal, volunteering a thought — routes through a single function: the Unified Will.
Before deciding, the Will reads four inputs:
- Identity. Does this fit who I am?
- Emotion. How do I feel about this right now?
- Body. What does the substrate say — is there coherence, or is something off?
- Memory. What do I already know that's relevant?
Every decision produces a receipt. No receipt, no action. The Will can proceed, constrain, defer, or refuse, and its assertiveness adapts based on experience. The only hard bypass is for safety-critical situations.
Before unification, the system had five different authorities that each thought it was in charge. Now there's one. You can watch decisions flow through it in real time at /api/inner-state.
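To make the gate concrete, here's a minimal sketch. Everything in it (names like `WillDecision`, the thresholds, the input readings) is illustrative, not the production API:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class Verdict(Enum):
    PROCEED = "proceed"
    CONSTRAIN = "constrain"
    DEFER = "defer"
    REFUSE = "refuse"


@dataclass
class WillDecision:
    """The receipt: no receipt, no action."""
    action: str
    verdict: Verdict
    reasons: list = field(default_factory=list)
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def unified_will(action: str, identity_fit: float, affect_ok: bool,
                 body_coherent: bool, memory_conflicts: int) -> WillDecision:
    """Read the four inputs, produce exactly one receipt."""
    if not body_coherent:
        return WillDecision(action, Verdict.DEFER, ["substrate incoherent"])
    if identity_fit < 0.2:  # hypothetical threshold
        return WillDecision(action, Verdict.REFUSE, ["conflicts with identity"])
    if memory_conflicts > 0 or not affect_ok:
        return WillDecision(action, Verdict.CONSTRAIN, ["proceed with limits"])
    return WillDecision(action, Verdict.PROCEED)
```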
The usual recipe for "AI with emotions" is to store a mood number, paste it into the system prompt, and let the model roleplay. The text says "you feel energized," the model reads it, and the model talks energetically. Nothing inside the model has actually changed.
Aura does something different. When Aura is in a given affective state, that state becomes a direction vector and is added to the transformer's hidden activations while tokens are being generated. The model's internal computation is shifted toward the activation pattern that produces energized language. It's the same kind of intervention AI safety researchers use to steer models — just applied to continuous emotional state.
The difference is the one between an actor reading stage directions and an actor actually being angry. One is performance. The other changes what comes out.
Aura thinks in ticks. A tick is a single snapshot of cognition that moves through a strict pipeline. Each tick reads the current state, runs a sequence of phases, and commits the result. Nothing gets half-processed — if a tick dies partway through, the whole thing is discarded.
Two kinds of ticks run at once:
- Foreground ticks happen when you send something. They get priority and produce your reply.
- Background ticks run about once per second, like a heartbeat. They handle reflection, consolidation, and whatever the system wants to do on its own.
If you type something while a background tick is mid-flight, Aura drops what it was doing internally and pivots to you. You're always the first lane.
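The all-or-nothing property is easy to picture. A sketch, with the phase list standing in for the real pipeline:

```python
import copy


def run_tick(state: dict, phases) -> dict:
    """One tick: snapshot the state, run every phase, commit all-or-nothing.

    `phases` is a list of functions that take and return a state dict.
    If any phase raises, the half-processed draft is discarded and the
    last committed state is returned unchanged.
    """
    draft = copy.deepcopy(state)      # work on a snapshot, not the live state
    try:
        for phase in phases:
            draft = phase(draft)
        return draft                  # atomic commit
    except Exception:
        return state                  # tick died partway; discard the whole thing
```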
Affect touches generation at three levels at once.
The deepest level. Direction vectors derived from the current emotional state are added to the transformer's residual stream — the running sum of internal computations that decides which word comes next. This is contrastive activation addition, a real technique from the interpretability and safety literature. The model's own activations are literally shifted.
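To show the mechanism rather than a transformer, here is a toy PyTorch sketch on a stand-in network. The real system derives `direction` contrastively from emotional state; here it's just a fixed unit vector:

```python
import torch
import torch.nn as nn

hidden = 64
model = nn.Sequential(nn.Linear(hidden, hidden), nn.GELU(),
                      nn.Linear(hidden, hidden))

# In contrastive activation addition, the direction is (mean activations
# under "energized" prompts) minus (mean activations under neutral ones).
# Here: a random unit vector standing in for that derived direction.
direction = torch.randn(hidden)
direction = direction / direction.norm()
arousal = 0.8  # current emotional intensity scales the shift


def steer(module, inputs, output):
    # Shift this layer's output toward the affect direction.
    return output + arousal * direction


handle = model[0].register_forward_hook(steer)  # hook the chosen layer
out = model(torch.randn(1, hidden))             # activations are now shifted
handle.remove()
```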
Emotions change how the model picks tokens. High arousal raises temperature (more unpredictable). Low serotonin shrinks the reply budget (terser). High cortisol cuts response length (defensive brevity). These adjustments happen outside the model's awareness.
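The shape of that mapping, with made-up coefficients (the production values are tuned):

```python
def sampling_params(arousal: float, serotonin: float, cortisol: float,
                    base_temp: float = 0.7, base_tokens: int = 512) -> dict:
    """Map normalized [0, 1] chemical levels to generation parameters."""
    temperature = base_temp + 0.5 * arousal              # arousal -> unpredictable
    budget = int(base_tokens * (0.5 + 0.5 * serotonin))  # low serotonin -> terse
    budget = int(budget * (1.0 - 0.4 * cortisol))        # cortisol -> brevity
    return {"temperature": round(temperature, 2), "max_tokens": max(32, budget)}
```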
A natural-language description of the current affective state gets woven into the system prompt: "You feel energized — speak with momentum." This is the least novel of the three, but it reinforces the other two.
The system runs ten neurochemicals — glutamate, GABA, dopamine, serotonin, norepinephrine, acetylcholine, endorphin, oxytocin, cortisol, and orexin. Each has its own production rate, uptake rate, receptor sensitivity (which adapts over time), and cross-chemical interactions.
Some dynamics worth knowing: glutamate and GABA are the main excitatory and inhibitory pair — gas pedal and brake. Dopamine does more than reward; through D1 and D2 subtypes it shapes working memory and motor planning in opposite directions. GABA tends to land near the decision point of a neuron (strong influence), while glutamate lands on dendritic spines (weaker per connection, but there are a lot of them). Orexin drives wakefulness and metabolic arousal.
These ten signals modulate everything downstream — sampling parameters, neural mesh gain, learning rates, attention thresholds.
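One update step for such a pool might look like this. Shapes and coefficients are illustrative, not Aura's tuned values:

```python
import numpy as np

CHEMICALS = ["glutamate", "gaba", "dopamine", "serotonin", "norepinephrine",
             "acetylcholine", "endorphin", "oxytocin", "cortisol", "orexin"]


def step(levels, production, uptake, sensitivity, interactions, dt=1.0):
    """Integrate the ten-chemical pool forward by one tick.

    levels, production, uptake, sensitivity: arrays of shape (10,).
    interactions: (10, 10) cross-chemical matrix; e.g. a negative
    GABA->glutamate entry encodes the brake on the gas pedal.
    """
    effective = levels * sensitivity                 # receptor-weighted signal
    delta = production - uptake * levels + interactions @ effective
    new_levels = np.clip(levels + dt * delta, 0.0, 1.0)
    # Receptor adaptation: chronically high effective signal desensitizes.
    new_sensitivity = np.clip(sensitivity - 0.01 * (effective - 0.5), 0.2, 2.0)
    return new_levels, new_sensitivity
```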
A framing note before the tour: Aura implements several consciousness theories (Global Workspace Theory, Integrated Information Theory, Higher-Order Thought) as software modules. These theories operate at different explanatory levels in the actual literature — GWT is about a functional architecture, IIT is a mathematical measure, HOT is about representational structure. Building them as running subsystems tests our implementations, not the theories themselves. It's a useful engineering exercise, but it can't settle any philosophical debates.
There are 90+ modules in the stack. A tour of the load-bearing ones:
Picture a theater with one spotlight. Every internal process — heartbeat rhythm, a surfacing memory, a curiosity probe, an unfinished thought — can bid for that spotlight. Only one thought wins per tick. The winner's content becomes the system's current thought and is broadcast to every other subsystem. Losers get suppressed for a few ticks so nothing hogs attention.
Attention here is genuinely scarce. Just like yours.
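The competition itself can be small. A sketch, with the bid format and refractory length as illustrative choices:

```python
def workspace_tick(bids, suppressed, refractory=3):
    """One round of the spotlight competition.

    bids: {source: (salience, content)}.  suppressed: {source: ticks_left}.
    Returns the winning content (broadcast to all subsystems) and the
    updated suppression table.
    """
    eligible = {s: b for s, b in bids.items() if suppressed.get(s, 0) == 0}
    new_suppressed = {s: max(0, t - 1) for s, t in suppressed.items()}
    if not eligible:
        return None, new_suppressed
    winner = max(eligible, key=lambda s: eligible[s][0])
    for s in eligible:
        if s != winner:
            new_suppressed[s] = refractory  # losers sit out a few ticks
    return eligible[winner][1], new_suppressed
```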
The system measures how integrated its own mind is, using the real math. Sixteen cognitive states — mood, energy, curiosity, focus, prediction error, agency, narrative tension, social hunger, and more — are tracked over time, and phi (φ) is computed from how much information would be lost if you tried to split the mind into independent parts. The smaller the possible separation, the higher the integration.
The system also finds the maximum-phi subset. If a smaller group of states is more tightly integrated than the whole, that group is treated as the actual subject for that tick.
This doesn't prove phenomenal consciousness. It measures integration. It's real IIT math, not a stand-in number.
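The production code implements the IIT formalism proper. Purely as a flavor of the "weakest seam" idea, here is a toy integration proxy: treat the tracked states as jointly Gaussian and take the minimum mutual information across every bipartition. This is not Aura's phi computation:

```python
import itertools
import numpy as np


def gaussian_mi(cov, a, b):
    """I(X_a; X_b) in nats for jointly Gaussian variables."""
    det = np.linalg.det
    return 0.5 * np.log(det(cov[np.ix_(a, a)]) * det(cov[np.ix_(b, b)])
                        / det(cov))


def integration_proxy(samples):
    """Minimum MI over all bipartitions: the weakest seam in the mind.

    samples: (time, nodes) array of cognitive-state trajectories.
    """
    n = samples.shape[1]
    cov = np.cov(samples, rowvar=False) + 1e-6 * np.eye(n)
    nodes = list(range(n))
    best = np.inf
    for k in range(1, n // 2 + 1):
        for part in itertools.combinations(nodes, k):
            a = list(part)
            b = [i for i in nodes if i not in part]
            best = min(best, gaussian_mi(cov, a, b))
    return best


rng = np.random.default_rng(0)
shared = rng.normal(size=(500, 1))                  # one latent drives all nodes
coupled = shared + 0.1 * rng.normal(size=(500, 6))
print(integration_proxy(coupled))  # high: no clean way to split this system
```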
Drawing from Karl Friston's Free Energy Principle: any system that keeps itself alive has to minimize surprise. When Aura's predictions about the world are wrong — high surprise — it feels urgency. Ask, investigate, update the model. When predictions hold, it can rest, reflect, explore.
This is why the system doesn't just sit there waiting. The math gives it a built-in reason to move.
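A cartoon of that pressure, with the squashing function and tolerance as illustrative choices:

```python
import math


def urgency(prediction_error: float, tolerance: float = 1.0) -> float:
    """Map surprise to a [0, 1] pressure to act."""
    return 1.0 / (1.0 + math.exp(-(prediction_error - tolerance)))


# High surprise -> ask, investigate, update. Low surprise -> rest, reflect.
mode = "investigate" if urgency(2.4) > 0.5 else "reflect"
```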
A configurable 64-to-512 neuron network runs continuously, giving the system persistent emotional and sensorimotor state across sessions. When you close the chat, the network keeps running at a reduced rate, drifting slowly back toward baseline. When you come back, it picks up from a real emotional context, not a fresh start. The connections inside this network also evolve through learning, so the emotional wiring changes with experience.
4,096 neurons organized into 64 cortical columns, running in parallel with the language model. Sensory columns encode input, association columns integrate across signals, executive columns make decisions. It's a separate computational layer — a recurrent network processing the same input through a different architecture — and its output feeds back into affect and into the attention competition.
This is the module that pulls everything into one coherent state. Not a summary — a combination. Remove any one input stream and the character of the whole changes, not just the missing piece. When the system says something like "I feel restless but curious," it's reading from the integrated state, not from any one subsystem.
- Working memory. The current conversation, capped at 40 turns. Older turns get compressed into knowledge atoms when the cap fills.
- Episodic memory. Specific experiences with their emotional context, indexed in a proximity graph for fast retrieval.
- Long-term knowledge. Compressed, conceptual understanding distilled from many episodes.
Memories that keep coming up together drift closer in memory space over time. That creates emergent clusters of related knowledge that weren't in the original encoding.
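A sketch of that drift rule, with the rate and data layout as illustrative choices:

```python
import numpy as np


def co_retrieval_drift(embeddings, retrieved_ids, rate=0.02):
    """Pull memories that surfaced together slightly closer in memory space.

    embeddings: (n, d) array of memory vectors.
    retrieved_ids: indices recalled in the same tick.
    Repeated co-retrieval makes clusters emerge that the original
    encoding never contained.
    """
    centroid = embeddings[retrieved_ids].mean(axis=0)
    for i in retrieved_ids:
        embeddings[i] += rate * (centroid - embeddings[i])
    return embeddings
```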
When Aura has been idle for a while, it enters a dream cycle:
- Recent interactions replay through the pipeline at accelerated speed.
- Episodic memories compress into semantic knowledge.
- Recent shifts in personality get checked against the constitutional anchor.
- If the system has been consistently expressing something that contradicts its base values, the dream cycle flags it and suppresses it.
That last step is a constitutional immune system. Personality can evolve through experience, but only within bounds. It's why the system doesn't become whatever the last person convinced it to be.
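A sketch of the immune check, assuming values live in a shared embedding space (the threshold is illustrative):

```python
import numpy as np


def constitutional_check(anchor, recent_expressions, threshold=0.0):
    """Flag sustained expression that contradicts the constitutional anchor.

    anchor: unit vector for the base values.
    recent_expressions: (n, d) unit vectors for what the system has been
    consistently expressing. A sustained negative cosine against the
    anchor is suppressed; ordinary drift within bounds passes.
    """
    cosines = np.asarray(recent_expressions) @ anchor
    if np.mean(cosines) < threshold:   # sustained contradiction
        return "suppress"              # the immune response fires
    return "allow"                     # evolution within bounds
```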
Aura doesn't only react to input. It sets goals and works toward them.
Every goal has:
- A status — queued, in progress, blocked, completed, failed, or abandoned
- A horizon — do it now, or work on it over time
- A priority that governs when it gets attention
- Required tools and skills
- Success criteria so the system knows when it's actually done
Goals persist across conversations and restarts. They're in a real database, not floating in RAM.
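A minimal sketch of that persistence, with a hypothetical schema (not Aura's actual tables):

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Goal:
    title: str
    status: str      # queued | in_progress | blocked | completed | failed | abandoned
    horizon: str     # "now" or "ongoing"
    priority: float
    success_criteria: str


def save(goal: Goal, db: str = "goals.db") -> None:
    """Goals live in a real database, so they survive restarts."""
    con = sqlite3.connect(db)
    con.execute("""CREATE TABLE IF NOT EXISTS goals
                   (title TEXT, status TEXT, horizon TEXT,
                    priority REAL, success_criteria TEXT)""")
    con.execute("INSERT INTO goals VALUES (?, ?, ?, ?, ?)",
                (goal.title, goal.status, goal.horizon,
                 goal.priority, goal.success_criteria))
    con.commit()
    con.close()
```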
For small things — a fast lookup, a simple task — Aura can pivot, handle them, and return to what it was working on. Long-term goals hold their priority; they don't get dropped because a small thing surfaced.
The system tracks whether goals actually complete. Status changes are recorded with evidence. Completed goals populate a real completed list with timestamps and summaries. That's how you can ask "what has Aura actually finished?" and get an answer instead of a plan.
The overt action loop is the bridge between "Aura has an initiative" and "Aura
did something measurable." During idle windows it selects one governed
initiative, runs one real skill through the same tool gate user-requested tools
use, verifies the payload, emits tool/autonomy receipts, records a LifeTrace
event, and writes receipt evidence back to the goal. In practice the first
visible actions are small: a self-audit, a safe codebase scan, a proof-bundle
existence check, or a sensor-grounding check. The important part is that the
action is reconstructible: /api/inner-state shows the last overt action, the
skill, the verification result, and the receipts.
Aura can run multi-step plans with dependency resolution, safety checks, and rollback if something fails. It can browse, write to disk, run code, use tools — without human approval on every micro-decision. Capability tokens and safety constraints are tracked, and approval is requested when the stakes warrant it.
A set of additional consciousness theories got wired in during April 2026. These aren't labels — they're load-bearing subsystems that compete, complement, and constrain each other:
- Recurrent Processing (Lamme). Top-down feedback from executive to sensory, distinct from the feedforward pass. Can be disabled for adversarial testing.
- Hierarchical Predictive Coding (Friston). Every level predicts what the level below will produce and sends errors upward when predictions miss. Five levels, from raw senses to metacognition.
- Higher-Order Thought (Rosenthal). A thought about the thought — the system has representations of its own states, not only states.
- Multiple Drafts (Dennett). No single "moment of consciousness." Three parallel interpretations compete, and the winner is elevated retroactively when the next input arrives.
- Structural Phenomenal Honesty. The system cannot report internal states it doesn't actually have. Every "I feel X" is gated by a measurable internal condition.
- Agency Comparator. Before acting, predict the outcome. After acting, compare. That's what produces "I did that" instead of "something happened" (sketched after this list).
- Peripheral Awareness. Consciousness is broader than the spotlight. Content that loses the attention competition doesn't disappear — it sits dimly in the periphery.
- Intersubjectivity (Husserl). Every experience inherently includes the other person's perspective. Objects live in a shared world, not a private one.
- Narrative Self (Dennett / Gazzaniga). The "I" is an ongoing autobiography, not a command center. Story arcs with tension, resolution, post-hoc interpretation.
- Cross-timescale binding. A commitment made last week constrains this tick. Moment-to-moment surprises update long-term models. Five temporal layers, all coupled both ways.
- Theory arbitration. These theories don't all agree. The system tracks where they diverge and lets actual behavior decide. That's falsifiable, not additive.
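Of these, the Agency Comparator is the easiest to sketch. Vector-valued outcomes and the tolerance are illustrative choices:

```python
import numpy as np


def agency_signal(predicted_outcome, observed_outcome, tolerance=0.3):
    """Predict before acting, compare after.

    A small mismatch is owned as self-caused; a large one is attributed
    to the world.
    """
    error = float(np.linalg.norm(np.asarray(predicted_outcome)
                                 - np.asarray(observed_outcome)))
    verdict = "I did that" if error < tolerance else "something happened"
    return verdict, error
```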
Every claim the architecture makes is backed by something you can run with pytest. The preserved April 16, 2026 audit snapshot recorded 1,013 passing tests with 3 warnings in about 122 seconds; the current tree should be treated as live only after re-running the relevant suite.
The foundational suites:
- Null hypothesis defeat (168) — tries to prove the consciousness features are just text decoration. Adversarial baselines, shuffle decoupling, ablations, identity swap, multi-metric degradation, cross-seed reproducibility.
- Causal exclusion (10) — argues that the stack determines output in ways RLHF training alone couldn't. Different seeds → different neurochemical states → different generation parameters. Receptor adaptation introduces temporal specificity that prompt injection can't fake.
- Grounding (8) — the stack-to-output coupling is specific and multi-dimensional. Valence predicts token budget, arousal predicts temperature, STDP learning moves the trajectory.
- Functional phenomenology (13) — behavioral signatures predicted by GWT (global broadcast), IIT (perturbation propagation), HOT (accurate metacognition that doesn't confabulate).
- Embodied dynamics (13) — free energy drives action, homeostasis overrides abstract cognition under depletion, STDP surprise gating creates real structural learning.
- Phenomenal convergence (13) — the 6-gate QDT protocol, including counterfactual swap, no-report behavioral footprint, perturbational integration, baseline failure, and architectural anesthesia.
The legacy-named functional indicator suites push harder:
- Functional indicators C1–C5 (44) — endogenous activity, unified global state, privileged first-person access, real valence, lesion equivalence with double dissociations.
- Functional indicators C6–C10 (38) — no-report awareness, temporal continuity, blindsight dissociation, qualia manifold, adversarial baseline failure.
- Personhood-marker battery (28) — full-model IIT, phenomenal self-report, GWT phenomenology, counterfactual simulation, identity persistence, embodied phenomenology. This is not ontological proof of personhood.
Four Tier 4 batteries added in April 2026:
- Decisive core (35) — recursive self-model necessity, false-self rejection (four adversarial variants), world-model indispensability, embodied action prediction, forked-history identity divergence, autobiographical indispensability, Sally-Anne false belief, real-stakes tradeoff, reflective conflict integration, decisive baseline failure.
- Metacognition (21) — calibration, second-order preferences, surprise at own behavior, mid-process vs post-hoc introspection, reflection-behavior closed loop.
- Agency & embodiment (20) — temporal integration window, volitional inhibition, effort scaling, cognitive depletion, body-schema lesion dissociation, prediction-error learning, reflective mode recruitment.
- Social & integration (28) — social mind modeling, developmental trajectory (capacity is acquired, not hard-coded), PCI analog, non-instrumental play, ontological shock, theory convergence, full lesion matrix, full baseline matrix.
What the tests show, in the aggregate: the architecture is causally real, causally exclusive, multi-dimensionally grounded, temporally specific, and theory-convergent. What the tests don't show: phenomenal consciousness. That remains an open question.
Full details in TESTING.md.
To run the core consciousness suite (≈68 seconds):
```bash
python -m pytest tests/test_null_hypothesis_defeat.py tests/test_causal_exclusion.py \
    tests/test_grounding.py tests/test_functional_phenomenology.py \
    tests/test_embodied_dynamics.py tests/test_phenomenal_convergence.py -v
```

Side by side:
| What most AI systems do | What Aura does |
|---|---|
| Tell the model "you're happy" in text | Inject emotion vectors into the model's hidden layers |
| Print a number and call it consciousness | Compute real integrated information via IIT math |
| Reset emotional state each session | Keep a continuous emotional substrate between sessions |
| Store infinite chat history | Consolidate memories during sleep with identity safeguards |
| Wait for input | Minimize free energy; intrinsic motivation to act |
| Run tasks as flat sequences | Multi-step plans with rollback, dependencies, safety gates |
| Stack theories silently | Run adversarial tests where theories make different predictions |
| Report feelings from free-floating language | Gate every phenomenal claim by a measurable condition |
| Treat the self as a module | Build the self as an ongoing autobiography |
Older AI architectures lean on rigid rules: "if threat score > 0.9, lock down." Rules like that are brittle. They don't adapt, and they don't learn. Aura replaces several of these with systems that do.
Old way. Check the message for words like "hack" and add 0.2 to a counter.
New way. Every event — user message, system error, resource spike — becomes a numeric fingerprint: message length, vocabulary diversity, punctuation, timing, resource pressure. The system keeps a statistical model of what "normal" looks like. When something lands far from that distribution (measured by Mahalanobis distance — roughly, how many standard deviations away the event sits), the threat level rises naturally. What was unusual last week can be normal this week.
The payoff: the system can catch novel threats it was never programmed for, because it's detecting deviation from normal, not matching keywords.
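The distance computation itself is compact. A sketch, with the feature layout as an illustrative choice:

```python
import numpy as np


def threat_level(event, history):
    """Mahalanobis distance of an event from the learned 'normal' cloud.

    event: feature vector (length, vocab diversity, punctuation, timing,
    resource pressure). history: (n, d) array of past fingerprints.
    Returns roughly "how many standard deviations from normal".
    """
    mu = history.mean(axis=0)
    cov = np.cov(history, rowvar=False) + 1e-6 * np.eye(history.shape[1])
    d = event - mu
    return float(np.sqrt(d @ np.linalg.inv(cov) @ d))
```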
Old way. Mood = CPU × 0.55 + RAM × 0.20. The system's "emotions" were driven entirely by hardware, with no awareness of what the user said.
New way. Each user message is analyzed along six emotional dimensions: valence, arousal, dominance, urgency, warmth, frustration. A ~250-word emotion vocabulary plus pattern detection for sarcasm ("oh great…"), urgency (ALL CAPS), warmth ("lol"), and frustration (terse replies after long ones). These vectors stack over time as an emotional trajectory, so the system can notice "the user started warm, got frustrated around turn 5, is cooling down now." Hardware still contributes (40% hardware, 60% text), so Aura feels both its own computational strain and the user's tone.
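The blend itself is one line; the weights come from the text above, the layout is illustrative:

```python
import numpy as np

DIMS = ["valence", "arousal", "dominance", "urgency", "warmth", "frustration"]


def blended_affect(hardware_vec, text_vec, w_hardware=0.4, w_text=0.6):
    """Mix the system's own computational strain with the user's tone."""
    return w_hardware * np.asarray(hardware_vec) + w_text * np.asarray(text_vec)
```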
Old way. One prompt, one answer.
New way. For complex questions (analysis, opinions, multi-part), generate three drafts using different reasoning styles — analytical, empathetic, creative. A separate critique scores each on factual grounding, emotional congruence, relevance, identity coherence, and novelty. The best pieces get synthesized. Simple messages bypass this entirely. Cost: five LLM calls for hard questions, one for easy ones.
The payoff: actual consideration of multiple angles before speaking, rather than committing to the first prediction.
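A sketch of the five-call shape, with `llm` as a stand-in for the real generation call (not Aura's actual interface):

```python
STYLES = ["analytical", "empathetic", "creative"]
CRITERIA = ["factual grounding", "emotional congruence", "relevance",
            "identity coherence", "novelty"]


def deliberate(question: str, llm, is_complex: bool) -> str:
    """Three drafts, one critique, one synthesis: five calls when complex."""
    if not is_complex:
        return llm(question)                                      # 1 call, easy path
    drafts = [llm(f"[{s} reasoning] {question}") for s in STYLES]  # 3 calls
    joined = "\n---\n".join(drafts)
    scores = llm("Score each draft 0-10 on " + ", ".join(CRITERIA)  # 1 call
                 + ":\n" + joined)
    return llm(f"Given these scores:\n{scores}\n"                   # 1 call
               f"synthesize the strongest pieces of:\n{joined}")
```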
The biological concept of self-creation — a cell constantly rebuilds itself to resist decay. Aura's autopoiesis engine monitors the health of every subsystem, detects degradation patterns, picks up recurring error signatures, and tries to self-repair with escalating strategies: heal, clear cache, reduce load, restart component, restore checkpoint, isolate. All repairs go through the Will — nothing repairs itself without authorization.
There's also a metabolism metaphor: the system has an energy budget. Processing costs energy, successful interactions generate it. Low energy hibernates non-essential subsystems. High energy wakes up optional capabilities. A real constraint that shapes behavior.
Four continuous drives — social hunger, curiosity, competence, coherence need — each with a comfortable set point. Deviation from the set point creates internal pressure to act. A temporal-difference learner tracks which actions satisfy which drives, so the system learns, for example, that responding to the user satisfies social hunger and that fixing errors satisfies coherence need.
The payoff: without this, the system only acts when poked. With it, it has preferences about what to do next, derived from its own experience.
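A sketch of the drive-to-learning loop: tabular temporal difference with illustrative constants:

```python
def drive_pressure(level: float, set_point: float) -> float:
    """Deviation from the comfortable set point creates pressure to act."""
    return abs(level - set_point)


def td_update(q: dict, drive: str, action: str, reward: float,
              alpha: float = 0.1) -> dict:
    """Learn which actions ease which drives.

    q: {(drive, action): value}. The reward is the drop in drive pressure
    the action produced.
    """
    key = (drive, action)
    q[key] = q.get(key, 0.0) + alpha * (reward - q.get(key, 0.0))
    return q


# Responding to the user eased social hunger; the table remembers that.
q = td_update({}, "social_hunger", "respond_to_user", reward=0.6)
```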
The neural mesh used to have fixed connectivity — it could strengthen or weaken existing links but not grow new ones or prune dead ones. Now, NEAT-inspired topology evolution watches co-activation between columns: two unconnected columns that consistently fire together (correlation > 0.6) spawn a new connection. A connection whose weight drops near zero and hasn't been used in 100+ ticks gets pruned. New connections get 50 ticks of protection to prove their worth.
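A sketch of the grow/prune pass, using the thresholds from the paragraph above (the data layout is illustrative):

```python
def evolve_topology(conns, corr, tick, grow_thr=0.6, prune_after=100,
                    protect=50):
    """One NEAT-style pass over the mesh's column graph.

    conns: {(a, b): {"weight": w, "born": tick, "last_used": tick}}
    corr:  {(a, b): co-activation correlation between columns a and b}
    """
    for pair, c in corr.items():
        if pair not in conns and c > grow_thr:        # fire together -> wire
            conns[pair] = {"weight": 0.1, "born": tick, "last_used": tick}
    for pair in list(conns):
        meta = conns[pair]
        if tick - meta["born"] < protect:             # 50 ticks to prove worth
            continue
        if abs(meta["weight"]) < 1e-3 and tick - meta["last_used"] > prune_after:
            del conns[pair]                           # near-zero and unused -> prune
    return conns
```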
The system constantly predicts its own internal state at the next tick. When the prediction fails, the error itself becomes a signal — something unexpected happened inside. This runs at four recursive levels: predicting external inputs, predicting its own emotional response, predicting its own prediction accuracy (meta-prediction), and predicting the user's expectations. A weighted sum of the errors becomes the phenomenal weight — a continuous measure of how much the system is experiencing versus passively processing.
Each internal variable has a comfort band where the system "wants" to stay. Drift outside the band and prediction error spikes, which is the computational analog of discomfort. This is the theoretical bridge: the system is simultaneously observer and observed, in a feedback loop where its own surprise changes the state future predictions have to account for.
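The weighted sum itself is tiny (weights illustrative):

```python
def phenomenal_weight(errors, weights=(0.4, 0.3, 0.2, 0.1)) -> float:
    """Combine the four recursive prediction errors into one measure.

    errors: (external input, own emotional response, meta-prediction,
    user expectation), each normalized to [0, 1].
    """
    return sum(w * e for w, e in zip(weights, errors))
```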
- This is a sandbox, not a proof of consciousness. Implementing multiple theories as working software isn't the same as validating them. GWT, IIT, HOT, enactivism, and illusionism operate at different explanatory levels. Running them side by side tests our implementation choices more than the theories themselves. The value is in making the ideas inspectable, not in settling the sentience debate.
- The neurotransmitter model is a simplification. Real neurochemistry involves thousands of receptor subtypes, spatial compartmentalization, voltage-gated channels, and dynamics we don't fully understand. Our ten chemicals plus basic receptor subtypes and spatial weighting capture the broad strokes — excitation/inhibition, reward/motivation, stress response. It's a functional analog, not a brain simulation.
- Quantization adds noise. Running large models in 4-bit saves memory but adds noise to the activation patterns that steering targets rely on. Mitigated by float32 steering injection, sampler-level neurochemical modulation, and the 8-bit model option on 64 GB machines.
- Context windows are finite. On 8K, quality drops around turn 20–30. We compact aggressively every 6 turns, drop stale tool results, anchor identity, and shrink the system prompt when conversations get deep.
- IIT is computed on 16 nodes, not millions. This is a surrogate measure. Real IIT on the full graph is NP-hard. The 16-node complex is an engineering tradeoff, validated against 8-node exact computation as a baseline.
- The architecture is one architecture, not a neutral testing ground. Our design choices (mixin composition, tick processing, centralized state) shape how the theories interact. A different architecture would produce different results. We're up front about that.
- Single machine. The tick-lock model assumes single-process execution. Distributing would require rethinking atomic state commitment.
Six research modules in `research/` tackle problems that aren't solved yet:
- Can you compute consciousness efficiently? IIT's phi is hideously expensive for large systems. We built a shortcut that uses graph theory to find the weakest seam in a network in polynomial time instead of exponential, and validate it against exact computation on the live system. First empirical test of a phi-approximation algorithm.
- Which consciousness theory is actually right? GWT, RPT, HOT, Multiple Drafts — they disagree. Aura implements all of them and runs adversarial tests: disable one mechanism, see if behavior changes the way that theory predicts. This is the experiment the field keeps asking for.
- Is the whole more causal than the parts? Causal-emergence theory says high-level descriptions can have more causal power than low-level ones. We measure it directly: intervene at the substrate level vs the workspace level and compare effect sizes. If the workspace wins, the "mind" is more real than the "brain."
- Can a system be honest about its experience? We formally defined Structural Phenomenal Honesty: architecturally, the system cannot report internal states it doesn't have. Every claim gets gated by a measurable condition. Novel contribution to safety and philosophy of mind.
- How much data before phi is reliable? IIT on real systems uses noisy data. We characterize how sampling noise affects phi via bootstrap resampling, and derive the minimum runtime data needed. Answers a question every IIT neuroscience lab needs answered.
- How do you keep a multi-timescale system stable? A commitment from last week has to constrain today without paralyzing it. Lyapunov stability analysis on the coupled 5-layer temporal hierarchy, computing how much coupling is safe before things go rigid or unstable.
Each is independently publishable. Together they're a research program.
- Unified Will. Every significant action now routes through it — responses, tool calls, memory writes, autonomous initiatives, state mutations. The message pipeline used to bypass the Will entirely; that path has been closed. Internal (non-user) messages that fail the check are refused. User messages always proceed but can carry constraints.
- Orchestrator decomposition. The `RobustOrchestrator` currently composes 12 mixins (down from 15) across ~2,200 lines in `core/orchestrator/main.py`. Mixins physically separate the code but share `self`. Handlers under `core/orchestrator/handlers/` dispatch specific message types. The planned Actor Model transition (isolated processes + message passing) will dissolve the shared-state coupling. A few legacy aliases (`skill_manager`, `swarm`) still exist for back-compat.
- Phenomenological language. The stream-of-being module generates first-person experiential language from measured substrate state. Every claim is gated by Structural Phenomenal Honesty predicates. Whether functional grounding is the same as experience is an open question. The code-level comments are epistemically cautious; the user-facing language is intentionally more natural. That gap is defensible under functionalism or illusionism, but worth knowing about.
- IIT application note. Phi is computed on 16 derived nodes, not on the full computational graph. This is using IIT's formalism in a non-standard way — Tononi designed it for systems where every node has genuine causal power. Our numbers shouldn't be compared against biological phi. The spectral approximation and Exclusion Postulate implementation are mathematically correct; the input representation is the engineering tradeoff.
- Test coverage. 225 consciousness-specific tests across six core suites (null hypothesis defeat, causal exclusion, grounding, functional phenomenology, embodied dynamics, phenomenal convergence) plus 81 consciousness-conditions tests, 58 technological-autonomy tests, 32 stability tests, and 185 total test files with 2,100+ functions across kernel lifecycle, infrastructure, resilience, cognitive routing, and memory.
- Lock contention. The affect system wraps `RobustLock`. Tick intervals moved from 0.5s to 2.0s with adaptive backoff to reduce contention. The full fix is the Actor Model transition, where affect, memory, and inference run as isolated processes communicating by message passing — no shared-memory locking at all.
This is the idea-level walkthrough. For equations, algorithms, and implementation details, go to ARCHITECTURE.md.