Privacy and Redaction

Dream Mode's trust model depends on nothing sensitive leaving the slave. This page documents exactly what is redacted and how, the audit mechanism, and the privacy guarantees along with their limits.

What leaves the slave (complete inventory)

When Dream Mode is enabled on a slave, it pushes an envelope to master roughly every 2 hours. The envelope contains:

Field	Contents	How it's protected
`trajectories[]`	Task descriptions + tool sequences + result snippets	Deep-scrubbed + entropy-audited
`memorySummaries[]`	Agent memory summary text	Deep-scrubbed + entropy-audited
`consumptionFeedback[]`	Trajectory hashes + usage counts	Hashes only, no content
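
As a rough picture of the envelope described in the table, here is a minimal type sketch. Only the three top-level field names come from the table; the inner property names (`taskDescription`, `toolSequence`, `resultSnippet`, `trajectoryHash`, `usageCount`) are assumptions made for illustration and may not match the real types.

```typescript
// Hypothetical shapes inferred from the field table; the real types may differ.
interface Trajectory {
  taskDescription: string; // scrubbed before export
  toolSequence: string[];  // tool names in call order
  resultSnippet: string;   // scrubbed before export
}

interface ConsumptionFeedback {
  trajectoryHash: string; // hash only, never content
  usageCount: number;
}

interface LearningEnvelope {
  trajectories: Trajectory[];
  memorySummaries: string[];
  consumptionFeedback: ConsumptionFeedback[];
}

const exampleEnvelope: LearningEnvelope = {
  trajectories: [
    {
      taskDescription: "Fix failing unit test",
      toolSequence: ["read_file", "edit_file", "run_tests"],
      resultSnippet: "3 passed, 0 failed",
    },
  ],
  memorySummaries: ["Prefers small, focused commits"],
  consumptionFeedback: [{ trajectoryHash: "a1b2c3", usageCount: 4 }],
};
```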

Every string in trajectories + summaries goes through two rounds of scrubbing:

Pattern-based substitution
Shannon-entropy audit (catches what patterns miss)

Only after both rounds does anything leave the slave.

Pattern-based scrubbing

deepScrubForExport() walks every string field and replaces known-secret patterns with [REDACTED].

Patterns stripped

Pattern class	Examples	Match
Anthropic API key	`sk-ant-api03-...`	`/sk-ant-api\d+-[\w-]+/`
OpenAI API key	`sk-...`, `sk-proj-...`	`/sk-[\w-]{48,}/`
Google API key	`AIza...`	`/AIza[\w-]{35}/`
AWS access key	`AKIA...`, `ASIA...`	`/A(KIA|SIA)[A-Z0-9]{16}/`
GitHub PAT	`ghp_...`, `gho_...`, `ghs_...`	`/gh[opsu]_[\w]{36}/`
GitHub fine-grained	`github_pat_...`	`/github_pat_[\w]{82}/`
Slack tokens	`xoxb-...`, `xoxp-...`, `xoxa-...`	`/xox[abpr]-[\w-]+/`
Stripe	`sk_live_...`, `pk_live_...`, `rk_live_...`	`/(sk|pk|rk)_live_[\w]{24,}/`
Discord bot	`Bot [base64]`	`/Bot\s+[\w.-]+/`
JWT	`eyJ...three-parts...`	`/eyJ[\w-]+\.[\w-]+\.[\w-]+/`
Home paths	`/Users/`, `/home/`, `C:\Users\*`	Replace with `/Users/[REDACTED]/...`
Email addresses	`foo@bar.com`	`/[\w.+-]+@[\w-]+\.[\w.-]+/` → `[REDACTED_EMAIL]`
Private keys	`-----BEGIN ... PRIVATE KEY-----` blocks	Full block replaced
IPv4 addresses	`192.168.*`	Partially redacted: `192.168.x.x`

Canonical implementation: src/process/utils/redaction.ts.
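
To make the "walk every string field" idea concrete, here is a minimal sketch of a deep scrubber. It is not the code in redaction.ts: the names `SECRET_PATTERNS`, `scrubString`, and `deepScrub` are hypothetical, only a handful of the patterns listed above are included, and home-path and IPv4 handling are left out.

```typescript
// A few patterns copied from the table above; illustrative, not complete.
const SECRET_PATTERNS: RegExp[] = [
  /sk-ant-api\d+-[\w-]+/g,      // Anthropic API key
  /AIza[\w-]{35}/g,             // Google API key
  /A(KIA|SIA)[A-Z0-9]{16}/g,    // AWS access key
  /eyJ[\w-]+\.[\w-]+\.[\w-]+/g, // JWT
];
const EMAIL_PATTERN = /[\w.+-]+@[\w-]+\.[\w.-]+/g;

// Replace every known-secret match in a single string.
function scrubString(input: string): string {
  let out = input;
  for (const pattern of SECRET_PATTERNS) {
    out = out.replace(pattern, "[REDACTED]");
  }
  return out.replace(EMAIL_PATTERN, "[REDACTED_EMAIL]");
}

// Hypothetical walker: recursively scrubs every string in a JSON-like value
// and returns a new value, leaving the input untouched.
function deepScrub(value: unknown): unknown {
  if (typeof value === "string") return scrubString(value);
  if (Array.isArray(value)) return value.map(deepScrub);
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, unknown>;
    const result: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      result[key] = deepScrub(source[key]);
    }
    return result;
  }
  return value; // numbers, booleans, null, and undefined pass through unchanged
}
```

Returning a scrubbed copy rather than mutating in place keeps the original trajectory available locally while guaranteeing that only the copy is serialized for export.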

Shannon entropy audit

Pattern matching misses novel secret formats (new providers, custom tokens, base64-encoded secrets that don't match known patterns). The entropy audit is the safety net.

How it works

For every remaining string after pattern scrubbing:

Chunk into tokens ≥ 20 characters
Compute Shannon entropy per token
If entropy > 4.5 bits per character (for alphanumerics), flag as high-entropy
Drop the entire containing item (entire trajectory or entire summary)

Rationale: high-entropy long strings are overwhelmingly likely to be secrets (API keys, hashes, nonces). Rather than try to mask inline, we drop the whole trajectory — losing one learning is better than leaking one secret.
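
The steps above can be sketched as follows. This is an illustration under stated assumptions, not the shipped implementation: the whitespace tokenisation, the plain per-character entropy formula, and the names `shannonEntropy`, `hasHighEntropyToken`, and `auditItems` are all hypothetical.

```typescript
const MIN_TOKEN_LENGTH = 20;
const DEFAULT_THRESHOLD = 4.5; // bits per character

// Shannon entropy in bits per character, from per-character frequencies.
function shannonEntropy(token: string): number {
  const counts = new Map<string, number>();
  for (let i = 0; i < token.length; i++) {
    const ch = token.charAt(i);
    counts.set(ch, (counts.get(ch) ?? 0) + 1);
  }
  let entropy = 0;
  counts.forEach((count) => {
    const p = count / token.length;
    entropy -= p * Math.log2(p);
  });
  return entropy;
}

// Assumption: tokens are whitespace-delimited runs; the real chunking may differ.
function hasHighEntropyToken(text: string, threshold: number = DEFAULT_THRESHOLD): boolean {
  return text
    .split(/\s+/)
    .filter((token) => token.length >= MIN_TOKEN_LENGTH)
    .some((token) => shannonEntropy(token) > threshold);
}

// Drop the whole item if any of its strings contains a high-entropy token.
function auditItems<T>(items: T[], stringsOf: (item: T) => string[]): T[] {
  return items.filter((item) => !stringsOf(item).some((s) => hasHighEntropyToken(s)));
}
```

Under this plain per-character formula, a token needs at least 23 distinct characters to exceed 4.5 bits, and a hex-only string can never exceed 4 bits, so this sketch would not flag hex digests. The real audit's handling of alphanumerics may behave differently.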

Tuning

The 4.5 threshold is conservative (will also reject some legitimate high-entropy text like UUIDs or hashes used as identifiers). Adjust via:

TITANX_LEARNING_ENTROPY_THRESHOLD=4.8  # looser: fewer strings exceed the threshold
TITANX_LEARNING_ENTROPY_THRESHOLD=4.2  # stricter: more strings are flagged and dropped

We recommend leaving the default in place unless you have a specific reason to change it.
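
For illustration, an override like this might be read with a guarded fallback to the default. The helper name `resolveEntropyThreshold` and the fallback behaviour are assumptions; only the variable name and the 4.5 default come from this page. The environment is passed in as a plain record so the function is easy to test.

```typescript
const DEFAULT_ENTROPY_THRESHOLD = 4.5;

// Hypothetical helper: resolves the threshold from an environment-like record.
function resolveEntropyThreshold(env: Record<string, string | undefined>): number {
  const raw = env["TITANX_LEARNING_ENTROPY_THRESHOLD"];
  if (raw === undefined || raw.trim() === "") return DEFAULT_ENTROPY_THRESHOLD;
  const parsed = Number(raw);
  // Reject NaN, infinities, and non-positive values instead of silently
  // accepting a setting that would make the audit meaningless.
  if (!Number.isFinite(parsed) || parsed <= 0) return DEFAULT_ENTROPY_THRESHOLD;
  return parsed;
}
```

Falling back to the default on malformed input means a typo in the variable cannot quietly disable the safety net.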

What drops vs. gets through

Drops (entire trajectory discarded)

String contains Anthropic/OpenAI/Google key patterns
String contains unredacted JWT
High-entropy token detected
Private key block detected
Contains -----BEGIN PGP or similar key armor

Passes through (after scrubbing)

Task descriptions with no secrets
Tool sequences (tool names + arg shapes, but arg values scrubbed for paths)
Result snippets (first 200 chars, post-scrub)
Memory summaries (text, post-scrub)
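
The "first 200 chars, post-scrub" rule for result snippets implies an ordering: scrub first, truncate second. The sketch below shows why that order matters. `toResultSnippet` and `scrubStripe` are hypothetical names, and the single Stripe pattern stands in for the full scrubber.

```typescript
// Hypothetical helper: `scrub` stands in for whatever scrubber is in use.
function toResultSnippet(
  raw: string,
  scrub: (s: string) => string,
  maxChars: number = 200
): string {
  // Scrub the full string first, then truncate, so a secret that straddles
  // the cut-off is still matched as a whole.
  return scrub(raw).slice(0, maxChars);
}

// Example scrubber using only the Stripe pattern from the patterns table.
const scrubStripe = (s: string): string =>
  s.replace(/(sk|pk|rk)_live_[\w]{24,}/g, "[REDACTED]");

// A key that begins just before the 200-character boundary.
const rawResult = "x".repeat(190) + " sk_live_" + "a".repeat(24);
const snippet = toResultSnippet(rawResult, scrubStripe);
```

If the string were truncated before scrubbing, the cut would leave a key prefix too short to match the pattern, and that prefix would survive in the snippet.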

What never lives in trajectories

By design, these never even enter the capture phase:

LLM provider API responses containing user-specific data beyond the turn
File contents (agents pass paths in tool calls; the scrubber keeps those paths but redacts the home-directory portion; the file bytes themselves never reach trajectories)
Environment variables (never in trajectories)
OAuth flow responses (these would be captured only if an agent explicitly printed them, which it should not do)

Rate limits + caps

Beyond content redaction:

500 trajectories per device per 24h (checkTrajectoryQuota on master)
500 KB max envelope size
100 trajectories max per envelope
50 memory summaries max per envelope

If a slave produces more, the lowest-ranked items (by success_score × usage_count) are dropped to fit the cap.
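
The ranking rule above can be sketched as a sort followed by a slice. The field names `successScore` and `usageCount` and the helper `trimToCap` are assumptions for illustration, and the sketch covers only the count caps, not the 500 KB size limit.

```typescript
// Hypothetical field names for the two ranking inputs.
interface Rankable {
  successScore: number;
  usageCount: number;
}

const MAX_TRAJECTORIES_PER_ENVELOPE = 100;

// Keep the `cap` highest-ranked items, ranked by successScore × usageCount.
function trimToCap<T extends Rankable>(items: T[], cap: number): T[] {
  return items
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => b.successScore * b.usageCount - a.successScore * a.usageCount)
    .slice(0, cap);
}

const candidates = [
  { id: "a", successScore: 0.9, usageCount: 1 },  // rank 0.9
  { id: "b", successScore: 0.5, usageCount: 10 }, // rank 5
  { id: "c", successScore: 0.2, usageCount: 2 },  // rank 0.4
];
const keptIds = trimToCap(candidates, 2).map((t) => t.id);
```

In this example the lowest-ranked item, `c`, is the one dropped; `b` and `a` are kept in that order.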

Master-side ingestion

Master writes received envelopes to fleet_learnings without additional scrubbing. Rationale: slaves are the authority on their own data; re-scrubbing on master adds no security (if it leaked, it leaked).

Master's fleet_learnings table is considered sensitive; access is controlled by the admin password and device keys. Encrypt any backups of master's DB.

Broadcast + apply

Consolidated learnings (output of the dream pass) are also never re-scrubbed. By the time they reach consolidated_learnings, they've been through redaction already. Broadcasts go as-is via the config bundle.

A slave that applies consolidated learnings is trusting master's output. If a slave operator doesn't trust master, Dream Mode fundamentally doesn't fit their threat model.

Never-logged policy

Specific rules for the redaction boundary:

activity_log never logs learning payloads verbatim. Only metadata: trajectory count, envelope size, push duration, dream pass stats.
console.log never prints trajectory contents in release builds. Dev builds may show metadata only.
Stack traces thrown during push/ingest include size/count only, not content.

Testing the redaction pipeline

Unit and integration tests in the main repo validate redaction:

src/process/utils/redaction.test.ts — unit tests per pattern
fleetLearning/index.test.ts — integration test that seeds a trajectory with sk-abc123, runs slave push, asserts the API key is not in the pushed payload

Run yourself:

bun run test -- redaction

All green means the known patterns are redacted as tested; it does not prove that novel formats are caught. If you add new secret formats, add matching patterns and tests.

Opting out (safe fallback)

If you can't accept Dream Mode's redaction-is-best-effort posture:

Per-device opt-out — slave operator disables Dream Mode in Settings → Fleet Learning
Org-wide opt-out — master sets fleet.learning.globalDisabled = true

With Dream Mode off, no envelope ever leaves the slave. Nothing is exported, so there is nothing to redact.
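
A minimal sketch of how the two opt-outs might combine into a single push gate. The `LearningSettings` shape and `shouldPushEnvelope` are hypothetical; only `fleet.learning.globalDisabled` is named on this page.

```typescript
// Hypothetical settings shape combining both opt-out switches.
interface LearningSettings {
  deviceEnabled: boolean;  // per-device toggle (Settings → Fleet Learning)
  globalDisabled: boolean; // mirrors fleet.learning.globalDisabled from master
}

// Either opt-out is sufficient to block the push.
function shouldPushEnvelope(settings: LearningSettings): boolean {
  return settings.deviceEnabled && !settings.globalDisabled;
}
```

Checking the gate before any capture or scrubbing work begins would keep the opted-out path free of content handling altogether.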

Specific risks

Risk: novel secret format slips through

Mitigation: entropy audit catches high-entropy strings even without a pattern match. Periodically review consolidated_learnings for unexpected content.

Risk: redacted value still identifies the user indirectly

E.g., a slave's task description of "Fix bug in my-internal-project-X" redacts nothing but tells master which internal project the user was working on. Mitigation: workspace scoping — workspaces with sensitive project names should have workspace_id set explicitly so consolidation stays local.

Risk: master is compromised

Mitigation: out of scope. The threat model assumes master is the trusted root. If you don't trust your master, don't run a fleet.

Risk: redaction bug in a new release

Mitigation: run the test suite in CI. Release notes call out any redaction-pattern changes. Operators should deploy material releases to pre-production first.

Audit

Every push logs (not content, but metadata):

Slave-side: fleet.learning.pushed {trajectoryCount, bytes, duration}
Master-side: fleet.learning.ingested {deviceId, counts}
Scrubber drops: fleet.learning.scrubbed {drops: N, reason: 'high_entropy|pattern_match|...'}

High drop rates are a signal — either the scrubber is over-aggressive, or there's a lot of sensitive data being captured. Either way, investigate.
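
One way to turn these events into a drop-rate signal is sketched below. The payload fields come from the event list above, but the interface names, the `dropRate` helper, and the assumption that `trajectoryCount` counts only items that survived scrubbing are illustrative.

```typescript
// `reason` is left as a plain string because the full set of values is not
// listed on this page.
interface PushedEvent {
  trajectoryCount: number;
  bytes: number;
  duration: number;
}
interface ScrubbedEvent {
  drops: number;
  reason: string;
}

// Assumption: trajectoryCount counts items that survived scrubbing, so
// captured = pushed + dropped.
function dropRate(pushed: PushedEvent, scrubbed: ScrubbedEvent[]): number {
  const dropped = scrubbed.reduce((sum, event) => sum + event.drops, 0);
  const captured = pushed.trajectoryCount + dropped;
  return captured === 0 ? 0 : dropped / captured;
}

// 10 dropped out of 100 captured.
const rate = dropRate(
  { trajectoryCount: 90, bytes: 120000, duration: 850 },
  [
    { drops: 6, reason: "high_entropy" },
    { drops: 4, reason: "pattern_match" },
  ]
);
```

Tracking this ratio per device over time makes it easier to tell an over-aggressive scrubber (high rate everywhere) from one device capturing unusually sensitive data.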

Compliance notes

No PII in trajectories by design — agents shouldn't be processing PII that ends up in logs
No HIPAA or GDPR certification — TitanX provides primitives; certification is an organizational exercise
Workspace isolation is the primary tenant-separation mechanism — data from workspace A never enters workspace B's consolidation

See Compliance and Data Residency.

Related pages

Dream Mode Overview — the big picture
Dream Pass Internals — the pipeline details
Security Model — broader security posture
Enabling Dream Mode — turning it on safely
Source: src/process/utils/redaction.ts

TitanX · Enterprise AI Agent Orchestration · Apache-2.0

Last updated for v2.5.1