Privacy and Redaction
Dream Mode's trust model depends on nothing sensitive leaving the slave. This page documents exactly what's redacted, how, the audit mechanism, and the privacy guarantees (plus their limits).
When Dream Mode is enabled on a slave, an envelope is pushed to master roughly every 2 hours. It contains:
| Field | Contents | How it's protected |
|---|---|---|
| `trajectories[]` | Task descriptions + tool sequences + result snippets | Deep-scrubbed + entropy-audited |
| `memorySummaries[]` | Agent memory summary text | Deep-scrubbed + entropy-audited |
| `consumptionFeedback[]` | Trajectory hashes + usage counts | Hashes only, no content |
Every string in trajectories + summaries goes through two rounds of scrubbing:
- Pattern-based substitution
- Shannon-entropy audit (catches what patterns miss)
Only after both rounds does anything leave the slave.
deepScrubForExport() walks every string field and replaces known-secret patterns with [REDACTED].
| Pattern class | Examples | Match |
|---|---|---|
| Anthropic API key | `sk-ant-api03-...` | `/sk-ant-api\d+-[\w-]+/` |
| OpenAI API key | `sk-...`, `sk-proj-...` | `/sk-[\w-]{48,}/` |
| Google API key | `AIza...` | `/AIza[\w-]{35}/` |
| AWS access key | `AKIA...`, `ASIA...` | `/A(KIA\|SIA)[A-Z0-9]{16}/` |
| GitHub PAT | `ghp_...`, `gho_...`, `ghs_...` | `/gh[opsu]_[\w]{36}/` |
| GitHub fine-grained | `github_pat_...` | `/github_pat_[\w]{82}/` |
| Slack tokens | `xoxb-...`, `xoxp-...`, `xoxa-...` | `/xox[abpr]-[\w-]+/` |
| Stripe | `sk_live_...`, `pk_live_...`, `rk_live_...` | `/(sk\|pk\|rk)_live_[\w]{24,}/` |
| Discord bot | `Bot [base64]` | `/Bot\s+[\w.-]+/` |
| JWT | `eyJ...three-parts...` | `/eyJ[\w-]+\.[\w-]+\.[\w-]+/` |
| Home paths | `/Users/*`, `/home/*`, `C:\Users\*` | Replace with `/Users/[REDACTED]/...` |
| Email addresses | `foo@bar.com` | `/[\w.+-]+@[\w-]+\.[\w.-]+/` → `[REDACTED_EMAIL]` |
| Private keys | `-----BEGIN ... PRIVATE KEY-----` blocks | Full block replaced |
| IPv4 addresses | `192.168.*` | Partially redacted: `192.168.x.x` |
Canonical implementation: src/process/utils/redaction.ts.
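As a rough illustration of the recursive walk described above, here is a minimal sketch. It is not the code in `src/process/utils/redaction.ts`: the names `Json`, `scrubString`, and `deepScrubSketch` are hypothetical, only a handful of the table's patterns are included, and the home-path regex is an assumption written for this example.

```typescript
// Hypothetical sketch; the real rules live in src/process/utils/redaction.ts.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// A small subset of the patterns from the table above.
const SECRET_PATTERNS: RegExp[] = [
  /sk-ant-api\d+-[\w-]+/g,      // Anthropic API key
  /AIza[\w-]{35}/g,             // Google API key
  /A(KIA|SIA)[A-Z0-9]{16}/g,    // AWS access key
  /eyJ[\w-]+\.[\w-]+\.[\w-]+/g, // JWT
];
// Assumed regex: keeps the path shape, hides the user directory name.
const HOME_PATTERN = /(\/Users\/|\/home\/)[^\/\s]+/g;
const EMAIL_PATTERN = /[\w.+-]+@[\w-]+\.[\w.-]+/g;

function scrubString(input: string): string {
  let out = input;
  for (const pattern of SECRET_PATTERNS) {
    out = out.replace(pattern, "[REDACTED]");
  }
  out = out.replace(HOME_PATTERN, "$1[REDACTED]");
  return out.replace(EMAIL_PATTERN, "[REDACTED_EMAIL]");
}

// Walks arrays and objects recursively and scrubs every string value.
function deepScrubSketch(value: Json): Json {
  if (typeof value === "string") return scrubString(value);
  if (Array.isArray(value)) return value.map((child) => deepScrubSketch(child));
  if (value !== null && typeof value === "object") {
    const result: { [key: string]: Json } = {};
    for (const key of Object.keys(value)) {
      result[key] = deepScrubSketch(value[key]);
    }
    return result;
  }
  return value; // numbers, booleans, and null pass through unchanged
}
```

The sketch returns a new structure instead of mutating its input, so the unscrubbed original stays available to local code while only the scrubbed copy is handed to the export path.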
Pattern matching misses novel secret formats (new providers, custom tokens, base64-encoded secrets that don't match known patterns). The entropy audit is the safety net.
For every remaining string after pattern scrubbing:
- Chunk into tokens ≥ 20 characters
- Compute Shannon entropy per token
- If entropy > 4.5 bits per character (for alphanumerics), flag as high-entropy
- Drop the entire containing item (entire trajectory or entire summary)
Rationale: high-entropy long strings are overwhelmingly likely to be secrets (API keys, hashes, nonces). Rather than try to mask inline, we drop the whole trajectory — losing one learning is better than leaking one secret.
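The audit steps above can be sketched as follows. This is an illustration under stated assumptions, not the shipped implementation: the function names are hypothetical, "tokens" are assumed to be whitespace-delimited runs, and entropy is computed from each token's own character frequencies.

```typescript
// Shannon entropy in bits per character, from the token's own character frequencies.
function shannonEntropy(token: string): number {
  const counts: Record<string, number> = {};
  for (let i = 0; i < token.length; i++) {
    const ch = token.charAt(i);
    counts[ch] = (counts[ch] ?? 0) + 1;
  }
  let entropy = 0;
  for (const ch of Object.keys(counts)) {
    const p = counts[ch] / token.length;
    entropy -= p * Math.log2(p);
  }
  return entropy;
}

// Assumption: tokens are whitespace-delimited runs of at least minLength characters.
function hasHighEntropyToken(text: string, threshold = 4.5, minLength = 20): boolean {
  return text
    .split(/\s+/)
    .filter((token) => token.length >= minLength)
    .some((token) => shannonEntropy(token) > threshold);
}

// Drops every item that has at least one flagged string; nothing is masked inline.
function entropyAudit<T>(
  items: T[],
  stringsOf: (item: T) => string[],
  threshold = 4.5,
): T[] {
  return items.filter(
    (item) => !stringsOf(item).some((s) => hasHighEntropyToken(s, threshold)),
  );
}
```

One consequence of this particular entropy definition: a token of n characters can reach at most log2(n) bits per character, so only tokens of 23 or more characters can exceed 4.5. The real audit may measure entropy differently.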
The 4.5 threshold is conservative (will also reject some legitimate high-entropy text like UUIDs or hashes used as identifiers). Adjust via:
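TITANX_LEARNING_ENTROPY_THRESHOLD=4.8 # looser: fewer tokens exceed the threshold, so fewer drops
TITANX_LEARNING_ENTROPY_THRESHOLD=4.2 # stricter: more tokens exceed the threshold, so more drops
We recommend not changing it unless you have a specific reason.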
What gets blocked:
- String contains Anthropic/OpenAI/Google key patterns
- String contains unredacted JWT
- High-entropy token detected
- Private key block detected
- Contains `-----BEGIN PGP` or similar key armor

What gets through:
- Task descriptions with no secrets
- Tool sequences (tool names + arg shapes, but arg values scrubbed for paths)
- Result snippets (first 200 chars, post-scrub)
- Memory summaries (text, post-scrub)
By design, these never even enter the capture phase:
- LLM provider API responses containing user-specific data beyond the turn
- File contents (agents pass paths in tool calls; the scrubber preserves paths but loses home-dir info; actual file bytes never reach trajectories)
- Environment variables (never in trajectories)
- OAuth flow responses (captured only if explicitly printed, which they shouldn't be)
Beyond content redaction:
- 500 trajectories per device per 24h (`checkTrajectoryQuota` on master)
- 500 KB max envelope size
- 100 trajectories max per envelope
- 50 memory summaries max per envelope
If a slave produces more, the lowest-ranked items (by success_score × usage_count) are dropped to fit the cap.
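A minimal sketch of the count-based trimming, assuming a hypothetical item shape (`RankedItem`) and helper name (`fitToCap`). Only the ranking formula and the two per-envelope caps come from the text above. Tie-breaking and how the 500 KB size cap is enforced are not specified here, so the sketch leaves them out.

```typescript
// Hypothetical item shape; only the two ranking fields come from the text above.
interface RankedItem {
  id: string;
  successScore: number; // success_score
  usageCount: number;   // usage_count
}

const MAX_TRAJECTORIES_PER_ENVELOPE = 100;
const MAX_SUMMARIES_PER_ENVELOPE = 50;

// Keeps the `cap` highest-ranked items, where rank = success_score × usage_count.
function fitToCap<T extends RankedItem>(items: T[], cap: number): T[] {
  if (items.length <= cap) return items.slice();
  return items
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => b.successScore * b.usageCount - a.successScore * a.usageCount)
    .slice(0, cap);
}
```

Under these assumptions a slave would call `fitToCap(trajectories, MAX_TRAJECTORIES_PER_ENVELOPE)` just before building the envelope.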
Master writes received envelopes to fleet_learnings without additional scrubbing. Rationale: slaves are the authority on their own data; re-scrubbing on master adds no security (if it leaked, it leaked).
Master's fleet_learnings table is considered sensitive; access is controlled by the admin password + device keys. Back up your master's DB in encrypted form.
Consolidated learnings (output of the dream pass) are also never re-scrubbed. By the time they reach consolidated_learnings, they've been through redaction already. Broadcasts go as-is via the config bundle.
Slaves receiving consolidated learnings: the slave operator should trust master's output. If a slave operator doesn't trust master, Dream Mode fundamentally doesn't fit their threat model.
Specific rules for the redaction boundary:
- `activity_log` never logs learning payloads verbatim. Only metadata: trajectory count, envelope size, push duration, dream pass stats.
- `console.log` never prints trajectory contents in release builds. Dev builds may show metadata only.
- Stack traces thrown during push/ingest include size/count only, not content.
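One way to follow these logging rules is to derive a metadata-only record before anything is logged or thrown. The sketch below is illustrative: `EnvelopeLike`, `PushMetadata`, `toPushMetadata`, and `PushError` are hypothetical names, not TitanX APIs.

```typescript
// Hypothetical envelope shape for illustration.
interface EnvelopeLike {
  trajectories: unknown[];
  memorySummaries: unknown[];
}

interface PushMetadata {
  trajectoryCount: number;
  summaryCount: number;
  bytes: number;
  durationMs: number;
}

// Derives loggable metadata; no string from the envelope is copied into the result.
function toPushMetadata(
  envelope: EnvelopeLike,
  bytes: number,
  durationMs: number,
): PushMetadata {
  return {
    trajectoryCount: envelope.trajectories.length,
    summaryCount: envelope.memorySummaries.length,
    bytes,
    durationMs,
  };
}

// Error whose message carries counts and sizes only, so stack traces stay content-free.
class PushError extends Error {
  constructor(stage: "push" | "ingest", meta: PushMetadata) {
    super(`${stage} failed: ${meta.trajectoryCount} trajectories, ${meta.bytes} bytes`);
    this.name = "PushError";
  }
}
```

Because the error is built from `PushMetadata` and never from the envelope itself, a stack trace printed during push or ingest cannot contain trajectory text.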
Integration tests in the main repo validate the redaction:
- `src/process/utils/redaction.test.ts`: unit tests per pattern
- `fleetLearning/index.test.ts`: integration test that seeds a trajectory with `sk-abc123`, runs a slave push, and asserts the API key is not in the pushed payload
Run yourself:
`bun run test -- redaction`

All green = redaction is working. If you add new secret formats, add new tests + patterns.
If you can't accept Dream Mode's redaction-is-best-effort posture:
- Per-device opt-out: slave operator disables Dream Mode in Settings → Fleet Learning
- Org-wide opt-out: master sets `fleet.learning.globalDisabled = true`
Dream Mode off = no envelope ever leaves the slave. No content redaction challenge exists because no content is attempted.
Known limits of the guarantees above:

Novel secret formats that no pattern matches. Mitigation: the entropy audit catches high-entropy strings even without a pattern match. Periodically review consolidated_learnings for unexpected content.

Sensitive context in text that contains no secret. E.g., a slave's task description of "Fix bug in my-internal-project-X" redacts nothing but tells master which internal project the user was working on. Mitigation: workspace scoping. Workspaces with sensitive project names should have workspace_id set explicitly so consolidation stays local.

An untrusted or compromised master. Mitigation: out of scope. The threat model assumes master is the trusted root. If you don't trust your master, don't run a fleet.

Redaction regressions in a new release. Mitigation: run the test suite in CI. Release notes call out any redaction-pattern changes. Operators should deploy to pre-prod first for material releases.
Every push logs (not content, but metadata):
- Slave-side: `fleet.learning.pushed {trajectoryCount, bytes, duration}`
- Master-side: `fleet.learning.ingested {deviceId, counts}`
- Scrubber drops: `fleet.learning.scrubbed {drops: N, reason: 'high_entropy|pattern_match|...'}`
High drop rates are a signal — either the scrubber is over-aggressive, or there's a lot of sensitive data being captured. Either way, investigate.
- No PII in trajectories by design — agents shouldn't be processing PII that ends up in logs
- No HIPAA or GDPR certification — TitanX provides primitives; certification is an organizational exercise
- Workspace isolation is the primary tenant-separation mechanism — data from workspace A never enters workspace B's consolidation
See Compliance and Data Residency.
- Dream Mode Overview — the big picture
- Dream Pass Internals — the pipeline details
- Security Model — broader security posture
- Enabling Dream Mode — turning it on safely
- Source: `src/process/utils/redaction.ts`
TitanX · Enterprise AI Agent Orchestration · Apache-2.0
Last updated for v2.5.1 — report doc issue · contribute to the wiki