Issue: #1341
This document describes how WhatsApp Web data captured by the desktop scanner becomes available to the agent. It exists to clear up the most common confusion: there are two local storage paths and they are intentional, not duplicates — each backs a different agent capability.

                  ┌────────────────────────┐
                  │ WhatsApp Web (CEF view)│
                  └────────────┬───────────┘
                               │ CDP scan tick
                               ▼
            ┌────────────────────────────────────┐
            │ app/src-tauri/src/whatsapp_scanner │
            │      (DOM + IndexedDB merge)       │
            └─────┬───────────────────────┬──────┘
                  │ exact rows            │ canonicalised transcript
                  ▼                       ▼
    ┌──────────────────────┐   ┌──────────────────────────┐
    │ openhuman.whatsapp_  │   │ openhuman.memory_doc_    │
    │ data_ingest          │   │ ingest                   │
    │ (internal-only RPC)  │   │ (internal-only RPC)      │
    └──────────┬───────────┘   └─────────────┬────────────┘
               ▼                             ▼
    ┌──────────────────────┐   ┌──────────────────────────┐
    │ whatsapp_data.db     │   │ memory tree              │
    │ (SQLite, per-account)│   │ (per-source summaries +  │
    │   - wa_chats         │   │  embeddings)             │
    │   - wa_messages      │   │                          │
    └──────────┬───────────┘   └─────────────┬────────────┘
               ▼                             ▼
    ┌──────────────────────┐   ┌──────────────────────────┐
    │ Agent tools          │   │ Agent tools              │
    │ whatsapp_data_*      │   │ memory_tree_*            │
    │ (exact lookup)       │   │ (semantic / cross-src)   │
    └──────────────────────┘   └──────────────────────────┘

Both ingest endpoints fire on every scan tick; both are tokio::spawn fire-and-forget so the scanner never blocks on either HTTP call.
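
The fire-and-forget dispatch can be pictured with the minimal sketch below. It is an illustration under stated assumptions, not the scanner's actual code: `std::thread::spawn` stands in for `tokio::spawn` so the example needs no external crates, and `IngestPath`, `on_scan_tick` and `run_one_tick` are hypothetical names.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

/// Hypothetical labels for the two ingest RPCs described in this document.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum IngestPath {
    WhatsappData,
    MemoryDoc,
}

/// One scan tick: start both ingests on detached workers and return at once.
/// ASSUMPTION: `std::thread::spawn` stands in for `tokio::spawn`. In both
/// cases, dropping the returned handle detaches the work instead of awaiting it.
pub fn on_scan_tick(tick: u64, done: &Sender<(u64, IngestPath)>) {
    for path in [IngestPath::WhatsappData, IngestPath::MemoryDoc] {
        let done = done.clone();
        // The handle is dropped immediately, so nothing ever joins this worker.
        let _ = thread::spawn(move || {
            // A real implementation would issue the HTTP call here.
            // A failed send only means nobody is listening, so it is ignored.
            let _ = done.send((tick, path));
        });
    }
}

/// Demo helper: run one tick and collect the paths that reported completion.
pub fn run_one_tick(tick: u64) -> Vec<IngestPath> {
    let (tx, rx): (Sender<_>, Receiver<_>) = channel();
    on_scan_tick(tick, &tx);
    drop(tx); // close our end so `rx` finishes once both workers are done
    let mut paths: Vec<IngestPath> = rx.iter().map(|(_, p)| p).collect();
    paths.sort_by_key(|p| *p as u8); // completion order is not deterministic
    paths
}
```

The channel exists only so the demo can observe completion. The point of the pattern is that `on_scan_tick` itself never waits on either worker.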
| Path | Backing store | Strength | Use it for |
|---|---|---|---|
| Direct | whatsapp_data.db (SQLite) | Exact, structured, paginated | "List my WhatsApp chats", "show the last 50 messages with Alice", "search for invoice across WhatsApp" |
| Memory tree | Per-source memory tree + embeddings | Semantic, cross-source | "Summarise this week of WhatsApp", "find action items across email and WhatsApp", "what did the team agree on?" |
The same scan tick populates both stores. Idempotency keys make the dual-write safe to retry:
- `whatsapp_data_ingest` keys on `(account_id, chat_id, message_id)` and performs an UPSERT.
- `memory_doc_ingest` keys on `(namespace, key)`, where namespace is `whatsapp-web:<account_id>` and key is `<chat_id>:<day>`. This is also an UPSERT.
If one path fails (network blip, store init race), the other still progresses. The next scan tick converges both stores.
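
The sketch below shows why keyed UPSERTs make a repeated or partially failed tick harmless. It is a hedged illustration only: `HashMap`s stand in for the SQLite database and the memory tree, `Stores`, `doc_namespace`, `doc_key` and `replay_demo` are hypothetical names, and the `YYYY-MM-DD` day format is an assumption.

```rust
use std::collections::HashMap;

/// Key for the direct path: (account_id, chat_id, message_id).
type MessageKey = (String, String, String);
/// Key for the memory path: (namespace, key).
type DocKey = (String, String);

/// In-memory stand-ins for the two stores (hypothetical, illustration only).
#[derive(Default)]
pub struct Stores {
    pub messages: HashMap<MessageKey, String>,
    pub docs: HashMap<DocKey, String>,
}

/// Namespace has the form `whatsapp-web:<account_id>`.
pub fn doc_namespace(account_id: &str) -> String {
    format!("whatsapp-web:{account_id}")
}

/// Key has the form `<chat_id>:<day>`. ASSUMPTION: day is `YYYY-MM-DD`.
pub fn doc_key(chat_id: &str, day: &str) -> String {
    format!("{chat_id}:{day}")
}

impl Stores {
    /// UPSERT: writing the same key again overwrites instead of duplicating.
    pub fn upsert_message(&mut self, account: &str, chat: &str, msg_id: &str, body: &str) {
        let key = (account.to_string(), chat.to_string(), msg_id.to_string());
        self.messages.insert(key, body.to_string());
    }

    /// UPSERT of one chat-day transcript under (namespace, key).
    pub fn upsert_doc(&mut self, account: &str, chat: &str, day: &str, transcript: &str) {
        let key = (doc_namespace(account), doc_key(chat, day));
        self.docs.insert(key, transcript.to_string());
    }
}

/// Replays the same data over two ticks, with the memory write missing from
/// the first tick to mimic a failure. Returns (message rows, doc rows).
pub fn replay_demo() -> (usize, usize) {
    let mut stores = Stores::default();
    // Tick 1: the direct write lands, the memory write is lost.
    stores.upsert_message("acct1", "chatA", "m1", "hello");
    // Tick 2: both writes run. The repeated message write overwrites in place.
    stores.upsert_message("acct1", "chatA", "m1", "hello");
    stores.upsert_doc("acct1", "chatA", "2024-01-01", "Alice: hello");
    (stores.messages.len(), stores.docs.len())
}
```

After the second tick each store holds exactly one row, which is the convergence behaviour described above.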
The scanner write-path RPCs are registered as internal-only in src/core/all.rs under build_internal_only_controllers. They are reachable over JSON-RPC but invisible to the agent's tool catalog and to schema discovery (all_controller_schemas). The agent has no way to call whatsapp_data_ingest or memory_doc_ingest — accidentally or otherwise.
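
One way to picture the split between "reachable over JSON-RPC" and "visible to discovery" is a registry flag that only the discovery path filters on. The sketch below is an assumption-laden illustration: the `Controller` struct, its `internal_only` field and the helper functions are hypothetical, the real types in src/core/all.rs may look different, and the read-side RPC name is assumed by analogy with the ingest names.

```rust
/// Hypothetical controller descriptor (not the project's real type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Controller {
    pub method: &'static str,
    pub internal_only: bool,
}

/// Every registered controller, internal or not.
pub fn all_controllers() -> Vec<Controller> {
    vec![
        Controller { method: "openhuman.whatsapp_data_ingest", internal_only: true },
        Controller { method: "openhuman.memory_doc_ingest", internal_only: true },
        // ASSUMPTION: read-side RPC name, inferred from the tool name.
        Controller { method: "openhuman.whatsapp_data_list_chats", internal_only: false },
    ]
}

/// JSON-RPC dispatch consults the full registry, so the scanner can call ingest.
pub fn is_dispatchable(method: &str) -> bool {
    all_controllers().iter().any(|c| c.method == method)
}

/// Schema discovery, and therefore the tool catalog, skips internal-only entries.
pub fn discoverable_methods() -> Vec<&'static str> {
    all_controllers()
        .into_iter()
        .filter(|c| !c.internal_only)
        .map(|c| c.method)
        .collect()
}
```

With this shape, an ingest method passes `is_dispatchable` but never appears in `discoverable_methods`, so nothing built from discovery can offer it to the agent.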
The agent surfaces are exclusively read-only:
- `src/openhuman/tools/impl/whatsapp_data/`: `whatsapp_data_list_chats`, `whatsapp_data_list_messages`, `whatsapp_data_search_messages`. All three wrap their RPC counterparts and emit a `"provider": "whatsapp"` tag in the response so the agent can cite WhatsApp as the source.
- `src/openhuman/tools/impl/memory/tree/`: generic `memory_tree_*` tools. Filter by `source_kind: "chat"` or query directly; WhatsApp chat-day transcripts are tagged `whatsapp` so they surface in cross-source flows.
The orchestrator's agent.toml exposes the three direct WhatsApp tools alongside the generic memory_tree_* family. That choice is deliberate — adding more provider-specific tools would compete with the memory-tree tools for the same intents and fragment routing. The combination satisfies the three real shapes of WhatsApp request:
- Exact lookup ("what was my last message with Bob") → `whatsapp_data_list_messages` after `whatsapp_data_list_chats`.
- Keyword search ("did anyone mention `Q3` on WhatsApp") → `whatsapp_data_search_messages`.
- Summarisation / action items / cross-source ("what came up across WhatsApp and email this week") → `memory_tree_query_source { source_kind: "chat" }` or `memory_tree_query_global`.
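
The mapping above can be written down as a small lookup, shown here only to make the three shapes concrete. In practice the agent chooses tools from the catalog rather than through a hard-coded match, and `RequestShape` and `tools_for` are hypothetical names introduced for this sketch.

```rust
/// The three request shapes listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RequestShape {
    ExactLookup,
    KeywordSearch,
    SummaryOrCrossSource,
}

/// Candidate tool names per shape. For `ExactLookup` the order matters
/// (find the chat first, then list its messages). For
/// `SummaryOrCrossSource` the two entries are alternatives.
pub fn tools_for(shape: RequestShape) -> &'static [&'static str] {
    match shape {
        RequestShape::ExactLookup => {
            &["whatsapp_data_list_chats", "whatsapp_data_list_messages"]
        }
        RequestShape::KeywordSearch => &["whatsapp_data_search_messages"],
        RequestShape::SummaryOrCrossSource => {
            &["memory_tree_query_source", "memory_tree_query_global"]
        }
    }
}
```

Every shape resolves to an existing tool, which is the argument for not adding further provider-specific tools.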
If a future intent doesn't fit any of these, the right move is usually a new memory-tree retrieval primitive, not a new provider-specific tool.
Prior to #1341 the read-only RPC controllers existed and were callable over JSON-RPC, but no Tool impl wrapped them and the orchestrator didn't list them — so the agent could see WhatsApp data only through the memory tree. That worked for summaries but failed on exact-lookup intents because the memory tree's per-day transcript granularity loses the structure the user asks about (sender JID, exact chat_id, per-message timestamp). Adding the three direct tools closed that gap without adding any new ingest path.