Behaviour-accurate Elixir port of Baileys 7.00rc9 — a WhatsApp Web API library. The goal is a drop-in replacement for Elixir apps currently using Baileys (Node.js) as a sidecar. Same wire behaviour, same protocol semantics, idiomatic Elixir implementation. Targets Elixir 1.19+/OTP 28.
Rust NIFs are currently used for Noise protocol (snow) and the narrow XEdDSA
helper (curve25519-dalek). Signal protocol is pure Elixir.
Reference source: dev/reference/Baileys-master/ (pinned at 7.00rc9)
Do not deliberate about what to implement or how the protocol should behave.
Baileys 7.00rc9 (dev/reference/Baileys-master/) is the authoritative reference for
all wire behaviour, protocol semantics, message formats, handshake flows, feature
scope, and public compatibility promises. When you are unsure what to do:
- Read the Baileys source. Find the corresponding TypeScript file and understand what it does.
- Port the behaviour faithfully. Match the observable behaviour — same messages on the wire, same protocol flows, same error handling semantics.
- Implement idiomatically in Elixir. Use BEAM patterns (processes, supervisors, ETS, pattern matching) instead of JS patterns (callbacks, promises, mutexes). The what comes from Baileys; the how is SOTA Elixir.
- Do not invent new behaviour. If Baileys doesn't do it, neither do we (yet). If Baileys does do it, we must too.
- Do not cargo-cult JS internals. Preserve JS implementation details only when they are part of an observable contract users or peers can depend on.
This means: no asking "should we support X?" — check Baileys. No asking "how should
this handshake work?" — read the Baileys handshake code. No designing from scratch
when a working reference exists 50 feet away in dev/reference/.
| Resource | Path | Purpose |
|---|---|---|
| Phase plans | dev/implementation_plan/01-foundation.md … 12-polish.md |
Detailed specs per phase |
| Overview | dev/implementation_plan/00-overview.md |
Architecture, dependency graph |
| Agent rules | dev/implementation_plan/CLAUDE.md |
Phase workflow, native-first policy |
| Progress | dev/implementation_plan/PROGRESS.md |
Task/file/acceptance-criteria tracker |
| Reference source | dev/reference/Baileys-master/ |
Authoritative Baileys rc.9 behavior and wire semantics |
| Directory | Purpose |
|---|---|
src/Socket/ |
Layered socket composition — connection lifecycle, message send/recv, groups, chats, business, newsletters, communities |
src/Signal/ |
Signal Protocol wrapper (libsignal.ts), LID-to-phone mapping, group sender keys |
src/Utils/ |
27 utility files — crypto, noise handler, media processing, auth, event buffering, mutexes, retry, LTHash, pre-key mgmt |
src/WABinary/ |
Binary XMPP-style node encoding/decoding, JID utilities, protocol constants |
src/WAProto/ |
Protocol Buffer definitions for all WhatsApp message types |
src/Types/ |
TypeScript type definitions (auth, messages, events, groups) |
src/WAM/ |
WhatsApp Analytics/Metrics |
src/WAUSync/ |
User sync protocol |
src/Defaults/ |
Default configuration values |
Each layer wraps the previous and extends it:
makeSocket— base WebSocket lifecycle, Noise handshake, query/sendNode primitivesmakeNewsletterSocket— newsletter subscriptionsmakeChatsSocket— privacy settings, presence, app state syncmakeMessagesSocket— message relay, Signal encryption, device discoverymakeMessagesRecvSocket— message decryption, retry logic, receipt handlingmakeGroupsSocket— group CRUD, participant managementmakeBusinessSocket— business profiles, catalogsmakeCommunitySocket— community creation/linking
Final export makeWASocket returns the fully composed socket object.
Noise Protocol (Transport Security)
- Establishes encrypted transport over WebSocket before any application-level communication
- Curve25519 ECDH key exchange, AES-256-GCM frame encryption, HKDF key derivation, SHA-256 hashing
- Handshake: client ephemeral key → server ephemeral + encrypted static + encrypted cert → derive transport keys
- Post-handshake: separate read/write counters for IV generation; all frames Noise-encrypted
- Implementation:
src/Utils/noise-handler.ts
Signal Protocol (E2E Encryption)
- Per-device message encryption (each recipient device gets separately encrypted copy)
- Two message types: pre-key whisper (session establishment) and standard whisper
- Group messaging via Sender Keys (one encryption op, distribution messages enable decryption)
- Session management: create/validate/store/delete sessions, detect identity key changes
- LID mapping: translates phone numbers ↔ logical device IDs for multi-device protocol
- Implementation:
src/Signal/libsignal.ts,src/Signal/lid-mapping.ts
Binary XMPP (Wire Format)
- WhatsApp's proprietary compact binary encoding of XMPP-style nodes (tag, attributes, children/content)
encodeBinaryNode()/decodeBinaryNode()for serialization- JID utilities for Jabber ID format (user@server, device IDs, LID/PN variants)
- Implementation:
src/WABinary/
Protocol Buffers (Message Serialization)
protobufjsfor all message content (text, media refs, reactions, polls, etc.)- Structures content before Signal encryption
- Implementation:
src/WAProto/
App State Sync ("Syncd")
- Versioned state snapshots with CRC verification and LTHash
- Syncs mute/pin/star/archive across devices via patches
- Implementation:
src/Utils/lt-hash.ts, chats socket layer
Message Types: text, images, video/GIF, audio/voice notes, documents, stickers, contacts/vCards, location, live location, reactions, polls, events, lists, buttons, orders, products, native flows, group invites, URL previews
Message Pipeline: generateWAMessage() → prepareWAMessageMedia() (media) →
relayMessage() → device discovery → per-device Signal encryption → binary encoding →
Noise-encrypted WebSocket write
Message Receiving: queue-based pipeline (online/offline paths), decryption within mutex, auto-retry on pre-key errors, receipt handling (delivery ACKs, read receipts)
Media Handling:
- Streaming architecture (never loads entire buffer into memory)
- AES-256-CBC with random 32-byte media key, HKDF-expanded to 112 bytes (IV + cipher key + MAC key + ref key)
- Type-specific HKDF info strings (
"WhatsApp Image Keys","WhatsApp Audio Keys", etc.) - Upload to WhatsApp CDN via authenticated HTTP; download with encrypted blob decryption
- HMAC-SHA256 authentication (10-byte truncated digest)
Group Management: create, update subject/description, leave, add/remove/promote/demote participants, invite codes (v3/v4), join approval, ephemeral messages, announcement mode
Other: newsletter subscriptions, business profiles/catalogs, community management, presence updates, privacy settings, call events, link previews
Auth State (AuthenticationCreds): noise key pair, pairing ephemeral key pair, ADV
secret key, identity keys, signed pre-key, account/device identity, signal identities,
pre-key tracking, sync timestamps, platform info, pairing code, routing info
Signal Key Store: pre-keys, sessions, sender-keys, sender-key memory, app-state-sync-keys/versions, LID mappings, device lists, identity keys, tokens
QR Code Pairing: server sends QR challenge → client renders → user scans with WhatsApp mobile → multi-device identity established
Phone Number Pairing: requestPairingCode(phoneNumber) → returns numeric code →
user enters in WhatsApp mobile → PBKDF2 (131,072 iterations, SHA-256)
Credential Persistence: each credential type in separate files, signal sessions update on every message (must persist immediately), in-memory caching layer, atomic multi-key read/write with retry
- Connects to
wss://web.whatsapp.comwith routing info as base64 query param - Noise handshake on connect → derive encryption keys → send registration/login node
- Request/response: auto-generated message tags, timeout-based response matching
- Keep-alive: configurable ping interval, disconnect if elapsed > interval + 5s
- Connection states:
connecting→open→close(auto-reconnect unless logged out) - Event buffering: defers non-critical notifications until offline processing completes
- Pre-key upload: minimum interval, concurrent prevention, exponential backoff (3 retries)
- Concurrency: separate mutexes for messages, app state patches, receipts, notifications
Philosophy: use Elixir/Erlang ecosystem first. Erlang :crypto (OTP 28) handles
all cryptographic primitives natively. Rust NIFs are reserved exclusively for complex
protocol crates that have no Elixir/Erlang equivalent.
Erlang :crypto handles (NO Rust NIF):
AES-256-GCM/CBC/CTR, HMAC-SHA256/512, SHA-256, MD5, PBKDF2, Curve25519 ECDH,
Ed25519 sign/verify, random bytes. HKDF implemented in pure Elixir using :crypto.mac/4.
Rust NIFs (Rustler ~> 0.37) — only for crates with no native equivalent:
| Rust Crate | Used For | Why NIF? |
|---|---|---|
snow |
Noise Protocol Framework | No Elixir/Erlang implementation exists |
curve25519-dalek |
XEdDSA sign/verify (~80 lines Rust) | Montgomery↔Edwards key conversion — no native equivalent |
Signal Protocol: Phase 5 implementation in progress The target is a Baileys-compatible Signal boundary in Elixir, with the smallest native surface justified by correctness, interoperability, and measured cost. Do not assume a broad Signal NIF and do not assume a full pure-Elixir ratchet implementation until Phase 5 finalizes that boundary.
What stays in Elixir: connection management, event handling, state machines,
supervision trees, caching, business logic, message routing, all crypto primitives,
Signal protocol, binary encoding/decoding, protobuf (via protox) — everything that
benefits from BEAM concurrency, fault tolerance, or has native support.
| Baileys Dep | Role | BaileysEx Approach |
|---|---|---|
ws |
WebSocket transport | Mint.WebSocket (process-less, explicit encode/decode for Noise layer) |
libsignal |
Signal Protocol | Baileys-compatible Elixir boundary; narrow native helpers only where justified |
whatsapp-rust-bridge |
Crypto (HKDF, MD5) | Erlang :crypto (native — no NIF needed) |
| Noise handshake (custom) | Transport encryption | Rustler NIF wrapping snow crate |
protobufjs |
Protobuf serialization | protox (pure Elixir, good codegen) |
pino |
Structured logging | Logger (stdlib) |
async-mutex |
Concurrency | Process-based (BEAM handles this natively) |
@hapi/boom |
Error objects | Tagged tuples / custom exception structs |
lru-cache / node-cache |
Caching | ETS tables (built-in, concurrent, fast) |
These rules apply to all agents (main session, teammates, subagents). They reinforce
the detailed protocols in dev/implementation_plan/CLAUDE.md — that file is canonical.
This section adds behavioural expectations.
- Baileys is the spec. When unsure what to build or how something should behave,
read the Baileys source in
dev/reference/Baileys-master/. Do not ask — look it up. - Match observable behaviour, not JS internals. Preserve wire behaviour, event semantics, helper return shapes, config semantics, and any public compatibility promise we explicitly make. Use the best Elixir/Erlang implementation underneath.
- Plan before building. Enter plan mode for any non-trivial task (3+ steps or architectural decisions). If execution diverges from the plan, stop and re-plan — don't push through a broken approach.
- Native-first. For every non-trivial decision, evaluate the best path available
today — Elixir 1.19+/OTP 28 primitives, stdlib, battle-tested Hex packages — not
last year's patterns. Rust NIFs only when no native equivalent exists. See
dev/implementation_plan/CLAUDE.md§ Native-First Decision Policy. - Elegance within scope. For the requested change, find the cleanest implementation. "Knowing everything I know now, is there a more elegant way?" But do not expand scope — elegance means less code, not more features.
- Protect the main context window. Offload research, exploration, code review, and
parallel analysis to subagents (
Explore,Agent, teammates). One focused task per subagent. - Use worktree isolation for parallel work. When tasks have non-overlapping file
scope and no decision coupling, dispatch
batch-workerteammates in isolated worktrees. Seedev/implementation_plan/CLAUDE.md§ Task Dispatch Modes. - Throw compute at hard problems. For complex debugging, multi-perspective review, or design exploration — use multiple subagents with competing hypotheses rather than serial iteration in a single context.
- Bug reports: just fix them. Diagnose from logs, errors, and failing tests. Resolve without asking for hand-holding. Zero context switching required from the user.
- Failing CI: go fix it. Don't wait to be told how. Read the failure, trace the root cause, implement the fix, verify it passes.
- Root causes only. No temporary fixes, no workarounds, no suppressing symptoms. Senior developer standards — find and fix the actual problem.
- Same input, same output. Every function must produce identical output for identical input. This is non-negotiable — it's what makes code testable, debuggable, and trustworthy.
- Inject all non-determinism. Sources of randomness (
:crypto.strong_rand_bytes, timestamps, UUIDs, PIDs) must be injectable via function options with sensible production defaults. If a function generates a random key internally, it must also accept akey:option so tests can supply a deterministic one. - Pin known-answer test vectors. Tests must assert against pre-computed literal
values, not recomputed expected values. Asserting
file_sha256 == :crypto.hash(:sha256, plaintext)proves only that SHA-256 is consistent with itself — it does not prove your code computed the hash correctly. Instead, pin the hex digest:assert file_sha256 == <<0xAB, 0xCD, ...>>. - Property tests complement, not replace, vectors. StreamData properties prove invariants (encrypt/decrypt roundtrip, idempotency). Pinned vectors prove correctness against an external reference.
- Prove it works before claiming done. Run all 11 delivery gates from
dev/implementation_plan/CLAUDE.md§ Delivery Gates. None are optional. Evidence before assertions. - After any user correction: update auto-memory (
.claude/projects/.../memory/) with the pattern so the same mistake never repeats. Review memory at session start. - Challenge your own work. Before presenting: "Would this survive code review by someone who understands BEAM semantics?" If not, fix it first.