Skip to content

feat: harvest LID mappings from GetUserDevices usync response#1

Open
Prodigy90 wants to merge 43 commits into
mainfrom
fix/getuserdevices-lid-harvest
Open

feat: harvest LID mappings from GetUserDevices usync response#1
Prodigy90 wants to merge 43 commits into
mainfrom
fix/getuserdevices-lid-harvest

Conversation

@Prodigy90
Copy link
Copy Markdown
Owner

What

In the GetUserDevices usync parsing loop (user.go:481-492), extract <lid val="...@lid"> children from each <user> response node and persist via Store.LIDs.PutManyLIDMappings after the loop. Mirrors the existing pattern in GetUserInfo (user.go:232-237).

Why

WA's usync device-query response on the message namespace carries the requester's PN↔LID mapping inline for every user whose identity has migrated to LID. The data flows through this function on every status send (prepareMessageNodeGetUserDevices(participants)) but is currently discarded.

The downstream consequence: when WA delivers a status to one of these recipients' devices and the device acks back in LID form, the application's Store.LIDs.GetPNForLID lookup misses, the receipt is attributed to the LID JID string, and the same human's subsequent PN-form acks get counted as a separate human. This inflates per-message delivered counts by roughly 1.5× on any account with a meaningful LID-migrated audience.

Baileys does the equivalent in src/Socket/messages-send.ts (storeLIDPNMappings over the usync result entries with lid set), confirming the response shape carries this data on the message-namespace usync.

How

  • New lidMappings slice pre-allocated before the loop
  • Inside the loop, after the existing device-list parse, read user.GetChildByTag("lid") and append to the slice when val is non-empty
  • After the loop, single PutManyLIDMappings call. INFO log when the slice is non-empty so operators can verify the harvest works in prod

Verification

After deploy, expect an INFO log like GetUserDevices: harvested N LID mappings from M input JIDs on every status send with non-trivial recipient overlap with the LID-migrated audience. Operators can also SELECT COUNT(*) FROM whatsmeow_lid_map before/after a status send to confirm growth.

Backend pair

Paired with backend PR (Prodigy90/whatsmeow_bot) that bumps this submodule and adds a receipt-handler canonicalization to dedupe LID/PN form-dupes into a single Redis set entry. Together they restore status_send_history.delivered_count accuracy.

Test plan

  • go build ./... clean
  • go test ./... (no regressions; package has no test files for user.go so coverage is mainly indirect)
  • Deploy to canary, verify INFO log appears on next status send
  • Compare whatsmeow_lid_map row count before/after one full status send for a known instrumented user

tulir and others added 30 commits March 25, 2026 14:56
tulir and others added 13 commits May 6, 2026 13:09
Upstream changes pulled in:
- proto: bumps to v1039406452 (also v1037076227, v1037753511, v1038187123,
  v1038709472, v1038839325) — keeps WA web compat current
- client: cstoken support (tulir#1102), full tctoken lifecycle (tulir#1081)
- retry: catch panics in retry receipt goroutine (415df31), optional semaphore
  for retry receipts (a5594bf) — supersedes our local panic-recovery commit
- msgsecret: include params when failing to decrypt, more secret encrypted
  message types, vote encryption sender selection
- pair: macOS pair client constant, QR code format update, PairClientType
  to string, fix platform ID type
- store: new default client payload fields, deleted device protection,
  GetManyIdentities/PutManyIdentities batch ops also present upstream now
- security: constant-time byte comparisons in appstate/handshake/pair,
  noise certificate validity check, ephemeral key length check
- appstate: send sync error event more consistently
- handshake: check noise certificate validity
- send: edit attribute on pin-in-chat messages, on-demand history sync attrs
- download: sticker pack fetching, upload: media delete method
- notification: fix handling own device list changes
- dependencies: update + Go 1.26.2 toolchain

Conflict resolved in receipt.go: kept upstream's `tryHandleRetryReceipt`
helper (panic recovery + optional semaphore — strict upgrade over our
local cdabef9) while preserving our c542190 `handleGroupedReceipt`
changes for WhatsApp's new status broadcast receipt format (message_id
attr fallback + per-user receipt-type parsing).

All other local features verified intact: Participants field for status
recipients, ExcludeDevices + typed 429 fields (RejectedParticipant,
BackoffSeconds, RawErrorResponse), parallel encryption with GOMAXPROCS
semaphore + identity/session caches, HistorySyncNotification blob release.

Build: go build ./... exit 0. Vet: clean.
Two findings from the post-sync audit, applying the same lens tulir's
tryHandleRetryReceipt (415df31 + a5594bf) raised over our cdabef9:

P1 (user.go) — OnNoDeviceContacts callback ran in an unwrapped goroutine.
A panic in application code would crash the entire worker pod. Same
failure mode tulir's wrapper just protected against for retry receipts.
Fix:
- Extract to tryOnNoDeviceContacts named helper
- defer recover() with debug.Stack() so deep callback panics are
  diagnosable from a single log line
- Recover handler references only the panic value and stack — no data
  that could itself be the panic cause (avoids second-fault hazard)
- Add opt-in cli.noDeviceContactsSema (*semaphore.Weighted) + setter
  SetMaxParallelNoDeviceContactsCallbacks mirroring the existing
  SetMaxParallelRetryReceiptHandling pattern. Defaults nil (unlimited)
  for backward compat.

P2 (send.go) — parallel encryption recover logged only the panic value
(%v), discarding the stack. When libsignal panics deep (e.g. nil
SenderKeyDistributionMessage.Serialize, issue tulir#927), the recovered
error was "panic encrypting for X: runtime error: invalid memory
address" with no call chain. Now appends debug.Stack() so we can
localize the actual offending line.

Build clean, vet clean. Both changes are surface-only safety — no
behavior change on the success path.
When the usync device-query response carries `<lid val="...@lid">` children
on `<user>` nodes, extract them and persist via Store.LIDs.PutManyLIDMappings.
Mirrors what GetUserInfo already does at user.go:232-237.

Closes the gap where send-path GetUserDevices calls would route status broadcasts
to a recipient's LID-AD device but never persist the implied PN<->LID partnership.
With the mapping uncaptured, downstream LID-form receipts couldn't be canonicalized
to PN, inflating per-message delivered_count by ~1.5x (same human counted twice,
once via LID-server ack, once via PN-server ack).

Baileys does the equivalent in src/Socket/messages-send.ts (storeLIDPNMappings on
the usync result filter for entries with `lid` set), confirming the response shape
carries this data on the message-namespace usync.

Info log when mappings>0 for in-prod verification — expect one log per non-trivial
status send showing the harvest count.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants