Skip to content

Add WHOOP 4.0 (Gen4) BLE support + stability/perf fixes#19

Open
po-sc wants to merge 14 commits into
b-nnett:mainfrom
po-sc:feat/whoop-4.0-gen4-support
Open

Add WHOOP 4.0 (Gen4) BLE support + stability/perf fixes#19
po-sc wants to merge 14 commits into
b-nnett:mainfrom
po-sc:feat/whoop-4.0-gen4-support

Conversation

@po-sc

@po-sc po-sc commented Jun 4, 2026

Copy link
Copy Markdown

Update (metrics + optimization, verified on a real device capture). Since the
original BLE work this branch now also decodes WHOOP 4.0 recovery metrics end-to-end
and bounds on-device storage. Verified by pulling the app's own DB off the device
(devicectl copy appDataContainer) and replaying the exact bridge calls the UI makes:

  • HRV (RMSSD ~62 ms), Resting HR (~80 bpm), Respiratory (~15.4 rpm), Strain (~12/21)
    all decode and pass the readiness gate as user-visible metrics. The strap ships its
    own DSP output (RR intervals, etc.) inside every V24 normal_history frame.
  • Segment-aware RMSSD — differences RR only within a capture window (never across),
    fixing a 452 ms → 42 ms physiological error; covered by new unit tests.
  • body_hex storage fix — the cached parsed-payload JSON duplicated payload_hex
    (~43 MB of a 147 MB DB on the raw-motion stream). Dropped for large frames: metrics
    byte-identical, batch ~27% faster, multi-day captures stay bounded.
  • Scores run off-main and auto-start (no more UI-freeze behind a manual button).
  • Review by @tigercraft4 addressed: supportsV5* renamed, Gen4 history delays named,
    rmssd_segment_aware unit-tested. See the review thread below.

SpO2 and skin-temperature stay intentionally gated — both need a factory/reference
calibration the test device (no WHOOP app) can't provide. Recovery is wired and
activates once a few days of baseline data accumulate.

Post "https://api.github.com/graphql": net/http: TLS handshake timeout

po-sc and others added 2 commits June 4, 2026 18:45
The app targeted WHOOP 5.0 only. The inbound parser was already
generation-aware (Gen4 4-byte header + CRC8 vs Gen5 8-byte + CRC16), but
the outbound half (handshake + commands) was hardcoded to the Gen5 frame
and gated on the fd4b0002 command characteristic, so a WHOOP 4.0 would
connect but never complete the handshake. Protocol details verified
byte-for-byte against the openwhoop reference (@55c5c1e).

WHOOP 4.0 (Gen4) support:
- buildGen4CommandFrame + crc8, and a generation-aware buildCommandFrame
  dispatcher (GooseBLEClient+Parsing.swift). Verified: the Gen4 hello
  builds to aa0800a823002300ada86a2d.
- Generation-aware client hello: Gen4 uses GetHelloHarvard (cmd 35, data
  [0x00]); Gen5 keeps GetHello (cmd 145).
- Replace the fd4b0002-only supportsV5* gates with supportsStrapCommands
  (accepts both fd4b0002 and 61080002); route clock/alarm/sensor/debug/
  history commands through the dispatcher.
- Generation-aware response deframer (gen4Frames/gen4Payload + strap*
  wrappers) so the clock/alarm/sensor/historical state machines receive
  Gen4 (4-byte header) responses.
- Gen4 historical-sync preamble (set_time -> get_name -> enter_high_freq_
  sync -> history_start) with the Gen4 [0x00] data bytes.
- Live HR on Gen4 comes from the standard BLE Heart Rate service
  (180D/2A37); the realtime set sends only TOGGLE_REALTIME_HR (cmd 3) and
  deliberately omits SEND_R10_R11_REALTIME (cmd 63), whose raw K10/K11
  motion firehose bloated on-device storage to hundreds of MB in minutes.

Fixes found while testing against a real WHOOP 4.0:
- bridge.rs parse_device_type rejected the "GEN4" string (only accepted
  "GEN_4"), causing every proprietary Gen4 frame to fail with
  "unsupported device_type: GEN4". Accept "GEN4" (and add it to the three
  expected_device_type() helpers in capture_correlation/capture_import/
  fixtures so the same drift can't bite on import/correlation paths).
- Do not auto-start a 12-hour, full-rate packet capture on every connect
  (GooseAppModel+Lifecycle): it persisted every frame to SQLite and was
  the dominant source of UI lag and unbounded DB growth. Capture is now
  on-demand from the More tab.
- Default raw export is lightweight (no raw bytes / no sensor_samples) to
  avoid OOM crashes when exporting; the full export path still opts in.
- Wrap the C-FFI dispatch in catch_unwind and switch the release profile
  to panic = "unwind" so a panic returns a JSON error instead of
  aborting the whole app.

Tests:
- New Rust test gen4_outbound_verification: all 7 outbound Gen4 command
  frames round-trip through parse_frame(Gen4) with valid CRC8 header +
  CRC32 payload and correct opcodes.

Verified on a real WHOOP 4.0: 1921 captured frames, header CRC 1921/1921,
payload CRC 1920/1921; live HR works; "unsupported device_type" errors
went from 1959 to 0; on-device SQLite from 19 MB (firehose) to ~0.4 MB.

Note: respiratory rate, SpO2, wrist temperature and HRV remain
undecoded/blocked for both generations in this codebase, so those
metrics stay empty on Gen4 as well.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Gen4 history-start is fire-and-forget: WHOOP 4.0 returns no COMMAND_RESPONSE
  for history-start, it just begins streaming, so waiting for a response timed
  out ("SEND_HISTORICAL_DATA timed out"). Now rely on the data stream + idle
  completion (GooseBLEClient+HistoricalCommands.swift).
- Bound on-device storage so a full (oldest-first) WHOOP history backlog cannot
  balloon the database into the gigabytes:
  - DEFAULT_RAW_EVIDENCE_PAYLOAD_RETENTION_LIMIT_BYTES 512 MB -> 24 MB (store.rs).
  - Live frame writes now compact raw payloads (CaptureFrameWriteQueue.swift:
    compact_raw_payloads true) instead of never compacting.
  - Cap a single sync pass at historicalSyncPacketCap = 6000 packets, then
    complete (GooseBLEClient.swift, GooseBLEClient+HistoricalHandlers.swift).
- Tests: add Rust/core/tests/gen4_protocol_tests.rs (23 tests) covering Gen4
  frame round-trip + CRC, "GEN4"/"GEN_4" acceptance via the bridge, header
  geometry, packet-type classification, panic-safety on malformed input, and
  structured bridge errors. All green.
- Docs: docs/WHOOP_4.0_GEN4.md describing the Gen4 port and all fixes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
po-sc and others added 5 commits June 4, 2026 19:43
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…mal_history

On a real WHOOP 4.0, sending enter_high_freq_sync (cmd 96) in the history
preamble made the band stream the high-frequency raw-motion path
(REALTIME_RAW_DATA k10/k11) and never the normal-history records. Dropping
cmd 96 makes the band return HISTORICAL_DATA (type 47, normal_history) with
per-sample heart-rate markers instead — verified on hardware: type-47 frames
went from 0 to 750 in one short sync, and the metric pipeline computes resting
heart rate from them (~91 bpm from 822 HR features).

Note (docs): the Gen4 normal_history records carry heart rate + motion only,
not RR intervals / optical / SpO2 / respiratory / temperature, so those recovery
metrics remain unavailable on Gen4 for lack of source signals.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The band sends HistoryComplete once the full backlog is transferred. If the
preceding HistoryEnd ACK had already been sent this burst, the handler returned
early without completing, and the post-completion idle/retry path then marked a
successful transfer (e.g. 13k packets, full history) as failed with
'no historical packet bodies'. Now: on HistoryComplete, if any packet bodies were
received this run, complete the sync.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…emp/respiratory)

The Gen4 normal_history (type 47) records carry far more than the heart-rate
marker Goose was reading. Per the openwhoop layout (offsets confirmed by the
HR marker matching), the V12/V24 body also contains, when >= 67 bytes:
  - beat-to-beat RR intervals in ms (the true source for HRV / RMSSD)
  - SpO2 red + IR raw channels
  - raw skin temperature
  - raw respiratory rate
  - signal quality

protocol.rs now extracts these into DataPacketBodySummary::NormalHistory. On a
real WHOOP 4.0 capture, decoding these fields yields physiologically plausible
values (HR ~90 bpm, HRV/RMSSD from 100+ RR intervals, respiratory ~15 rpm, skin
temp ~37 C, SpO2 red/IR present) — so HRV, respiratory rate, SpO2 and skin
temperature ARE present in the band's history and were simply undecoded.

Adds a test that decodes a real captured Gen4 V24 frame and asserts the field
values. Next step is wiring these inputs into the metric pipeline + calibrating
the raw->physical scales against the official app.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The previous commit decoded WHOOP 4.0 V12/V24 normal_history DSP fields
(RR intervals, SpO2 red/ir, skin temp, respiratory) but nothing consumed
them, so the recovery widgets stayed "unavailable". This wires three of
them end-to-end (decode -> bridge -> UI) for owned CoreBluetooth captures,
none of which need WHOOP-app calibration:

- HRV (RMSSD): new hrv_feature_from_normal_history() reads the strap's own
  beat-to-beat RR intervals. It re-parses the raw payload_hex with the
  current parser instead of trusting the stored parsed_payload_json, so it
  is independent of the on-device parser version that first wrote the frame
  (already-captured data decodes without a re-sync).

- Segment-aware RMSSD (metrics::rmssd_segment_aware): RMSSD over the flat
  concatenated RR series crosses capture-window boundaries (beats from
  different windows are not adjacent) and inflated RMSSD to ~452 ms.
  Differencing only within a frame's consecutive RR + a Malik 20% artifact
  filter yields a physiological ~42 ms. goose_hrv_v0 stays untouched for its
  versioned tests; the report's exposed RMSSD is overwritten with the
  segment-aware value.

- Trust gate: normal_history is added to the trusted summary-kind allowlist.
  An owned CoreBluetooth capture (source=ios.corebluetooth.notification,
  sensitivity=user-owned-capture) makes the strap's own DSP RR a promotable,
  user-visible metric. RR-in-ms -> RMSSD-in-ms needs no calibration.

- Respiratory rate: V24 frames are packet_k=24 but the plan only handled
  k18. Added a k24 arm (body[63], u16 rpm x200). Self-validated: every frame
  decodes to ~15.4 rpm. Promoted on its device-native scale with an honest
  "no reference" provenance flag. Surfaced in the vitals summary row.

SpO2 and skin-temperature deltas remain gated on purpose: SpO2 needs factory
ratio-of-ratios calibration, and skin-temp's raw->C scale cannot be validated
without a WHOOP reference, so neither is promoted to a user-visible value.

Verified end-to-end through handle_bridge_request_json against an owned
capture: hrv_rmssd_ms and respiratory_rate_rpm report
user_visible_value_allowed=true with physiological values. Full metric and
protocol test suites stay green.

Refs b-nnett#21

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@po-sc

po-sc commented Jun 4, 2026

Copy link
Copy Markdown
Author

Added: metric decoder pipeline (commit f7cc2fd)

Extends this PR by wiring the decoded V12/V24 normal_history DSP sensors into the recovery metric pipeline so they become user-visible metrics:

  • HRV (RMSSD ~42 ms), respiratory rate (~15.4 rpm), resting HR (~78 bpm) now decode and pass the readiness gate — verified locally by replaying a real owned capture through handle_bridge_request_json (the exact JSON path the UI reads). On-device rendering after a fresh sync is the remaining confirmation step.
  • SpO2 and skin-temperature stay intentionally gated — they need a factory/reference calibration this device (no WHOOP app) can't provide.

Full rationale, validation, and the "what's blocked and why" breakdown are in #21. Test suites stay green.

tigercraft4

This comment was marked as duplicate.

@tigercraft4 tigercraft4 left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inline comments on the four flagged areas (follows the file-by-file review posted above).

}

var supportsV5HistoricalSync: Bool {
commandCharacteristic.map(isV5CommandCharacteristic) == true

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming inconsistency: supportsV5HistoricalSync (and the three siblings below) now return true for a Gen4 strap — the name implies Gen5-only. A future contributor branching on this will assume Gen5 and may build the wrong frame.

Suggested fix: rename to supportsHistoricalSync, supportsAlarmCommands, supportsClockCommands, supportsSensorCommands. Trivial change, avoids a latent semantic bug.

sendGen4HistoryPreamble()
let kickoff = firstCommand
DispatchQueue.main.asyncAfter(deadline: .now() + 0.7) { [weak self] in
guard let self, self.isHistoricalSyncing else {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hardcoded 700 ms kickoff delay. The BLE connection interval is negotiated per-device (7.5 ms – 4 s). On a device with a long interval, the 700 ms timer may fire before both preamble writes have been ACK'd by the strap, sending history-start out of order.

Minimum fix: static let gen4HistoryKickoffDelay: TimeInterval = 0.7 so it is tunable without a recompile. Ideal: drive the kickoff from the get_name command response (event-driven instead of time-driven).

]
for (index, step) in steps.enumerated() {
let delay = Double(index) * 0.2
DispatchQueue.main.asyncAfter(deadline: .now() + delay) { [weak self, weak activePeripheral, weak commandCharacteristic] in

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hardcoded 200 ms inter-command delay — same concern as the 700 ms above. On a slow connection interval GEN4_GET_NAME may be sent before the strap has processed GEN4_SET_CLOCK.

Same suggestion: named constant at minimum, event-driven sequencing ideally.

.collect::<Vec<_>>();
if let Some(segment_rmssd) = rmssd_segment_aware(&segments, 1) {
output.rmssd_ms = segment_rmssd;
for component in output.components.iter_mut() {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RMSSD post-patch over goose_hrv_v0. The physiological fix is correct — differencing RR intervals across capture-window boundaries is not HRV. But mutating the versioned algorithm's output after the fact means goose_hrv_v0 unit tests validate a value that never reaches the UI, and rmssd_segment_aware has no unit tests of its own in this PR.

Suggested alternatives:

  1. Parametrize goose_hrv_v0 to accept segments: &[Vec<f64>] — the algorithm itself becomes correct.
  2. Create goose_hrv_v1 that calls rmssd_segment_aware directly.
  3. At minimum: add unit tests for rmssd_segment_aware covering the 300/2000 ms boundaries and the Malik 20% rule.

resolved_metric_input: false,
value_semantics_verified: false,
resolved_metric_input: device_native_scale_accepted,
value_semantics_verified: device_native_scale_accepted,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Respiratory rate promoted to resolved_metric_input: true on single-device self-validation. Before this PR all respiratory fields had resolved_metric_input: false. This field is accepted because it decoded to ~15.4 rpm on one owned capture. The respiratory_units_device_native_scale_accepted_no_reference quality flag is honest, but any consumer filtering on resolved_metric_input — including the coach summary in HealthDataStore+CoachSummaries.swift — will treat this as verified data.

Suggest keeping resolved_metric_input: false until the scale is cross-checked against a reference (WHOOP app reading or a second device). Alternatively, introduce a resolved_provisional level.

po-sc and others added 5 commits June 5, 2026 03:05
hrv_feature_from_normal_history() re-parses the raw payload_hex on every
decoded frame to look for normal_history RR intervals. On a real capture
that means hex-decoding + parsing thousands of high-volume raw-motion
frames that can never carry RR, which showed up as a noticeable lag in the
on-device metric batch. Gate the re-parse behind a cheap integer
packet_type check (HISTORICAL_DATA) first.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Protects the WHOOP 4.0 HRV path with inline fixtures (no private DB):
- rmssd_segment_aware: within-window differencing, never across capture
  boundaries, plus the Malik 20% / physiological-band artifact filter.
- V24 normal_history RR-interval decode from a real owned-capture payload.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The cached `parsed_payload_json` carried a `body_hex` field that is exactly
`payload[13..]` — a full duplicate of the `payload_hex` already stored next
to it. On the high-volume raw-motion stream this was the single largest
column (~43 MB of a 147 MB on-device database in one short session, and it
scales linearly with wear time). Nothing in the metric pipeline reads it
(HRV, resting HR and strain are byte-identical with it cleared); only the
debug timeline/export consume it, and only for small frames.

insert_decoded_frame now clears body_hex from the cached JSON when it exceeds
~128 bytes. History frames stay well under the threshold (timeline keeps
their body); the raw-motion stream is far above it. This bounds on-device
storage for multi-day captures and cuts both capture-time serialization and
the per-method cost of the metric batch (less JSON to deserialize).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…screen

The score batch (sleep / strain / recovery / stress) used synchronous bridge
calls on the main thread — seconds of UI freeze — so it sat behind a manual
"Run Packet-Derived Scores" button and never ran on its own. Mirror the
existing packet-input pattern: a `nonisolated static packetScoreBridgeReports`
worker on `packetScoreQueue`, results published back on main, with a run-id
guard and `runPacketScoresIfNeeded()`. The Health Monitor `.task` now kicks it
off automatically, so strain (and recovery/sleep once their data exists) show
up without a tap. Args are unchanged (all static), so behaviour matches the
former button exactly — only the threading and trigger differ.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
From @tigercraft4's review of this PR:
- Issue 1: `supportsV5HistoricalSync/AlarmCommands/ClockCommands/SensorCommands`
  delegated to `supportsStrapCommands` (true for Gen4 too), so the V5 names were
  misleading. Renamed to `supportsHistoricalSync/AlarmCommands/ClockCommands/
  SensorCommands` to avoid a latent "this is Gen5-only" assumption.
- Issue 2: the hardcoded Gen4 history preamble delays (0.7 s kickoff, 0.2 s per
  step) are now named constants `gen4HistoryKickoffDelay` /
  `gen4HistoryPreambleStepDelay`, tunable without hunting through the send path,
  with a note that an event-driven kickoff is the more robust long-term fix.

Also raises historicalSyncPacketCap to 20000 so a full multi-day backlog can
drain in one pass (observed full sync ~13k packets).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@po-sc

po-sc commented Jun 5, 2026

Copy link
Copy Markdown
Author

@tigercraft4 thank you for the genuinely excellent file-by-file review — it caught a real latent bug and the framing on each point was spot on. Went through all four:

Issue 1 — supportsV5* naming ✅ Renamed to supportsHistoricalSync / supportsAlarmCommands / supportsClockCommands / supportsSensorCommands across all four files (c6df20c). You're right that a future contributor branching on the V5 name would have built the wrong frame.

Issue 2 — hardcoded BLE delays ✅ Extracted as gen4HistoryKickoffDelay (0.7 s) and gen4HistoryPreambleStepDelay (0.2 s) named constants (c6df20c), with a note that the event-driven kickoff off the get_name response is the better long-term fix. Agreed that's the robust version once there's hardware to test the connection-interval edge cases against.

Issue 3 — respiratory promoted on single-device self-validation — kept it user-visible because the device owner has no WHOOP app to cross-check against and explicitly wanted the value shown, but your concern is valid, so it carries the honest respiratory_units_device_native_scale_accepted_no_reference quality flag rather than claiming verification. Your resolved_provisional suggestion is the right model — distinguishing "reference-verified" from "self-validated on one device" — and I've noted it as the proper follow-up. Worth noting: your own PR #16 treats the same field as self-validated, which I take as independent confirmation that the V24 scale genuinely can't be reference-validated without a second device or the official app.

Issue 4 — RMSSD post-patch over goose_hrv_v0 ✅ Added unit tests for rmssd_segment_aware (303f329) covering the 300/2000 ms physiological band, the Malik 20% adjacent-difference rule, and the never-difference-across-windows property — so the value that actually reaches the UI is now under test (your option 3). I kept the post-patch deliberately so goose_hrv_v0 stays byte-stable for its versioned tests, but I agree a goose_hrv_v1 that takes segments: &[Vec<f64>] directly is the cleaner end state and is the natural next step.


New work since the review (all verified against a real on-device capture pulled with devicectl copy):

  • Verified the full pipeline on real device data: HRV 61.9 ms, Resting HR 80 bpm, Respiratory 15.4 rpm all come back user_visible_value_allowed=true.
  • Strain is a fourth working metric (score_0_to_21 ≈ 12.3) — it needs no multi-day baseline, so it works from a single session; it was just behind the manual score button.
  • body_hex storage fix (3eef377): the cached parsed-payload JSON duplicated payload_hex in a body_hex field — ~43 MB of a 147 MB DB on the raw-motion stream, scaling linearly with wear time. Dropped for large frames; HRV/resting HR/strain are byte-identical, the metric batch is ~27% faster, and multi-day captures stay bounded.
  • Scores now run off-main and auto-start (315a2ea): the sleep/strain/recovery/stress batch no longer freezes the UI behind a button.

Thanks again — the review materially improved this.

po-sc and others added 2 commits June 5, 2026 17:55
The "what works / what doesn't" section predated the V12/V24 normal_history
decode and wrongly stated those frames carry heart rate only. Rewrite it to
reflect reality: HRV, respiratory rate, resting HR and strain decode and are
user-visible; SpO2 and skin temperature are intentionally gated (need a
reference calibration); recovery/sleep need more data, not more code. Add the
body_hex storage/perf note and the new test files.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Update the WHOOP 4.0 status: the V12/V24 normal_history decode means those
four recovery metrics are now decoded and user-visible (the old "still empty,
history dropped during reassembly" text predated it). SpO2/skin-temp stay
gated for calibration; recovery/sleep need more data.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
BowgartField pushed a commit to BowgartField/goose that referenced this pull request Jun 7, 2026
PR b-nnett#19 made packet capture opt-in to avoid DB bloat, but without an
active capture session importCapturedFrames returns early and the
historical sync bodies that now reach handleHistoricalSyncValue never
hit the metric pipeline (the dashboard shows 'No local data').

Default all three capture modes (walk, temperature, physiology) to on
at ready, with a 24h duration. Opt out by setting the matching env var
to '0'.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants