ESP32 Border Router crashes during commission with Aqara and IKEA devices #1772

@xbln

Summary

Building a Matter commissioner on ESP-Thread-Border-Router-Board v1.x
(ESP32-S3 + ESP32-H2 RCP) using esp-matter examples/controller as a base.

Commissioning works fine with my own floor3-DK Thermostat (nRF54L15
running connectedhomeip 1.5 / OpenThread, ICD/Thread). It fails
deterministically with two third-party Matter-over-Thread devices:

  1. Aqara Climate Sensor W100 (VID 0x115F, PID 0x2004) —
    the same family as in #1532.
  2. IKEA Timmerflotte Thermometer (same family as reported by
    xbln in #1532, 2026-05-10).

The exact same pairing codes commission successfully on a Linux BlueZ host
running chip-tool pairing code-thread --bypass-attestation-verifier 1.

After the failed MTU exchange the ESP32-S3 crashes with Guru Meditation Error: Core 0 panic'ed (LoadProhibited) and reboots.


Toolchain / Setup

Component        Version
Hardware         ESP-Thread-Border-Router-Board (ESP32-S3 + ESP32-H2 RCP)
ESP-IDF          v5.4.4
esp-matter       release/v1.5 @ 92a7abe5 (2026-05-22)
connectedhomeip  cf84d0360c (2025-12-09), includes PR #40733 (multi-UUID16 scanner fix)
Example base     examples/controller
Floor4 patches   ChipDeviceScanner.cpp: passive=0 (active scan), CHIP_SHELL_MAX_LINE_SIZE 512, CONFIG_BT_NIMBLE_ENABLE_CONN_REATTEMPT=n

Relevant sdkconfig overrides:

# OTBR board: 8 MB flash
CONFIG_ESPTOOLPY_FLASHSIZE_4MB=n
CONFIG_ESPTOOLPY_FLASHSIZE_8MB=y

# CSL transmitter for ICD child support (Stage 2 / Stufe 2 in our project)
CONFIG_OPENTHREAD_CSL_ENABLE=y
CONFIG_OPENTHREAD_MLE_MAX_CHILDREN=16

# Stage 3a: try to avoid reconnect race against chip-stack
CONFIG_BT_NIMBLE_ENABLE_CONN_REATTEMPT=n

Plus the sdkconfig.defaults.otbr from examples/controller (BLE + WiFi +
OpenThread BR enabled, partitions_br.csv, LWIP_IPV6_NUM_ADDRESSES=12).

Floor4 application code calls pairing_command::pairing_code_thread()
via a small HTTP API. We add a DeviceAttestationDelegate to bypass
attestation verification for Aqara (whose PAA is not in the default
trust store):

// Treats every attestation result as success so that devices whose PAA is
// not in the default trust store can continue commissioning.
class TrustingAttestationDelegate : public chip::Credentials::DeviceAttestationDelegate {
public:
    // No custom fail-safe expiry timeout.
    chip::Optional<uint16_t> FailSafeExpiryTimeoutSecs() const override { return chip::NullOptional; }
    void OnDeviceAttestationCompleted(
        chip::Controller::DeviceCommissioner *cmsnr, chip::DeviceProxy *device,
        const chip::Credentials::DeviceAttestationVerifier::AttestationDeviceInfo &info,
        chip::Credentials::AttestationVerificationResult result) override {
        // Ignore `info` and `result`; always continue as if attestation succeeded.
        cmsnr->ContinueCommissioningAfterDeviceAttestation(
            device, chip::Credentials::AttestationVerificationResult::kSuccess);
    }
    bool ShouldWaitAfterDeviceAttestation() override { return false; }
};

Pairing is invoked via:

CommissioningParameters params =
    CommissioningParameters().SetThreadOperationalDataset(dataset_span);
params.SetDeviceAttestationDelegate(&s_attestation_delegate);
commissioner->RegisterPairingDelegate(&pairing_command::get_instance());
commissioner->PairDevice(node_id, payload, params, DiscoveryType::kAll);

Reproduction A — Aqara W100 (esp-matter v1.5, both PR #40733 and CONN_REATTEMPT=n active)

The user puts the Aqara into commissioning advertisement mode, then:

I [floor4] commission node=3679816925 code=03679816925 (BLE-Thread, dataset=111 B)
I chip[CTL]: Starting commissionable node discovery over BLE
I NimBLE: GAP procedure initiated: discovery;
I NimBLE: own_addr_type=1 filter_policy=0 passive=0 limited=0 filter_duplicates=1
I NimBLE: duration=60000ms
I chip[BLE]: Device Discriminator match. Attempting to connect
I NimBLE: GAP procedure initiated: connect;
I NimBLE: peer_addr=d4:61:fc:03:c2:86
I NimBLE: scan_itvl=16 scan_window=16 itvl_min=24 itvl_max=40 latency=0 supervision_timeout=256
I NimBLE: GATT procedure initiated: exchange mtu
E chip[DL]: Disabling CHIPoBLE service due to error: ac
I NimBLE: GAP procedure initiated: stop advertising.
I NimBLE: GAP procedure initiated: terminate connection; conn_handle=1 hci_reason=19
Guru Meditation Error: Core 0 panic'ed (LoadProhibited). Exception was unhandled.

CHIP error 0xac = CHIP_ERROR_INTERNAL. The connection is then
terminated by the central (us, hci_reason=19 = "remote user terminated
connection"), and a few hundred ms later the firmware traps in
LoadProhibited and reboots.
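
For anyone cross-checking the numbers in the log above, here is a minimal
sketch that decodes them. The helper names are ours (hypothetical, not part
of NimBLE or connectedhomeip), the lookup tables are hand-written and
deliberately partial, and the unit factors (0.625 ms for scan timing,
1.25 ms for the connection interval, 10 ms for the supervision timeout) are
the standard BLE units as we understand them.

```cpp
#include <cstdint>
#include <string>

// CHIP core errors are logged as bare hex. Only the value seen in this
// report is mapped; everything else is reported as "unmapped".
inline std::string chip_core_error_name(std::uint32_t code) {
    return code == 0xAC ? "CHIP_ERROR_INTERNAL" : "unmapped";
}

// NimBLE prints hci_reason in decimal, while the Bluetooth specification
// lists the codes in hex (19 decimal == 0x13). Partial table.
inline std::string hci_reason_name(unsigned reason) {
    switch (reason) {
        case 0x08: return "Connection Timeout";
        case 0x13: return "Remote User Terminated Connection";
        case 0x16: return "Connection Terminated By Local Host";
        case 0x3E: return "Connection Failed To Be Established";
        default:   return "unmapped";
    }
}

// Unit conversions for the "GAP procedure initiated: connect" parameters.
constexpr double   scan_units_to_ms(unsigned v)          { return v * 0.625; } // scan_itvl, scan_window
constexpr double   conn_interval_to_ms(unsigned v)       { return v * 1.25; }  // itvl_min, itvl_max
constexpr unsigned supervision_timeout_to_ms(unsigned v) { return v * 10; }    // supervision_timeout
```

With the values from the log, hci_reason_name(19) yields "Remote User
Terminated Connection", itvl_min=24 / itvl_max=40 correspond to 30 ms / 50 ms,
scan_itvl=16 to 10 ms, and supervision_timeout=256 to 2560 ms.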

Compared to esp-matter v1.4.2 (which does not include PR #40733), the
Aqara is now discovered correctly thanks to the multi-UUID16 scanner
fix: the discriminator match is the step that newly succeeds. The MTU
phase still fails the same way.
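
To make the discovery step concrete, here is a rough sketch of the kind of
check we have in mind. It is not the code from PR #40733. It assumes that
"multi-UUID16" refers to advertisements carrying more than one 16-bit-UUID
service-data record, so the scanner has to walk every record instead of
looking only at the first. The function name is hypothetical; the Matter
service UUID 0xFFF6, AD type 0x16, and the 12-bit discriminator in the two
bytes after the opcode follow the Matter BLE advertisement layout as we
understand it.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

constexpr std::uint8_t  kAdTypeServiceData16 = 0x16;   // Service Data - 16-bit UUID
constexpr std::uint16_t kMatterServiceUuid   = 0xFFF6; // Matter BLE service UUID

// Walks raw advertising data, a sequence of [len][type][payload...] records,
// and returns the 12-bit discriminator from the first Matter service-data
// record. Records with other UUIDs are skipped rather than ending the search.
inline std::optional<std::uint16_t>
find_matter_discriminator(const std::vector<std::uint8_t>& adv) {
    std::size_t i = 0;
    while (i < adv.size()) {
        const std::size_t len = adv[i];                    // counts type + payload
        if (len == 0 || i + 1 + len > adv.size()) break;   // padding or truncated record
        const std::uint8_t  type = adv[i + 1];
        const std::uint8_t* p    = adv.data() + i + 2;     // payload start
        const std::size_t   n    = len - 1;                // payload length
        // Need UUID (2) + opcode (1) + discriminator/version (2).
        if (type == kAdTypeServiceData16 && n >= 5) {
            const auto uuid = static_cast<std::uint16_t>(p[0] | (p[1] << 8));
            if (uuid == kMatterServiceUuid) {
                // Little-endian field: low 12 bits discriminator, high 4 bits version.
                return static_cast<std::uint16_t>((p[3] | (p[4] << 8)) & 0x0FFF);
            }
        }
        i += 1 + len;                                      // advance to next record
    }
    return std::nullopt;
}
```

A scanner that inspects only one service-data record per advertisement would
miss a Matter record that comes after an unrelated one; walking all records
avoids that.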

Reproduction B — IKEA Timmerflotte (esp-matter v1.4.2, same failure as v1.5)

Same flow, different device MAC:

I [floor4] commission node=26320938479 code=26320938479 (BLE-Thread, dataset=111 B)
I chip[CTL]: Starting commissionable node discovery over BLE
I chip[BLE]: Device Discriminator match. Attempting to connect
I NimBLE: GAP procedure initiated: connect;
I NimBLE: peer_addr=dc:d9:ce:c5:a8:7f
I NimBLE: GATT procedure initiated: exchange mtu
E chip[DL]: Disabling CHIPoBLE service due to error: ac
I NimBLE: GAP procedure initiated: terminate connection; conn_handle=1 hci_reason=19
Guru Meditation Error: Core 0 panic'ed (LoadProhibited). Exception was unhandled.

The failure signature is identical to the Aqara's apart from the peer
address. This matches xbln's May 2026 comment in #1532.

Reproduction C — control: works with chip-tool on Linux BlueZ

Same Aqara, same pairing code, paired in seconds with:

chip-tool pairing code-thread 5001 hex:<dataset> 03679816925 \
    --bypass-attestation-verifier 1

Log excerpt:

[CTL] Starting commissionable node discovery over BLE
[BLE] Device discriminator match. Attempting to connect.
[BLE] New device connected: D0:03:73:6A:93:23
[BLE] subscribe complete, ep = ...
[BLE] peripheral chose BTP version 4; central expected between 4 and 4
[BLE] using BTP fragment sizes rx 244 / tx 244.
[CTL] PASE session established with commissionee.
...
[CTL] Commissioning complete for node ID 0x0000000000001389: success

So the Aqara and IKEA devices themselves behave correctly; the bug
appears to be in the ESP32-S3 NimBLE-based commissioner path,
specifically around the GATT MTU exchange handling, plus an additional
unhandled-exception path that turns a recoverable BLE error into a
board reboot.
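
As background on why the MTU step matters for what follows, here is a small
sketch of how we understand the BTP fragment size to relate to the ATT MTU.
It assumes a 3-byte ATT header, the BLE default ATT MTU of 23 when no
exchange completes, and a 244-byte cap inferred from the "rx 244 / tx 244"
line in the chip-tool log. The function and constant names are ours, not
connectedhomeip identifiers.

```cpp
#include <cstdint>

constexpr std::uint16_t kAttHeaderBytes = 3;   // ATT opcode (1) + attribute handle (2)
constexpr std::uint16_t kDefaultAttMtu  = 23;  // BLE default if no MTU exchange completes
constexpr std::uint16_t kMaxBtpFragment = 244; // assumed cap, taken from the chip-tool log

// Returns the usable BTP fragment size for a given ATT MTU. An MTU of 0
// (unknown) or anything below the BLE default falls back to the default.
constexpr std::uint16_t btp_fragment_size(std::uint16_t att_mtu) {
    const std::uint16_t mtu = att_mtu < kDefaultAttMtu ? kDefaultAttMtu : att_mtu;
    const auto payload = static_cast<std::uint16_t>(mtu - kAttHeaderBytes);
    return payload < kMaxBtpFragment ? payload : kMaxBtpFragment;
}

static_assert(btp_fragment_size(247) == 244, "ATT MTU 247 gives the 244-byte fragments seen on Linux");
static_assert(btp_fragment_size(23) == 20, "default ATT MTU leaves 20 bytes per fragment");
```

Under these assumptions an ATT MTU of 247 reproduces the 244-byte fragments
from the Linux run, and a session without a completed MTU exchange would
still have 20-byte fragments available, which is why we consider the MTU
error recoverable in principle.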

Reproduction D — control: same bug in stock examples/controller

We re-built the stock esp-matter/examples/controller example
(no floor4 modifications, only the default sdkconfig.defaults,
sdkconfig.defaults.esp32s3 and sdkconfig.defaults.otbr), erased
flash, booted with no WiFi credentials so Wi-Fi STA never associates,
and triggered the pair via the built-in shell:

matter esp controller pairing code-thread 5001 <hex-thread-dataset> 03679816925

(after locally bumping CHIP_SHELL_MAX_LINE_SIZE from 256 to 512 so
the long argument line is not truncated). Same NimBLE startup, same
discriminator match (with passive=0), same error: ac, same crash.

This rules out Wi-Fi/BLE coexistence interference and an application
bug: the failure is reproducible on a stock Espressif Matter
commissioner example.

What we'd like to know

  1. Has anyone got a working commissioner setup on ESP-Thread-Border-Router-Board
    that pairs a Matter-over-Thread device whose PAA is not in the
    default trust store (Aqara 0x115F, IKEA 0x117C, Nanoleaf, …)?

  2. The error 0xac (CHIP_ERROR_INTERNAL) is raised from
    BLEManagerImpl::HandlePlatformSpecificBLEEvent after our central
    initiated the GATT MTU exchange. What is the expected handler path
    on the central side for BLE_GAP_EVENT_MTU? In our build it falls
    through default: in OnGapEvent() without setting err, yet
    Disabling CHIPoBLE service due to error: ac is logged — implying
    the error originates from a previously-posted event handler that we
    have not been able to identify from the source.

  3. Beyond PR #40733 (already in esp-matter v1.5 via cf84d0360c), is
    there a candidate patch or branch addressing the MTU-exchange /
    CHIPoBLE central commissioning path against Aqara/IKEA-style
    peripherals? Should we test ESP-IDF v6.0.1 with a separate
    esp-matter, or is that officially unsupported?

  4. The LoadProhibited crash after the BLE teardown is a separate bug
    (the MTU error itself is recoverable in principle). Capturing a
    useful backtrace via idf.py monitor is tricky because the same
    serial line is held by our application; we can re-run with a clean
    monitor session if that would help.

Happy to ship a full sdkconfig, a git diff of our two-line
ChipDeviceScanner patch, the full filtered serial log and a memory
dump on request.


Appendix: filtered serial transcripts

A. Aqara on esp-matter v1.5 (full pair attempt + crash, OPENTHREAD RIO noise removed)

I [floor4] commission node=3679816925 code=03679816925 (BLE-Thread, dataset=111 B)
I chip[CTL]: Stopping commissionable node discovery over DNS-SD
I chip[CTL]: Starting commissionable node discovery over BLE
I chip[CTL]: Skipping commissionable node discovery over Wi-Fi PAF since not supported by the controller!
I chip[CTL]: Skipping commissionable node discovery over NFC since not supported by the controller!
I chip[CTL]: Starting commissionable node discovery over DNS-SD
I NimBLE: GAP procedure initiated: discovery;
I NimBLE: own_addr_type=1 filter_policy=0 passive=0 limited=0 filter_duplicates=1
I NimBLE: duration=60000ms
I chip[BLE]: Device Discriminator match. Attempting to connect
I NimBLE: GAP procedure initiated: connect;
I NimBLE: peer_addr_type=1 peer_addr=d4:61:fc:03:c2:86
I NimBLE: scan_itvl=16 scan_window=16 itvl_min=24 itvl_max=40 latency=0 supervision_timeout=256 min_ce_len=0 max_ce_len=0 own_addr_type=1
I NimBLE: GATT procedure initiated: exchange mtu
E chip[DL]: Disabling CHIPoBLE service due to error: ac
I NimBLE: GAP procedure initiated: stop advertising.
I NimBLE: GAP procedure initiated: terminate connection; conn_handle=1 hci_reason=19
Guru Meditation Error: Core 0 panic'ed (LoadProhibited). Exception was unhandled.

B. IKEA on esp-matter v1.4.2 (identical signature)

I [floor4] commission node=26320938479 code=26320938479 (BLE-Thread, dataset=111 B)
I chip[CTL]: Starting commissionable node discovery over BLE
I NimBLE: GAP procedure initiated: discovery;
I NimBLE: own_addr_type=1 filter_policy=0 passive=0 limited=0 filter_duplicates=1
I NimBLE: duration=60000ms
I chip[BLE]: Device Discriminator match. Attempting to connect
I NimBLE: GAP procedure initiated: connect;
I NimBLE: peer_addr_type=1 peer_addr=dc:d9:ce:c5:a8:7f
I NimBLE: scan_itvl=16 scan_window=16 itvl_min=24 itvl_max=40 latency=0 supervision_timeout=256
I NimBLE: GATT procedure initiated: exchange mtu
E chip[DL]: Disabling CHIPoBLE service due to error: ac
I NimBLE: GAP procedure initiated: stop advertising.
I NimBLE: GAP procedure initiated: terminate connection; conn_handle=1 hci_reason=19
Guru Meditation Error: Core 0 panic'ed (LoadProhibited). Exception was unhandled.

C. chip-tool (Linux BlueZ) on Aqara — same pair code, success

[BLE] BLE removing known devices
[BLE] BLE initiating scan
[BLE] ChipDeviceScanner has started scanning!
[BLE] Device 1C:18:16:4C:8E:E8 does not look like a CHIP device.
[BLE] Device 4E:A1:C5:D5:98:58 does not look like a CHIP device.
[BLE] New device scanned: D0:03:73:6A:93:23
[BLE] Device discriminator match. Attempting to connect.
[BLE] ChipDeviceScanner has stopped scanning!
[DL] ConnectDevice complete
[BLE] New device connected: D0:03:73:6A:93:23
[CTL] Discovered device to be commissioned over BLE
[CTL] Attempting PASE connection to BLE
... (PASE Pake1/2/3 success) ...
[CTL] PASE session established with commissionee. Stopping discovery.
[TOO] PASE establishment successful
[CTL] Commissioning stage next step: 'SecurePairing' -> 'ReadCommissioningInfo'
... (AttestationVerification with --bypass, Continue, CASE Sigma1/2/3) ...
[CTL] Received CommissioningComplete response, errorCode=0
[CTL] Commissioning complete for node ID 0x0000000000001389: success

D. Stock examples/controller on v1.5 (no application code at all)

Same error: ac after GATT MTU exchange. Identical signature. We omit
the boot prelude for brevity; the relevant subset is identical to (A).


Thanks for any pointers!
