Linux chip-tool network recovery feature example #38421

dengpan-liu · 2025-04-16T02:05:02Z

This PR adds support for the Network Recovery feature defined in Matter 1.5 to the Linux platform chip-tool example.

Usage:

To discover nodes that are advertising the network recovery message over BLE:
chip-tool networkrecovery discover --timeout 10
If no timeout value is specified, the Network Recovery scan defaults to 30 seconds.
Once the scan completes, the RecoveryIdentifier of each recoverable device is listed in the chip-tool log, for example:

[1744899521.197] [1746:1751] [CTL] Discovery timed out
[1744899521.198] [1746:1751] [TOO] Find recoverable devices:
[1744899521.198] [1746:1751] [TOO] 0x8877665544332211
[1744899521.199] [1746:1747] [BLE] BLE removing known devices
[1744899521.202] [1746:1747] [BLE] BLE initiating scan
[1744899521.217] [1746:1751] [BLE] ChipDeviceScanner has started scanning!
[1744899521.218] [1746:1746] [CTL] Shutting down the commissioner

To recover a node's Wi-Fi network over BLE:
chip-tool networkrecovery recover-wifi node-id recovery-identifier ssid password breadcrumb
For example:
chip-tool networkrecovery recover-wifi 1 9833440827789222417 my-ssid my-password 0
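
The discovery log prints the RecoveryIdentifier in hexadecimal (0x8877665544332211), while the recover-wifi example passes it in decimal (9833440827789222417); these are the same 64-bit value. A minimal sketch of the conversion in standard C++ follows. The helper names are illustrative and are not part of chip-tool, and whether chip-tool also accepts the hexadecimal form directly is not stated in this PR.

```cpp
#include <cstdint>
#include <string>

// Illustrative helper (not a chip-tool API): parse a RecoveryIdentifier as it
// appears in the discovery log, e.g. "0x8877665544332211".
std::uint64_t ParseRecoveryIdentifier(const std::string & text)
{
    // With base 16, std::stoull accepts an optional "0x" prefix.
    return std::stoull(text, nullptr, 16);
}

// Illustrative helper: render the identifier in decimal, the form used in the
// recover-wifi example command.
std::string ToDecimalArgument(std::uint64_t id)
{
    return std::to_string(id);
}
```

Calling `ToDecimalArgument(ParseRecoveryIdentifier("0x8877665544332211"))` yields `"9833440827789222417"`, matching the identifier used in the example command.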
To recover a node's Thread network over BLE:
chip-tool networkrecovery recover-thread node-id recovery-identifier operational-dataset breadcrumb
For example:
chip-tool networkrecovery recover-thread 1 9833440827789222417 hex:0e080000000000010000000300001535060004001fffe0020824228a3f6322559a0708fd9917d14c5f5ea1051062ab56ee4a14977b28eaddda12339592030f4f70656e5468726561642d333239640102329d04107bbc1fd5bd41afd8321398d0e17607f20c0402a0fff8 0
The chip-tool log when network recovery succeeds:

[1744907638.711] [164591:164602] [DMG] },
[1744907638.711] [164591:164602] [DMG] Received Command Response Data, Endpoint=0 Cluster=0x0000_0030 Command=0x0000_0005
[1744907638.711] [164591:164602] [CTL] Received CommissioningComplete response, errorCode=0
[1744907638.711] [164591:164602] [CTL] Network Recovery succeeded.
[1744907638.711] [164591:164602] [TOO] Recovery complete for devicice 0000000000000001

Testing

Manually validated on WSL2 (Ubuntu 22.04) with a sample ESP dongle acting as the recoverable node.

CLAassistant · 2025-04-16T02:05:08Z

All committers have signed the CLA.

github-actions · 2025-04-16T02:26:15Z

PR #38421: Size comparison from fe31731 to a510121

Full report (1 build for stm32)

| platform | target | config | section | `fe31731` | `a510121` | change | % change |
|---|---|---|---|---|---|---|---|
| stm32 | light | STM32WB5MM-DK | FLASH | 461384 | 461464 | 80 | 0.0 |
| | | | RAM | 141488 | 141520 | 32 | 0.0 |

examples/chip-tool/commands/clusters/ModelCommand.cpp

examples/chip-tool/commands/network-recovery/NetworkRecoveryCommand.cpp

bzbarsky-apple · 2025-04-16T05:35:47Z

src/app/OperationalSessionSetup.cpp

+    mState = State::ResolvingAddress;
+    mTransportPayloadCapability = transportPayloadCapability;
+
+    UpdateDeviceData(params.GetPeerAddress(), params.GetMRPConfig());


So just trying to understand this: we are establishing CASE here, right? For a device that is presumably not on-network? So are we doing CASE over some non-IP transport in this case?

We should put some thought into how to make this zero-cost or close to it for nodes that never perform network recovery but do need CASE establishment in general (e.g. for bindings).

bzbarsky-apple · 2025-04-16T05:36:02Z

src/app/OperationalSessionSetup.cpp

@@ -190,6 +191,17 @@ void OperationalSessionSetup::Connect(Callback::Callback<OnDeviceConnected> * on
    Connect(onConnection, onFailure, nullptr, transportPayloadCapability);
 }

+void OperationalSessionSetup::Connect(Callback::Callback<OnDeviceConnected> * onConnection, Callback::Callback<OnDeviceConnectionFailure> * onFailure, 
+                                      RendezvousParameters & params,
+                                      TransportPayloadCapability transportPayloadCapability)


Why does all this stuff have TransportPayloadCapability arguments? I don't expect anything to require TCP for network recovery....

I have no idea; the "TransportPayloadCapability" argument is specified for all OperationalSessionSetup::Connect functions, and I just wanted to keep the format the same.

They're there for the normal case because some things need to happen over TCP. But network recovery does not, in fact, happen over TCP; if you can do TCP you're already on the network, right?

To be clear: running network recovery protocols over IP still makes sense if there is a proxy in the chain somewhere. The proper analogy is that any commissioning transport (anything that's used to set up and conduct a PASE session) should be usable for network recovery; so if we ever define commissioning over TCP, then we would be doing network recovery over TCP.

I guess the real question is: would we ever require TCP for network recovery? If not, we can just ask for a CASE session over UDP for network recovery purposes....

But OK, if network recovery can happen over IP then in theory someone could decide they want to do it over TCP, not UDP....

bzbarsky-apple · 2025-04-16T05:40:56Z

src/app/OperationalSessionSetup.cpp

+    // cancel that attempt before we can update the address.
+    if (mState == State::Connecting)
+    {
+        CancelSessionSetupReattempt();


CancelSessionSetupReattempt() does not in fact take us out of the Connecting state, right? It just cancels a timer...
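
To illustrate the concern, here is a toy model in standard C++. It is not the SDK's OperationalSessionSetup; every name in it is hypothetical, and it only assumes what the comment above describes, namely that cancelling a reattempt disarms a timer. In such a design the state stays at Connecting until some explicit transition moves it.

```cpp
// Toy model only; these names are hypothetical and do not come from the SDK.
enum class State
{
    NeedsAddress,
    ResolvingAddress,
    Connecting
};

class ToySessionSetup
{
public:
    // Enter Connecting and arm a retry timer, as a session attempt would.
    void StartConnecting()
    {
        mState           = State::Connecting;
        mRetryTimerArmed = true;
    }

    // Models a "cancel reattempt" call that only disarms the timer.
    void CancelReattempt() { mRetryTimerArmed = false; }

    // Leaving Connecting needs an explicit state transition.
    void MoveTo(State next) { mState = next; }

    State GetState() const { return mState; }
    bool RetryTimerArmed() const { return mRetryTimerArmed; }

private:
    State mState          = State::NeedsAddress;
    bool mRetryTimerArmed = false;
};
```

After `StartConnecting()` followed by `CancelReattempt()`, `GetState()` still returns `State::Connecting`; only `MoveTo(...)` changes it. Code that assumes the cancel call is enough to permit an address update would need a separate transition.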


bzbarsky-apple · 2025-04-16T05:42:41Z

src/controller/CHIPDeviceController.h

@@ -834,6 +858,20 @@ class DLL_EXPORT DeviceCommissioner : public DeviceController,
                                         /* fireAndForget = */ true);
    }

+    void RegisterNetworkRecoverDelegate(NetworkRecoverDelegate * delegate) {


This API naming makes it sound like multiple delegates can be registered.

This API is used for each platform to register its own NetworkRecoverDelegate.

But there is only one delegate that can be registered at a time in the implementation, as far as I could see.
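
One common way to signal single-slot semantics is a Set-style name, where a second call replaces the first delegate rather than adding to a list. The sketch below is a hypothetical illustration in standard C++; the class and method names are invented for the example and are neither the PR's API nor a naming proposed in this thread.

```cpp
// Hypothetical delegate interface, for illustration only.
class ToyRecoverDelegate
{
public:
    virtual ~ToyRecoverDelegate() = default;
    virtual void OnRecovered()    = 0;
};

// Hypothetical owner that holds exactly one delegate at a time.
class ToyCommissioner
{
public:
    // "Set" implies replacement: the previous delegate, if any, is dropped.
    // Passing nullptr clears the slot.
    void SetRecoverDelegate(ToyRecoverDelegate * delegate) { mDelegate = delegate; }

    // Notify the current delegate, if one is set.
    void NotifyRecovered()
    {
        if (mDelegate != nullptr)
        {
            mDelegate->OnRecovered();
        }
    }

private:
    ToyRecoverDelegate * mDelegate = nullptr;
};

// Test helper that counts notifications.
class CountingDelegate : public ToyRecoverDelegate
{
public:
    int count = 0;
    void OnRecovered() override { ++count; }
};
```

With two `CountingDelegate` instances set one after the other, only the second one is notified, which is the behavior a Register-style name tends to obscure.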

bzbarsky-apple · 2025-04-16T05:44:04Z

src/controller/CHIPDeviceController.h

+    /**
+     * @brief
+     *   Discover all devices advertising as recoverable.
+     *   Should be called on main loop thread.


We don't generally prescribe anything about threads on which things should be called. The rule is that you have to use whatever synchronization you use for the rest of the Matter API (which may be locks, using a single thread, or some other things).

I know you copied this comment from elsewhere in the file, but we should just fix those bits instead of copying them.

bzbarsky-apple · 2025-04-16T05:45:49Z

src/controller/NetworkRecover.h

+{
+public:
+    NetworkRecover(DeviceCommissioner * commissioner) : mCommissioner(commissioner) {}
+    virtual ~NetworkRecover() {


OK, so what happens if a NetworkRecover is destroyed while its async callbacks are waiting to be called? When those get called we get use-after-free, no?

The NetworkRecover instance will never be destroyed because it is constructed when the DeviceController initializes; there should only ever be one NetworkRecover object.

> The NetworkRecover instance will never be destroyed because it is constructed when the DeviceController initializes

Yes, and DeviceControllers can be destroyed....
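
The lifetime hazard can be sketched with a toy event queue in standard C++. This is not how the SDK's callbacks work, and all names here are hypothetical; the SDK may instead prefer cancelling pending callbacks in the destructor. The sketch uses `std::weak_ptr` so that a queued callback becomes a no-op once its owner has been destroyed, instead of touching freed memory.

```cpp
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Toy stand-in for an async callback mechanism (hypothetical).
class ToyEventQueue
{
public:
    void Post(std::function<void()> fn) { mPending.push_back(std::move(fn)); }

    // Run every callback that was queued before this call.
    void RunAll()
    {
        std::vector<std::function<void()>> pending = std::move(mPending);
        mPending.clear();
        for (auto & fn : pending)
        {
            fn();
        }
    }

private:
    std::vector<std::function<void()>> mPending;
};

// Toy object with pending async work (hypothetical, not the PR's NetworkRecover).
// It must be owned by a std::shared_ptr for shared_from_this() to be valid.
class ToyRecover : public std::enable_shared_from_this<ToyRecover>
{
public:
    explicit ToyRecover(int & completions) : mCompletions(completions) {}

    void StartAsync(ToyEventQueue & queue)
    {
        // Capture a weak reference so the callback does not extend the lifetime
        // and can detect that the object is gone.
        std::weak_ptr<ToyRecover> weakSelf = shared_from_this();
        queue.Post([weakSelf]() {
            if (auto self = weakSelf.lock())
            {
                self->OnDone();
            }
        });
    }

private:
    void OnDone() { ++mCompletions; }

    int & mCompletions;
};
```

If the `ToyRecover` is released before `RunAll()` executes, the callback finds the weak reference expired and does nothing; if it is still alive, the completion counter is incremented once.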

github-actions · 2025-04-16T16:54:40Z

PR #38421: Size comparison from fe31731 to b17a037

Full report (3 builds for cc32xx, stm32)

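| platform | target | config | section | `fe31731` | `b17a037` | change | % change |
|---|---|---|---|---|---|---|---|
| cc32xx | air-purifier | CC3235SF_LAUNCHXL | FLASH | 542142 | 542334 | 192 | 0.0 |
| | | | RAM | 205144 | 205144 | 0 | 0.0 |
| | lock | CC3235SF_LAUNCHXL | FLASH | 576170 | 576298 | 128 | 0.0 |
| | | | RAM | 205392 | 205392 | 0 | 0.0 |
| stm32 | light | STM32WB5MM-DK | FLASH | 461384 | 461008 | -376 | -0.1 |
| | | | RAM | 141488 | 141488 | 0 | 0.0 |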

examples/chip-tool/commands/network-recovery/NetworkRecoveryCommand.h

examples/chip-tool/commands/network-recovery/NetworkRecoveryCommand.cpp

examples/chip-tool/commands/network-recovery/Commands.h

src/controller/AutoNetworkRecover.cpp

src/controller/AutoNetworkRecover.h

src/controller/NetworkRecover.cpp

src/controller/NetworkRecover.h

chip-tool example

a510121

github-actions bot added tools examples platform linux controller ble app vscode labels Apr 16, 2025

dengpan-liu changed the title ~~chip-tool example~~ chip-tool newtork recovery example Apr 16, 2025

bzbarsky-apple reviewed Apr 16, 2025


Support network recovery discover timeout

5fb5597

dengpan-liu changed the title ~~chip-tool newtork recovery example~~ Linux chip-tool network recovery feature example Apr 16, 2025

recover accidentally modified codes

b17a037

chulspro reviewed Apr 17, 2025


fix compiling issues

72f1bdb

github-actions bot added the tizen For Tizen platform label Apr 17, 2025

dengpan-liu added 2 commits April 18, 2025 00:37

Clean CASE before commissioning complete and disarm when step failed

8ccdbfd

fixed github actions issue

2e8820e

github-actions bot added the scripts label Apr 18, 2025

dengpan-liu added 2 commits April 18, 2025 21:04

fixed git actions issue

08b5480

fixed Darwin compile issue

7b3ef3e

github-actions bot added darwin esp32 labels Apr 18, 2025

github-project-automation bot moved this to Todo in [Platform] Darwin Apr 18, 2025

github-project-automation bot added this to [Platform] Darwin Apr 18, 2025

add recover-thread commandcli

c648eb8

dengpan-liu added 4 commits April 21, 2025 17:58

update general commissioning xml

7e28da0

restore xml for another PR

30f962e

fixed recoverable scan timeout then block issue

c7320d5

fix Darwin compiling issue

40aa819

dengpan-liu marked this pull request as ready for review April 23, 2025 05:11

dengpan-liu requested review from joonhaengHeo, a team and arkq as code owners April 23, 2025 05:11

pullapprove bot added the review - pending label Apr 23, 2025

Se-JimYao mentioned this pull request Apr 24, 2025

Network Recovery Feature ESP32 Example #38554

Open

woody-apple added this to the No Target Milestone milestone Apr 29, 2025

This was referenced May 15, 2025

[darwin] Network recovery iOS example #38937

Closed

[darwin] Network recovery iOS example #39051

Draft
