Feature/zwg 3433/watchdog integration #35

Merged: rzr merged 3 commits into SiliconLabs:main from silabs-tuD:feature/ZWG-3433/watchdog_integration on Sep 29, 2025
Conversation

silabs-tuD (Contributor) commented Apr 22, 2025

Implement the watchdog feature for the Z/IP Gateway, which works as follows:

  • Create a “soft_reset_timer” with a 2-minute timeout.
  • During those 2 minutes, if the Z/IP GW receives only TRANSMIT_COMPLETE_FAIL statuses, it tries to resend the previously failed Serial API command to the controller.
  • After the timeout (2 minutes), the Z/IP GW calls ZW_SoftReset() to reset the controller and waits 10 seconds to ensure the reset has completed successfully.
  • The Z/IP GW also sends a soft_reset report back to Z-Ware.
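
The behavior listed above can be pictured as a small state machine. The following is only a sketch under stated assumptions, not the code from this PR: all type and function names are hypothetical, the 2-minute window is assumed to start at the first failed transmission, any successful transmission is assumed to cancel it, and counters stand in for the real resend and ZW_SoftReset() calls.

```c
#include <stdbool.h>
#include <stdint.h>

#define SOFT_RESET_TIMEOUT_S 120u /* 2-minute window from the description */
#define RESET_SETTLE_S 10u        /* wait after the soft reset */

/* Hypothetical watchdog states, for illustration only. */
typedef enum { WD_IDLE, WD_RETRYING, WD_RESETTING } wd_state_t;

typedef struct {
  wd_state_t state;
  uint32_t window_start_s; /* when the first failure was seen */
  uint32_t reset_start_s;  /* when the soft reset was issued */
  unsigned resend_count;   /* stands in for resending the Serial API command */
  unsigned reset_count;    /* stands in for calling ZW_SoftReset() */
} watchdog_t;

static void wd_init(watchdog_t *wd)
{
  wd->state = WD_IDLE;
  wd->window_start_s = 0;
  wd->reset_start_s = 0;
  wd->resend_count = 0;
  wd->reset_count = 0;
}

/* Feed one transmit status into the watchdog. */
static void wd_on_tx_status(watchdog_t *wd, bool ok, uint32_t now_s)
{
  if (wd->state == WD_RESETTING) {
    return; /* ignore traffic while the controller is resetting */
  }
  if (ok) {
    wd->state = WD_IDLE; /* assumption: a success cancels the window */
    return;
  }
  if (wd->state == WD_IDLE) {
    wd->state = WD_RETRYING;
    wd->window_start_s = now_s;
  }
  wd->resend_count++; /* resend the previously failed command */
}

/* Periodic tick: escalate to a soft reset, then wait for it to settle. */
static void wd_tick(watchdog_t *wd, uint32_t now_s)
{
  if (wd->state == WD_RETRYING &&
      now_s - wd->window_start_s >= SOFT_RESET_TIMEOUT_S) {
    wd->state = WD_RESETTING;
    wd->reset_start_s = now_s;
    wd->reset_count++; /* the real code would call ZW_SoftReset() here */
  } else if (wd->state == WD_RESETTING &&
             now_s - wd->reset_start_s >= RESET_SETTLE_S) {
    wd->state = WD_IDLE; /* reset window over; a status report would follow */
  }
}
```

Keeping the time source as a parameter (`now_s`) makes the escalation logic easy to exercise without real timers.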

Fixes: #ZGW-3433

Co-Authored-by:

  1. silabs-hanguyen3 <Ha.Nguyen3@silabs.com>: Original solution.
  2. silabs-tuD <tu.dao@silabs.com>: Improved the NCP soft-reset status check and the report to Z-Ware.

Reviewed-by: silabs-tuD <tu.dao@silabs.com>

silabs-tuD added a commit to silabs-tuD/zipgateway that referenced this pull request Apr 22, 2025
Implement the watchdog feature for the Z/IP Gateway, which works as follows:
- Create a “soft_reset_timer” with a 2-minute timeout.
- During those 2 minutes, if the Z/IP GW receives only TRANSMIT_COMPLETE_FAIL statuses, it tries to resend the previously failed Serial API command to the controller.
- After the timeout (2 minutes), the Z/IP GW calls ZW_SoftReset() to reset the controller and waits 10 seconds to ensure the reset has completed successfully.
- The Z/IP GW also sends a soft_reset report back to Z-Ware.

Fixes: #ZGW-3433
Signed-off-by: silabs-tuD <tu.dao@silabs.com>
Authored-by: hanguyen3_silabs <Ha.Nguyen3@silabs.com>
Reviewed-by: silabs-tuD <tu.dao@silabs.com>
Origin: SiliconLabs#35
silabs-tuD force-pushed the feature/ZWG-3433/watchdog_integration branch from 611d94e to c2b62fa on April 22, 2025 10:38
Comment thread contiki/platform/linux/parse_config.c Outdated
Comment thread src/ZW_classcmd_ex.h Outdated
Comment thread src/ZW_udp_server.c Outdated
Comment thread src/ZW_udp_server.c Outdated
silabs-tuD added a commit to silabs-tuD/zipgateway that referenced this pull request May 8, 2025
* FIX ZGW-3422: zipgateway got stuck in RD_probe_cc_version.c

+ Fixed zipgateway getting stuck when the controller sends VERSION_COMMAND_CLASS_GET during the inclusion phase and never receives VERSION_COMMAND_CLASS_REPORT from the end node
+ Moved the NULL check of pCmd first; added NULL checks for ep and ep->node

Fixes: #ZGW-3422
Signed-off-by: silabs-tuD <tu.dao@silabs.com>
Reviewed-on: SiliconLabs#34

* ZGW-3422: Z/IP GW got stuck after sending VERSION_COMMAND_CLASS_GET without receiving a report from the end node (#1)

* ZGW-3422: Fix: Z/IP GW got stuck without receiving VERSION_COMMAND_CLASS_REPORT from the end node

* Moved the NULL check of pCmd first; added NULL checks for ep and ep->node

* Fix/zgw 3422 getting stuck on command class version (#2)

+ Fixed ctest errors caused by the NULL check of pCmd

* Z/IP Gateway Watchdog

Implement the watchdog feature for the Z/IP Gateway, which works as follows:
- Create a “soft_reset_timer” with a 2-minute timeout.
- During those 2 minutes, if the Z/IP GW receives only TRANSMIT_COMPLETE_FAIL statuses, it tries to resend the previously failed Serial API command to the controller.
- After the timeout (2 minutes), the Z/IP GW calls ZW_SoftReset() to reset the controller and waits 10 seconds to ensure the reset has completed successfully.
- The Z/IP GW also sends a soft_reset report back to Z-Ware.

Fixes: #ZGW-3433
Signed-off-by: silabs-tuD <tu.dao@silabs.com>
Authored-by: hanguyen3_silabs <Ha.Nguyen3@silabs.com>
Reviewed-by: silabs-tuD <tu.dao@silabs.com>
Origin: SiliconLabs#35

* Resolve comments on PR

---------

Signed-off-by: silabs-tuD <tu.dao@silabs.com>
silabs-tuD added a commit to silabs-tuD/zipgateway that referenced this pull request May 28, 2025
Implement the watchdog feature for the Z/IP Gateway, which works as follows:
- Create a “soft_reset_timer” with a 2-minute timeout.
- During those 2 minutes, if the Z/IP GW receives only TRANSMIT_COMPLETE_FAIL statuses, it tries to resend the previously failed Serial API command to the controller.
- After the timeout (2 minutes), the Z/IP GW calls ZW_SoftReset() to reset the controller and waits 10 seconds to ensure the reset has completed successfully.
- The Z/IP GW also sends a soft_reset report back to Z-Ware.

Fixes: #ZGW-3433
Signed-off-by: silabs-tuD <tu.dao@silabs.com>
Authored-by: hanguyen3_silabs <Ha.Nguyen3@silabs.com>
Reviewed-by: silabs-tuD <tu.dao@silabs.com>
Origin: SiliconLabs#35

Resolve comments on PR

Resolve conflict for the second time
silabs-tuD force-pushed the feature/ZWG-3433/watchdog_integration branch from 48e749d to e85c5cd on May 29, 2025 16:17

#define DATA_DIR INSTALL_LOCALSTATEDIR "/lib/" PACKAGE_TARNAME "/"

// Time in seconds to resend the serial API to the controller

Contributor suggested change:
- // Time in seconds to resend the serial API to the controller
+ /// Time in seconds to resend the serial API to the controller

Comment thread src/ZW_classcmd_ex.h Outdated
#define COMMAND_CLASS_NO_OPERATION_LR 0x04


// Soft Reset Status

Contributor suggested change:
- // Soft Reset Status
+ /// Soft Reset Status

For Doxygen.

rzr (Contributor) left a comment:

Should it be merged? Has anyone else tested it?

silabs-tuD (Contributor, Author):

@silabs-hanguyen3, could you please check with the customer again whether we can merge this PR?

// Time in seconds to resend the serial API to the controller
#define MIN_TIME_RESEND_SERIAL 20
#define MAX_TIME_RESEND_SERIAL 80
#define DEFAULT_TIME_RESEND_SERIAL 50

Contributor review comment:

If we want to preserve the original behavior, should it be set to 0?

Hints about the values would help too.
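
To make the question above concrete, here is a minimal sketch of one way such bounds might be applied to a configured value. It is an assumption for illustration, not the parsing code from this PR: the function name is hypothetical, and the policy of falling back to the default for out-of-range input is a guess.

```c
/* Bounds quoted from the review thread above. */
#define MIN_TIME_RESEND_SERIAL 20
#define MAX_TIME_RESEND_SERIAL 80
#define DEFAULT_TIME_RESEND_SERIAL 50

/* Hypothetical helper: return the resend time (in seconds) to use for a
 * configured value. Assumed policy: out-of-range values fall back to the
 * default instead of being clamped or disabling the feature. */
static int resolve_resend_time(int configured_s)
{
  if (configured_s < MIN_TIME_RESEND_SERIAL ||
      configured_s > MAX_TIME_RESEND_SERIAL) {
    return DEFAULT_TIME_RESEND_SERIAL;
  }
  return configured_s;
}
```

Under this assumed policy a configured value of 0 resolves to 50 rather than switching the resend off, which is the case the review comment raises.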

Implement a mechanism that checks whether the wake-up reason reported by the NCP controller is "software reset", so that the Z/IP GW can confirm it has successfully soft-reset the NCP controller via the Serial API.
- Previously, serial communication was checked by reading ZW_VERSION from the NCP controller, which cannot guarantee that the soft_reset command completed successfully.
- The proper way, as described in the "Z-Wave host API specification", is as follows:
  + After the host (Z/IP GW) sends the Serial API soft reset command (FUNC_ID_SERIAL_API_SOFT_RESET, 0x08) to the NCP controller, the host checks whether the received frame is the unsolicited "Z-Wave API Started Command" frame, FUNC_ID_SERIAL_API_STARTED (0x0A).
  + If the NCP controller has been reset successfully, it returns that unsolicited frame with byte 5 set to SOFT_RESET_WAKEUP_REASON (0x07).
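
The check described in the list above could look roughly like this. It is a minimal sketch, not the code from this PR: the function name and `SERIAL_SOF` are hypothetical, the frame is assumed to include the leading SOF byte (0x01), and "byte 5" is assumed to be counted 1-based from SOF, which puts the wake-up reason at index 4.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SERIAL_SOF 0x01                 /* assumed start-of-frame byte */
#define FUNC_ID_SERIAL_API_STARTED 0x0A /* value quoted in the commit message */
#define SOFT_RESET_WAKEUP_REASON 0x07   /* value quoted in the commit message */

/* Return true if the raw frame is a "Z-Wave API Started" frame whose
 * wake-up reason is "software reset".
 * Assumed layout: [0]=SOF, [1]=length, [2]=type, [3]=function ID,
 * [4]=wake-up reason. */
static bool is_soft_reset_started_frame(const uint8_t *frame, size_t len)
{
  if (frame == NULL || len < 5) {
    return false; /* too short to carry a wake-up reason */
  }
  if (frame[0] != SERIAL_SOF) {
    return false;
  }
  if (frame[3] != FUNC_ID_SERIAL_API_STARTED) {
    return false; /* some other Serial API frame */
  }
  return frame[4] == SOFT_RESET_WAKEUP_REASON;
}
```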
Motivation: In some cases, the application layer (such as Z-Ware) needs the NCP controller's soft-reset status report. In particular, a failed soft-reset status MUST be sent back to Z-Ware so that the user can press the hard reset button to reset the whole system.
- The soft-reset status report is sent to the unsolicited destinations (remote IPs and ports), so Z-Ware must always listen on the unsolicited destination.
- Since the Z/IP Packet command class has no command for reporting the health status of the NCP controller, we implemented a custom COMMAND_ZIP_CONTROLLER_STATUS_REPORT (0x04) whose payload is the status (RESET_OK: 0x00; RESET_FAIL: 0xFF).
- This new command therefore requires Z-Ware (and other Z/IP clients) to understand and support our custom frame.
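
As a rough illustration of the custom report described above, the sketch below builds such a payload. The three-byte layout (command class, command, one status byte) and the function name are assumptions made for this example; only the command value 0x04 and the two status values come from the commit message, and the PR's actual frame may carry additional fields.

```c
#include <stddef.h>
#include <stdint.h>

#define COMMAND_CLASS_ZIP 0x23                    /* Z/IP command class */
#define COMMAND_ZIP_CONTROLLER_STATUS_REPORT 0x04 /* custom command from this PR */
#define RESET_OK 0x00
#define RESET_FAIL 0xFF

/* Hypothetical helper: write the report into buf and return the number of
 * bytes written, or 0 if the buffer is missing or too small.
 * Assumed layout: [command class][command][status]. */
static size_t build_controller_status_report(uint8_t *buf, size_t cap,
                                             int reset_succeeded)
{
  if (buf == NULL || cap < 3) {
    return 0;
  }
  buf[0] = COMMAND_CLASS_ZIP;
  buf[1] = COMMAND_ZIP_CONTROLLER_STATUS_REPORT;
  buf[2] = reset_succeeded ? RESET_OK : RESET_FAIL;
  return 3;
}
```

A client that does not know command 0x04 would presumably ignore or reject this frame, which is why the commit message calls out client support as a requirement.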

Fixes: #ZGW-3450
Signed-off-by: silabs-tuD <tu.dao@silabs.com>
silabs-tuD (Contributor, Author):

I simulated a failed soft reset, and the Z/IP GW was able to send the "failed status report" to Z-Ware (in this case controller_app.c, a Z/IP client). Because the old DTLS session between the Z/IP GW and Z-Ware had expired, a new DTLS session was allocated and the report was sent successfully.

[Image: zgw-notify-ncp-controller-failed-reset-status-via-unsolicited-dest]

rzr (Contributor) commented Sep 24, 2025:

Any objection to merging this?

rzr merged commit 1835543 into SiliconLabs:main on Sep 29, 2025
1 check passed
rzr pushed a commit that referenced this pull request Sep 29, 2025
harshithChukka added a commit to harshithChukka/zipgateway that referenced this pull request May 27, 2026
3 participants