-
Notifications
You must be signed in to change notification settings - Fork 8k
Bluetooth: Mesh: Provisioner closes link on failed #97478
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Bluetooth: Mesh: Provisioner closes link on failed #97478
Conversation
The |
According to MshPrt 5.4.4, The Provisioner, upon receiving the Provisioning Failed PDU, shall assume that the provisioning failed and immediately disconnect the provisioning bearer. Signed-off-by: Ludvig Jordet <[email protected]>
Changes the link close upon success to use the `prov_link_close` helper function instead of doing it manually, as minor cleanup. Signed-off-by: Ludvig Jordet <[email protected]>
210402f
to
a93a15a
Compare
send_random(); | ||
} | ||
|
||
static void prov_failed(const uint8_t *data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess after changes, the function just duplicates this: prov_fail
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sort of. I mean, the behavior is identical, but one of them is a PDU handler, the other is a generic function that is called from various locations. I'd argue to keep it like this for clarity (handler function vs. actual behavior) in my opinion.
static void prov_failed(const uint8_t *data) | ||
{ | ||
LOG_WRN("Error: 0x%02x", data[0]); | ||
reset_state(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't get why provisioner does not clear CDB, state flags and doesn't release Diffie-Hellman key pair upon receiving provisioning failed frame.
Seems incorrect to me.
It should be:
prov_fail(PROV_BEARER_LINK_STATUS_FAIL);
reset_state();
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you look in the prov_link_closed
callback, the reset is done there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no guarantee this callback is called. For pb gatt this will never cause this callback at all.
Cleaning data in this callback is used mostly for unpredictable link closing like timeout Host for gatt communication and closing over API by application.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even simple case with pd adv will cause zombie node in CDB and heap leakage over mbedtls once mesh cannot find free adv structure, that might happen at any time and depends on customer configuration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reset_state
will be called after link is closed.
prov_link_close
-> bearer->link_close
:
For PB_ADV:
prov_link_close:pb_adv.c
-> send Link Close PDU -> buf_sent
-> close_link
-> role->link_closed
-> prov_link_closed:provisioner.c
-> reset_state
.
For PB-GATT:
prov_link_close:pb_gatt.c
-> bt_conn_disconnect
-> gatt_disconnected:pb_gatt_srv.c
-> bt_mesh_pb_gatt_close:pb_gatt.c
-> link_closed
-> role->link_closed
-> prov_link_closed:provisioner.c
-> reset_state
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I'm pretty sure this change breaks cleaning provisioner that works above rpr clent. When it receives link report "Closed by device" it doesn't clear cdb and keys anymore
Pretty sure -> can you elaborate because I don't really see this. As I showed above, for PB-Adv, reset_state
is eventually called after sending Link Close PDU. For PB-GATT, Link is a regular BLE link and thus closed through BLE API, once it is closed, reset_state
is called. I don't see how it can not be called. The only difference between old and new behavior is message sending (in case of PB-Adv) and link close (in case of PB-GATT).
Also, PB-GATT is not supported by RPR Client at the moment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you meant error on bt_conn_disconnect
? Then yes, it wasn't actually clear from the comment that this is what you mean. Then yes, we need to fix this, I agree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the other hand, in case of timeout (which will happen here as there is no communication between devices), reset_state
will eventually be called:
zephyr/subsys/bluetooth/mesh/pb_adv.c
Lines 433 to 442 in cb77257
static void protocol_timeout(struct k_work *work) | |
{ | |
if (!atomic_test_bit(link.flags, ADV_LINK_ACTIVE)) { | |
return; | |
} | |
LOG_DBG(""); | |
link.rx.seg = 0U; | |
prov_link_close(PROV_BEARER_LINK_STATUS_TIMEOUT); |
zephyr/subsys/bluetooth/mesh/pb_gatt.c
Lines 51 to 80 in cb77257
static void link_closed(enum prov_bearer_link_status status) | |
{ | |
const struct prov_bearer_cb *cb = link.cb; | |
void *cb_data = link.cb_data; | |
reset_state(); | |
cb->link_closed(&bt_mesh_pb_gatt, cb_data, status); | |
} | |
static void protocol_timeout(struct k_work *work) | |
{ | |
if (!atomic_test_bit(bt_mesh_prov_link.flags, LINK_ACTIVE)) { | |
return; | |
} | |
/* If connection failed or timeout, not allow establish connection */ | |
if (IS_ENABLED(CONFIG_BT_MESH_PB_GATT_CLIENT) && | |
atomic_test_bit(bt_mesh_prov_link.flags, PROVISIONER)) { | |
if (link.conn) { | |
(void)bt_conn_disconnect(link.conn, | |
BT_HCI_ERR_REMOTE_USER_TERM_CONN); | |
} else { | |
(void)bt_mesh_pb_gatt_cli_setup(NULL); | |
} | |
} | |
LOG_DBG("Protocol timeout"); | |
link_closed(PROV_BEARER_LINK_STATUS_TIMEOUT); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I'm pretty sure this change breaks cleaning provisioner that works above rpr clent. When it receives link report "Closed by device" it doesn't clear cdb and keys anymore
Pretty sure -> can you elaborate because I don't really see this. As I showed above, for PB-Adv,
reset_state
is eventually called after sending Link Close PDU. For PB-GATT, Link is a regular BLE link and thus closed through BLE API, once it is closed,reset_state
is called. I don't see how it can not be called. The only difference between old and new behavior is message sending (in case of PB-Adv) and link close (in case of PB-GATT).Also, PB-GATT is not supported by RPR Client at the moment.
You described correctly, but this is applicable only to rpr server. Further rpr server handles pb_link_closed
that will initiate link report with status BT_MESH_RPR_ERR_LINK_CLOSED_BY_DEVICE
(https://github.com/zephyrproject-rtos/zephyr/blob/main/subsys/bluetooth/mesh/rpr_srv.c#L478)
Then rpr client receives it and calls closed callback for its own provisioner: https://github.com/zephyrproject-rtos/zephyr/blob/main/subsys/bluetooth/mesh/rpr_cli.c#L74
that will cause calling the same code but already on rpr client side: https://github.com/zephyrproject-rtos/zephyr/blob/main/subsys/bluetooth/mesh/provisioner.c#L698-L706
you removed reset
functionality here and rely on rpr client bearer callback that is called for some reason again after this PR (it seems weird by self). But follow further:
https://github.com/zephyrproject-rtos/zephyr/blob/main/subsys/bluetooth/mesh/rpr_cli.c#L742-L750
this implementation does not assume any calbacks at all. Neither about success nor fail.
Finally, provisioner based on RPR Client gets zombie node and lost keys.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You described correctly, but this is applicable only to rpr server
Why? provisioner.c runs on RPR Client, not RPR server.
you removed reset functionality here and rely on rpr client bearer callback that is called for some reason again after this PR (it seems weird by self).
This I don't get. This discussion points to reset_state
-> prov_link_close
change in prov_failed
callback which is called when Provisioning Failed PDU is received. I don't understand your sentence really, what does that is called for some reason again after this PR
mean (where is again
)?
In case of RPR, when Provisiong Failed PDU is received, RPR Client will with this new change have this call flow:
prov_link_close:provisioner.c
-> pb_link_close:rpr_cli.c
-> link_close
-> link_timeout
/ handle_link_status
-> bearer.cb->link_closed
which is prov_link_closed:prov.c ->
role->link_closedwhich is
prov_link_closed:provisioner.c->
reset_state`.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though this is not stated in contribution guidelines, it is generally better to start commit head with verb. Approving anyway.
According to MshPrt 5.4.4, The Provisioner, upon receiving the Provisioning Failed PDU, shall assume that the provisioning failed and immediately disconnect the provisioning bearer.
Also changes the link close upon to success to use the prov_link_close helper function instead of doing it manually, as minor cleanup.