Commit 85aa477
UCT/IB/RDMACM: Take async block around unprotected librdmacm calls
The cm event handler runs under the cm's async block (held by
ucs_async_handler_dispatch via context_try_block). rdma_destroy_id and
related librdmacm state-mutating calls must take the same block to
serialize with rdma_get_cm_event() running in the handler; otherwise a
concurrent destroy can remove the cm_id's userspace tracking while the
handler is mid-lookup inside librdmacm, producing a NULL deref at
pthread_mutex_lock(&id_priv->mut) inside rdma_get_cm_event().
The ep destructor and the listener destructor already take the block.
Four other call sites did not:

- rdmacm_listener.c listener init error path (rdma_destroy_id on
  rdma_bind_addr / rdma_listen failure)
- rdmacm_listener.c uct_rdmacm_listener_reject (rdma_reject +
  rdma_destroy_id + rdma_ack_cm_event on a connect_request)
- rdmacm_cm_ep.c client init error path (rdma_destroy_id on
  rdma_resolve_addr failure)
- rdmacm_cm_ep.c server init (rdma_ack_cm_event + rdma_migrate_id,
  plus rdma_destroy_id on the error path; the success-path
  server_send_priv_data already takes the same (recursive) block)

Observed as a SIGSEGV inside librdmacm during multi-threaded sockaddr
gtests (test_ucp_sockaddr_destroy_ep_on_err, test_ucp_sockaddr_wireup_fail)
where ep teardown and connect_request rejection interleave with event
delivery on the same channel.
Signed-off-by: NirWolfer <nwolfer@nvidia.com>
1 parent 2e11735 commit 85aa477
2 files changed
Lines changed: 19 additions & 1 deletion
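The serialization pattern the commit message describes can be sketched as follows. This is a minimal model, not the actual patch: it uses a recursive pthread mutex as a stand-in for the cm's async block, and every `model_*` name is hypothetical. It calls neither UCX nor librdmacm. The points it illustrates are that the destroy path and the event handler take the same block, and that the block is recursive, so a path that already holds it can reach a destroy helper that takes it again.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700 /* for PTHREAD_MUTEX_RECURSIVE under strict modes */
#endif
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the cm's async block: a recursive mutex. */
typedef struct {
    pthread_mutex_t lock;
    int             depth; /* nesting depth, touched only while locked */
} model_async_t;

/* Hypothetical stand-in for a cm_id tracked by the event channel. */
typedef struct {
    int tracked; /* 1 while the id is still tracked, 0 after destroy */
} model_cm_id_t;

static int model_async_init(model_async_t *async)
{
    pthread_mutexattr_t attr;
    int ret;

    ret = pthread_mutexattr_init(&attr);
    if (ret != 0) {
        return ret;
    }
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    ret = pthread_mutex_init(&async->lock, &attr);
    pthread_mutexattr_destroy(&attr);
    async->depth = 0;
    return ret;
}

static void model_async_block(model_async_t *async)
{
    pthread_mutex_lock(&async->lock);
    ++async->depth;
}

static void model_async_unblock(model_async_t *async)
{
    --async->depth;
    pthread_mutex_unlock(&async->lock);
}

/* Models a state-mutating call such as destroying an id: it asserts that
 * the caller holds the block, which is the invariant the commit restores. */
static void model_destroy_id(model_async_t *async, model_cm_id_t *id)
{
    assert(async->depth > 0);
    id->tracked = 0;
}

/* Models an init error path: the destroy is wrapped in the block. */
static int model_init_error_path(model_async_t *async, model_cm_id_t *id)
{
    model_async_block(async);
    model_destroy_id(async, id);
    model_async_unblock(async);
    return -1; /* report the failure to the caller */
}

/* Models the event handler: it reads the id only while holding the block,
 * so it sees the id either fully tracked or fully destroyed. */
static int model_event_handler(model_async_t *async, const model_cm_id_t *id)
{
    int seen;

    model_async_block(async);
    seen = id->tracked;
    model_async_unblock(async);
    return seen;
}

/* Models a caller that already holds the block and then reaches a destroy
 * path that takes it again; this works only because the block is recursive. */
static void model_nested_destroy(model_async_t *async, model_cm_id_t *id)
{
    model_async_block(async);
    (void)model_init_error_path(async, id);
    model_async_unblock(async);
}
```

In this model, a thread running `model_event_handler` and a thread running `model_init_error_path` on the same `model_async_t` cannot interleave inside the critical section, which mirrors the serialization the commit adds at the four call sites.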