prov/efa: Fix asserts for efa-proto #11840
prov/efa/src/rdm/efa_rdm_ope.c
Outdated
 * Allow local iov count to be equal to 0 b/c bounce buffer's pre-registered buff/desc
 * will be passed to rdma-core
 */
assert(ope->iov_count <= efa_rdm_ep_domain(ep)->info->tx_attr->iov_limit);
Suggested change:
- assert(ope->iov_count <= efa_rdm_ep_domain(ep)->info->tx_attr->iov_limit);
+ assert(ope->iov_count <= efa_rdm_ep_domain(ep)->info->tx_attr->iov_limit || (ope->iov_count == 0 && ope->bytes_read_total_len == 0));
The second part of the assert is not needed because if iov_count == 0, the left side will always be true, making the right side never trigger.
I think you were going for:
#if ENABLE_DEBUG
if (ope->iov_count == 0)
	assert(ope->bytes_read_total_len == 0);
#endif
Yeah, you're right. Or you can flip the order so that the iov_count == 0 gets checked first
I wanted to keep the assert that if iov_count is 0, the total length is also 0
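The point about the `||` can be checked in isolation. The sketch below is not code from this PR: `combined_check` and `split_check` are hypothetical helpers whose parameters stand in for `ope->iov_count`, `ope->bytes_read_total_len`, and `tx_attr->iov_limit`. It compares the combined condition from the suggestion with a form that tests the limit and the "empty iov implies zero length" condition separately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helpers for illustration only; the parameters stand in for
 * ope->iov_count, ope->bytes_read_total_len and tx_attr->iov_limit.
 */

/* Combined condition from the suggestion. The right-hand clause cannot change
 * the result: iov_count == 0 already satisfies iov_count <= iov_limit. */
static bool combined_check(size_t iov_count, size_t total_len, size_t iov_limit)
{
	return iov_count <= iov_limit || (iov_count == 0 && total_len == 0);
}

/* Two independent conditions, both of which must hold. */
static bool split_check(size_t iov_count, size_t total_len, size_t iov_limit)
{
	bool within_limit = iov_count <= iov_limit;
	/* "iov_count == 0 implies total_len == 0", written as an implication */
	bool empty_implies_zero_len = iov_count != 0 || total_len == 0;

	return within_limit && empty_implies_zero_len;
}
```

With `iov_count == 0` and a non-zero length, `combined_check` still passes while `split_check` fails. Swapping the operands of the `||` would not change that, because the limit check alone is true whenever `iov_count` is 0.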
prov/efa/src/rdm/efa_rdm_ope.c
Outdated
 * Allow local iov count to be equal to 0 b/c bounce buffer's pre-registered buff/desc
 * will be passed to rdma-core
 */
assert(ope->iov_count <= efa_rdm_ep_domain(ep)->info->tx_attr->iov_limit);
Suggested change:
- assert(ope->iov_count <= efa_rdm_ep_domain(ep)->info->tx_attr->iov_limit);
+ assert(ope->iov_count <= efa_rdm_ep_domain(ep)->info->tx_attr->iov_limit || (ope->iov_count == 0 && ope->bytes_write_total_len == 0));
The second part of the assert is not needed because if iov_count == 0, the left side will always be true, making the right side never trigger.
I think you were going for:
#if ENABLE_DEBUG
if (ope->iov_count == 0)
	assert(ope->bytes_write_total_len == 0);
#endif
If we use ofi_total_iov_len to calculate bytes_write_total_len, iov_count == 0 already implies bytes_write_total_len == 0, so I do not think the additional assert is useful.
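To make the argument concrete: a helper that sums `iov_len` over `iov_count` entries returns 0 when the count is 0, because the loop body never runs. The function below is a hypothetical stand-in written for this illustration, assuming `ofi_total_iov_len` behaves as a plain sum over the entries. It is not the libfabric implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/*
 * Hypothetical stand-in for a helper like ofi_total_iov_len (assumed to be
 * a plain sum of iov_len over the first iov_count entries).
 */
static size_t total_iov_len(const struct iovec *iov, size_t iov_count)
{
	size_t i, len = 0;

	/* With iov_count == 0 the loop is skipped and iov is never dereferenced */
	for (i = 0; i < iov_count; i++)
		len += iov[i].iov_len;

	return len;
}
```

Under that assumption, a total length derived this way cannot be non-zero when `iov_count` is 0, which is why the extra assert would add nothing.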
1d31d6a to e3f6709
prov/efa/src/efa_rma.c
Outdated
msg->context, msg->addr, flags, FI_RMA | FI_WRITE);

/* Prepare SGE list */
assert(msg->iov_count > 0);
It still doesn't address my initial concern: an assert is for checking something the code itself guarantees, but here iov_count is supplied by the application. I think we either need to make efa-direct support an iov_count of 0, or return an error.
Sorry, I didn't see where this initial concern was raised. Will fix.
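A minimal sketch of the "return an error" option, for illustration only. `check_user_iov_count` is a hypothetical helper, not code from efa_rma.c, and plain `EINVAL` is used so the example has no libfabric dependency. A provider would presumably return a libfabric error code such as `-FI_EINVAL` instead.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical validation helper; not the actual efa_rma.c code.
 * iov_count comes from the application, so it is validated at runtime
 * instead of asserted (asserts are compiled out of non-debug builds).
 * Returns 0 if the count is usable, or a negative errno value otherwise.
 */
static int check_user_iov_count(size_t iov_count, size_t iov_limit)
{
	if (iov_count == 0 || iov_count > iov_limit)
		return -EINVAL;

	return 0;
}
```

A caller would run this check before preparing the SGE list and pass the error back to the application, so that an invalid count is rejected in release builds as well.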
Allow user to pass in null buff/null descriptor for efa-protocol for read/write without hitting an assert.

Signed-off-by: Seth Zegelstein <szegel@amazon.com>
e3f6709 to db52e68
@shijin-aws, I simplified this PR to only be for efa-proto, and I will post a follow-up PR for efa-direct.