rxe: Fix dma.length computation in wr_set_sge_list #1744
Open
jeholza wants to merge 1 commit into
wr_set_sge_list() summed the SGE lengths with a loop that never advanced sg_list:

    while (num_sge--)
        tot_length += sg_list->length;

so tot_length ended up as num_sge * sg_list[0].length instead of the true sum, and wqe->dma.length / wqe->dma.resid were written with that wrong value. The per-SGE entries themselves were unaffected because they are populated by the preceding memcpy().

The kernel rxe driver requires dma.length == sum(sge[i].length) and enforces it in rxe_mr.c:copy_data(), so a multi-SGE WR posted through the ibv_qp_ex builder API (ibv_wr_set_sge_list) on rxe completes with IB_WC_LOC_PROT_ERR once finish_packet()/copy_data() runs off the end of the SGE list.

The legacy ibv_post_send path (init_send_wqe) is unaffected; it sums the lengths with an indexed for loop.

Fix by computing the total with an indexed loop, matching the style already used in rxe_post_one_recv() and init_send_wqe() in this file.

Fixes: 1a894ca ("Providers/rxe: Implement ibv_create_qp_ex verb")
Signed-off-by: Jared Holzman <jholzman@nvidia.com>
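For reviewers who want to see the arithmetic in isolation, here is a minimal standalone sketch of the two summation patterns described in this PR. It is not the provider code: `struct sge_model`, `total_length_buggy()` and `total_length_indexed()` are hypothetical names made up for illustration, and the struct only mimics the `length` field layout of an SGE.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an SGE entry; only `length` matters here. */
struct sge_model {
    uint64_t addr;
    uint32_t length;
    uint32_t lkey;
};

/* Pre-fix pattern: sg_list is never advanced, so every iteration
 * re-reads entry 0 and the result is num_sge * sg_list[0].length. */
uint32_t total_length_buggy(const struct sge_model *sg_list, size_t num_sge)
{
    uint32_t tot_length = 0;

    while (num_sge--)
        tot_length += sg_list->length;
    return tot_length;
}

/* Indexed pattern: each entry is visited exactly once, giving the
 * true sum of the per-SGE lengths. */
uint32_t total_length_indexed(const struct sge_model *sg_list, size_t num_sge)
{
    uint32_t tot_length = 0;
    size_t i;

    for (i = 0; i < num_sge; i++)
        tot_length += sg_list[i].length;
    return tot_length;
}
```

With three entries of length 100, 200 and 300, the first function returns 300 (3 * 100) while the second returns 600, which is the mismatch between dma.length and the per-SGE lengths described above. With a single SGE, or with entries of equal length, the two results coincide, so the sketch only diverges for multi-SGE lists whose lengths differ.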