Open
Description
We have successfully run your code in a stand-alone (single-machine) setup. However, when we run it across two machines, the compute node fails. In poll_completion(), the compute node repeatedly prints "number 0 got bad completion with status: 0xc, vendor syndrome: 0x81", and the memory node then reports "RDMA write failed". The call chain leading to the error is: dLSM::DBImpl::BackgroundFlush() -> dLSM::DBImpl::CompactMemTable() -> dLSM::DBImpl::WriteLevel0Table() -> dLSM::FlushJob::BuildTable() -> dLSM::TableBuilder_ComputeSide::Finish() -> dLSM::RDMA_Manager::poll_completion().
How can we fix this bug?
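For anyone triaging this, it may help to decode the numeric status first. The sketch below maps a work-completion status code to its symbolic name, assuming the values follow the order of `enum ibv_wc_status` in libibverbs' `verbs.h` as we understand it. The helper `wc_status_name` and the table `kWcStatusNames` are hypothetical names for illustration and are not part of dLSM; in real code, `ibv_wc_status_str()` from libibverbs is the better choice.

```cpp
#include <array>
#include <cstddef>
#include <string_view>

// Names in the order of libibverbs' `enum ibv_wc_status` (assumed from
// verbs.h), so the array index equals the numeric status value.
constexpr std::array<std::string_view, 24> kWcStatusNames = {
    "IBV_WC_SUCCESS",           "IBV_WC_LOC_LEN_ERR",
    "IBV_WC_LOC_QP_OP_ERR",     "IBV_WC_LOC_EEC_OP_ERR",
    "IBV_WC_LOC_PROT_ERR",      "IBV_WC_WR_FLUSH_ERR",
    "IBV_WC_MW_BIND_ERR",       "IBV_WC_BAD_RESP_ERR",
    "IBV_WC_LOC_ACCESS_ERR",    "IBV_WC_REM_INV_REQ_ERR",
    "IBV_WC_REM_ACCESS_ERR",    "IBV_WC_REM_OP_ERR",
    "IBV_WC_RETRY_EXC_ERR",     "IBV_WC_RNR_RETRY_EXC_ERR",
    "IBV_WC_LOC_RDD_VIOL_ERR",  "IBV_WC_REM_INV_RD_REQ_ERR",
    "IBV_WC_REM_ABORT_ERR",     "IBV_WC_INV_EECN_ERR",
    "IBV_WC_INV_EEC_STATE_ERR", "IBV_WC_FATAL_ERR",
    "IBV_WC_RESP_TIMEOUT_ERR",  "IBV_WC_GENERAL_ERR",
    "IBV_WC_TM_ERR",            "IBV_WC_TM_RNDV_INCOMPLETE",
};

// Hypothetical helper (not a dLSM or libibverbs function): returns the
// symbolic name for a numeric completion status, or "UNKNOWN" if the
// value is outside the table.
constexpr std::string_view wc_status_name(unsigned status) {
    return status < kWcStatusNames.size() ? kWcStatusNames[status]
                                          : std::string_view{"UNKNOWN"};
}
```

Under that assumption, `wc_status_name(0xc)` yields `IBV_WC_RETRY_EXC_ERR`, which generally indicates that the sender exhausted its transport retries without receiving an acknowledgement from the remote side. That would be consistent with a connectivity or queue-pair configuration problem between the two machines (for example the GID index, port, or MTU chosen during connection setup), though the log lines above are not enough to confirm the cause.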