How to enable the code on multiple servers? #3

@SmallCoal2001

We have successfully run your code on a single machine. But when we try to run it across two machines, the compute node reports errors. In poll_completion(), the compute node repeatedly prints "number 0 got bad completion with status: 0xc, vendor syndrome: 0x81", and then the memory node prints "RDMA write failed". We traced the call order to "dLSM::DBImpl::BackgroundFlush()->dLSM::DBImpl::CompactMemTable()->dLSM::DBImpl::WriteLevel0Table()->dLSM::FlushJob::BuildTable()->dLSM::TableBuilder_ComputeSide::Finish()->dLSM::RDMA_Manager::poll_completion()".
How can we fix this?
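
For reference, the status value in the log can be decoded by hand. The sketch below is not part of dLSM: `wc_status_name` is a hypothetical helper, and the table is copied from the first entries of libibverbs' `enum ibv_wc_status` so that it compiles without `<infiniband/verbs.h>`. With the real headers available, `ibv_wc_status_str()` does the same job.

```cpp
#include <cstdint>
#include <string>

// Names of work-completion status codes in the order of libibverbs'
// `enum ibv_wc_status` (array index == numeric value). Only the first
// entries are listed; later values fall through to "UNKNOWN".
static const char* const kWcStatusNames[] = {
    "IBV_WC_SUCCESS",            // 0x0
    "IBV_WC_LOC_LEN_ERR",        // 0x1
    "IBV_WC_LOC_QP_OP_ERR",      // 0x2
    "IBV_WC_LOC_EEC_OP_ERR",     // 0x3
    "IBV_WC_LOC_PROT_ERR",       // 0x4
    "IBV_WC_WR_FLUSH_ERR",       // 0x5
    "IBV_WC_MW_BIND_ERR",        // 0x6
    "IBV_WC_BAD_RESP_ERR",       // 0x7
    "IBV_WC_LOC_ACCESS_ERR",     // 0x8
    "IBV_WC_REM_INV_REQ_ERR",    // 0x9
    "IBV_WC_REM_ACCESS_ERR",     // 0xa
    "IBV_WC_REM_OP_ERR",         // 0xb
    "IBV_WC_RETRY_EXC_ERR",      // 0xc
    "IBV_WC_RNR_RETRY_EXC_ERR",  // 0xd
};

// Hypothetical helper: map a numeric completion status to its enum name.
std::string wc_status_name(std::uint32_t status) {
    constexpr std::uint32_t count =
        sizeof(kWcStatusNames) / sizeof(kWcStatusNames[0]);
    return status < count ? kWcStatusNames[status] : "UNKNOWN";
}
```

Calling `wc_status_name(0xc)` yields `IBV_WC_RETRY_EXC_ERR`, meaning the transport retry counter was exceeded because the remote side never acknowledged the request. That usually points at connectivity or queue-pair setup between the two machines (for example the exchanged LID/GID, GID index, or QP number) rather than at the flush path itself, though this is an assumption we have not verified on our setup.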
