An example to help understand how EPLB works #12

Open
@JYXL

Based on the example, if we want to reassign the 12 experts of layer 1 to 8 GPUs across 2 nodes, how should the experts be placed so that every GPU is balanced?
The experts are assigned in three steps:
Step 1: inter-node balance: Divide the experts into 4 groups (0-2, 3-5, 6-8, 9-11) and assign the 4 groups to the 2 nodes so that the node loads are balanced. This can be seen as a knapsack-style packing problem and solved with a greedy algorithm.
Step 2: expert balance: Within each node, replicate the hot experts (4 and 5 on one node; 1 and 10 on the other).
Step 3: intra-node balance: Pack the replicated experts onto the individual GPUs within each node so that the GPUs are load-balanced. This can also be seen as a knapsack-style packing problem.
The entire process can be called Hierarchical Load Balancing, as illustrated here:
[Image: illustration of the hierarchical load balancing steps]
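
To make the three steps concrete, here is a minimal Python sketch of the same idea. It is not the EPLB repository's implementation: the function names (`balanced_pack`, `replicate`, `hierarchical_balance`) are hypothetical, the per-expert loads are assumed values chosen for illustration, and the assumption of 16 physical expert slots (2 per GPU) is mine, inferred from 12 experts plus the 4 replicas mentioned in Step 2.

```python
from typing import Dict, List


def balanced_pack(weights: List[float], num_bins: int) -> List[List[int]]:
    """Greedy packing: take items heaviest first and put each one into the
    lightest bin that still has room. Every bin ends up with the same count."""
    assert len(weights) % num_bins == 0
    capacity = len(weights) // num_bins
    bins: List[List[int]] = [[] for _ in range(num_bins)]
    loads = [0.0] * num_bins
    for item in sorted(range(len(weights)), key=lambda i: -weights[i]):
        candidates = [b for b in range(num_bins) if len(bins[b]) < capacity]
        target = min(candidates, key=lambda b: loads[b])
        bins[target].append(item)
        loads[target] += weights[item]
    return bins


def replicate(loads: Dict[int, float], num_slots: int) -> Dict[int, int]:
    """Start with one copy per expert, then repeatedly add a copy of the
    expert whose load per copy is currently the highest."""
    counts = {e: 1 for e in loads}
    for _ in range(num_slots - len(loads)):
        hottest = max(loads, key=lambda e: loads[e] / counts[e])
        counts[hottest] += 1
    return counts


def hierarchical_balance(expert_loads: List[float], num_groups: int,
                         num_nodes: int, num_gpus: int,
                         num_replicas: int) -> List[List[List[int]]]:
    """Return plan[node][gpu] = list of logical expert ids placed on that GPU."""
    group_size = len(expert_loads) // num_groups
    group_loads = [sum(expert_loads[g * group_size:(g + 1) * group_size])
                   for g in range(num_groups)]
    # Step 1: inter-node balance, packing whole groups onto nodes.
    node_groups = balanced_pack(group_loads, num_nodes)
    gpus_per_node = num_gpus // num_nodes
    slots_per_node = num_replicas // num_nodes
    plan = []
    for groups in node_groups:
        experts = [e for g in sorted(groups)
                   for e in range(g * group_size, (g + 1) * group_size)]
        # Step 2: expert balance, replicating the hottest experts in this node.
        counts = replicate({e: expert_loads[e] for e in experts}, slots_per_node)
        # Each copy of an expert is assumed to carry an equal share of its load.
        replicas = [(e, expert_loads[e] / counts[e])
                    for e in experts for _ in range(counts[e])]
        # Step 3: intra-node balance, packing the copies onto this node's GPUs.
        gpu_bins = balanced_pack([w for _, w in replicas], gpus_per_node)
        plan.append([[replicas[i][0] for i in b] for b in gpu_bins])
    return plan


# Assumed per-expert loads for one layer (illustrative values only).
layer_loads = [90, 132, 40, 61, 104, 165, 39, 4, 73, 56, 183, 86]
plan = hierarchical_balance(layer_loads, num_groups=4, num_nodes=2,
                            num_gpus=8, num_replicas=16)
for node_id, node in enumerate(plan):
    print(f"node {node_id}: {node}")
```

With these assumed loads, the sketch places groups 3-5 and 6-8 on one node and groups 0-2 and 9-11 on the other, replicates experts 4 and 5 on the first node and experts 1 and 10 on the second, and puts two physical experts on every GPU. The sketch does not guard against two copies of the same expert landing on one GPU, which a real implementation would need to consider.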
