@@ -106,16 +106,16 @@ x = m(x)
106106
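107107 3D tensor parallelism is a more advanced parallelism technique: when scaling to more devices, it can further reduce memory and communication costs compared with 1D and 2D tensor parallelism. It splits a tensor into a cube and partitions its first and last dimensions. For a matrix multiplication $Y=XW$ on $2\times2\times2=8$ processors, we partition the input $X$ and the weight $W$ into $[X_{000}\ X_{001}\ X_{010}\ X_{011}\ X_{100}\ X_{101}\ X_{110}\ X_{111}]$ and $[W_{000}\ W_{001}\ W_{010}\ W_{011}\ W_{100}\ W_{101}\ W_{110}\ W_{111}]$ respectively. Let $a,b,c$ index the three dimensions; each $X_{abc}$ and $W_{cba}$ is stored on node $(a,b,c)$, and the operations performed on each node are shown in the table below.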
108108
109- | Rank $$ a $$ | Rank $$ b $$ | Rank $$ c $$ | $$ X $$ | $$ W $$ | All Gather (+)$$ X_{ac} $$ | All Gather (+)$$ W_{cb} $$ | Reduce-scatter (-)$$ Y $$ |
109+ | Rank $$ a $$ | Rank $$ b $$ | Rank $$ c $$ | $$ X $$ | $$ W $$ | All Gather (+)$$ X_{ac} $$ | All Gather (+)$$ W_{cb} $$ | Reduce-scatter (-)$$ Y_{abc}=X_{ac}W_{cb} $$ |
110110| --- | --- | --- | --- | --- | --- | --- | --- |
111- | 0 | 0 | 0 | $$ X_{000} $$ | $$ W_{000} $$ | $$ X_{00}=X_{000}+ X_{010} $$ | $$ W_{00}=W_{000}+ W_{001} $$ | $$ Y_{000}=X_{00}W_{00}-X_{01}W_{10 } $$ |
112- | 0 | 0 | 1 | $$ X_{001} $$ | $$ W_{100} $$ | $$ X_{01}=X_{001}+ X_{011} $$ | $$ W_{10}=W_{100}+ W_{101} $$ | $$ Y_{001}=X_{01}W_{10}-X_{00}W_{00 } $$ |
113- | 0 | 1 | 0 | $$ X_{010} $$ | $$ W_{010} $$ | $$ X_{00}=X_{000}+ X_{010} $$ | $$ W_{01}=W_{010}+ W_{011} $$ | $$ Y_{010}=X_{00}W_{01}-X_{01}W_{11 } $$ |
114- | 0 | 1 | 1 | $$ X_{011} $$ | $$ W_{110} $$ | $$ X_{01}=X_{000}+ X_{011} $$ | $$ W_{11}=W_{110}+ W_{111} $$ | $$ Y_{011}=X_{01}W_{11}-X_{00}W_{01 } $$ |
115- | 1 | 0 | 0 | $$ X_{100} $$ | $$ W_{001} $$ | $$ X_{10}=X_{100}+ X_{110} $$ | $$ W_{00}=W_{000}+ W_{001} $$ | $$ Y_{100}=X_{10}W_{00}-X_{11}W_{10 } $$ |
116- | 1 | 0 | 1 | $$ X_{101} $$ | $$ W_{101} $$ | $$ X_{11}=X_{101}+ X_{111} $$ | $$ W_{10}=W_{100}+ W_{101} $$ | $$ Y_{100}=X_{11}W_{10}-X_{10}W_{00 } $$ |
117- | 1 | 1 | 0 | $$ X_{110} $$ | $$ W_{011} $$ | $$ X_{10}=X_{100}+ X_{110} $$ | $$ W_{01}=W_{010}+ W_{011} $$ | $$ Y_{110}=X_{10}W_{01}-X_{11}W_{11 } $$ |
118- | 1 | 1 | 1 | $$ X_{111} $$ | $$ W_{111} $$ | $$ X_{11}=X_{101}+ X_{111} $$ | $$ W_{11}=W_{110}+ W_{111} $$ | $$ Y_{111}=X_{11}W_{11}-X_{10}W_{01 } $$ |
111+ | 0 | 0 | 0 | $$ X_{000} $$ | $$ W_{000} $$ | $$ X_{00}=[ X_{000}, X_{010}] $$ | $$ W_{00}=[ W_{000}, W_{001}] $$ | $$ Y_{000}=X_{00}W_{00} $$ |
112+ | 0 | 0 | 1 | $$ X_{001} $$ | $$ W_{100} $$ | $$ X_{01}=[ X_{001}, X_{011}] $$ | $$ W_{10}=[ W_{100}, W_{101}] $$ | $$ Y_{001}=X_{01}W_{10} $$ |
113+ | 0 | 1 | 0 | $$ X_{010} $$ | $$ W_{010} $$ | $$ X_{00}=[ X_{000}, X_{010}] $$ | $$ W_{01}=[ W_{010}, W_{011}] $$ | $$ Y_{010}=X_{00}W_{01} $$ |
114+ | 0 | 1 | 1 | $$ X_{011} $$ | $$ W_{110} $$ | $$ X_{01}=[ X_{001}, X_{011}] $$ | $$ W_{11}=[ W_{110}, W_{111}] $$ | $$ Y_{011}=X_{01}W_{11} $$ |
115+ | 1 | 0 | 0 | $$ X_{100} $$ | $$ W_{001} $$ | $$ X_{10}=[ X_{100}, X_{110}] $$ | $$ W_{00}=[ W_{000}, W_{001}] $$ | $$ Y_{100}=X_{10}W_{00} $$ |
116+ | 1 | 0 | 1 | $$ X_{101} $$ | $$ W_{101} $$ | $$ X_{11}=[ X_{101}, X_{111}] $$ | $$ W_{10}=[ W_{100}, W_{101}] $$ | $$ Y_{101}=X_{11}W_{10} $$ |
117+ | 1 | 1 | 0 | $$ X_{110} $$ | $$ W_{011} $$ | $$ X_{10}=[ X_{100}, X_{110}] $$ | $$ W_{01}=[ W_{010}, W_{011}] $$ | $$ Y_{110}=X_{10}W_{01} $$ |
118+ | 1 | 1 | 1 | $$ X_{111} $$ | $$ W_{111} $$ | $$ X_{11}=[ X_{101}, X_{111}] $$ | $$ W_{11}=[ W_{110}, W_{111}] $$ | $$ Y_{111}=X_{11}W_{11} $$ |
119119
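The per-node steps in the table can be checked with a small single-process simulation. The NumPy sketch below is only an illustration under stated assumptions: the helper names (`block`, `simulate_3d_matmul`, `assemble`), the matrix shapes, and the choice to slice each block by rows are hypothetical and not taken from any library. It stores $X_{abc}$ and $W_{cba}$ on node $(a,b,c)$, all-gathers them into $X_{ac}$ and $W_{cb}$, multiplies locally, and then sums the partial products over $c$ before scattering the result.

```python
import itertools
import numpy as np


def block(mat, i, j, q):
    """Return block (i, j) of a q x q block partition of mat."""
    row_band = np.array_split(mat, q, axis=0)[i]
    return np.array_split(row_band, q, axis=1)[j]


def simulate_3d_matmul(X, W, q=2):
    """Simulate Y = XW on a q x q x q cube of nodes inside one process."""
    nodes = list(itertools.product(range(q), repeat=3))

    # Node (a, b, c) stores X_abc and W_cba.
    # Assumption: X_ac is sliced by rows over b, W_cb is sliced by rows over a.
    x_shard = {(a, b, c): np.array_split(block(X, a, c, q), q, axis=0)[b]
               for a, b, c in nodes}
    w_shard = {(a, b, c): np.array_split(block(W, c, b, q), q, axis=0)[a]
               for a, b, c in nodes}

    # All-gather X over b and W over a, then multiply locally.
    partial = {}
    for a, b, c in nodes:
        x_ac = np.concatenate([x_shard[(a, j, c)] for j in range(q)], axis=0)
        w_cb = np.concatenate([w_shard[(i, b, c)] for i in range(q)], axis=0)
        partial[(a, b, c)] = x_ac @ w_cb

    # Reduce-scatter over c: sum the partial products, keep one row slice per node.
    y_shard = {}
    for a, b in itertools.product(range(q), repeat=2):
        y_ab = sum(partial[(a, b, c)] for c in range(q))
        for c, piece in enumerate(np.array_split(y_ab, q, axis=0)):
            y_shard[(a, b, c)] = piece
    return y_shard


def assemble(y_shard, q=2):
    """Rebuild the full Y from the per-node slices (for checking only)."""
    return np.block([
        [np.concatenate([y_shard[(a, b, c)] for c in range(q)], axis=0)
         for b in range(q)]
        for a in range(q)
    ])


rng = np.random.default_rng(0)
X = rng.standard_normal((4, 8))   # input, shape (M, K)
W = rng.standard_normal((8, 4))   # weight, shape (K, N)
Y = assemble(simulate_3d_matmul(X, W))
print(np.allclose(Y, X @ W))
```

Each node only ever holds one slice of $X$, $W$, and $Y$; the full matrices appear here solely to build the shards and to compare the reassembled result against a plain `X @ W`.
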
120120``` python
121121# Parallel settings