|
| 1 | +### Name |
| 2 | + |
| 3 | +刘卉杰 |
| 4 | + |
| 5 | +### Internship Project |
| 6 | + |
| 7 | +Improving auto-parallel sharding conversion (reshard) and the expert-parallel mechanism |
| 8 | + |
| 9 | +### This Week's Work |
| 10 | + |
| 11 | + |
| 12 | + |
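| 13 | +1. Extended the _reshard_mesh_shape tests to cover Partial (previously only Replicate was tested). |
| 14 | +2. Extended the _reshard_mesh_shape tests to cover the Shard cases in which _local_value stays unchanged, and improved the _only_reshard_mesh_shape predicate function. |
| 15 | + |
| 16 | +My changes to Paddle and the experimental tests are hosted in my GitHub repository: [Moe/pd-dist at main · smile2game/Moe](https://github.com/smile2game/Moe/tree/main/pd-dist). |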
| 17 | + |
| 18 | + |
| 19 | + |
| 20 | +Details and rationale of the _only_reshard_mesh_shape change for Shard: |
| 21 | + |
| 22 | +- Cases that qualify as _only_reshard_mesh_shape |
| 23 | + |
| 24 | +1. |
| 25 | + |
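| 26 | + 1. On src_mesh, the second cut does not actually split anything: in a 2x1 mesh, cutting along the dimension of size 1 is the same as not cutting at all. dst_mesh is [2]. The check I added to _only_reshard_mesh_shape targets exactly this case. |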
| 27 | + |
| 28 | + |
| 29 | +<br/> |
| 30 | + |
| 31 | +- Cases that do not qualify |
| 32 | + |
| 33 | +1. [[0], [1]] Shard(0), Shard(1) --> [[0,1]] Shard(0), Shard(1) |
| 34 | + |
| 35 | + 1. The row-wise split turns into a column-wise split. |
| 36 | + |
| 37 | + |
| 38 | +<br/> |
| 39 | + |
| 40 | +- Cases not yet covered |
| 41 | + |
| 42 | +1. [[0], [1]] Shard(0), Shard(0) --> [0,1] Shard(0) |
| 43 | + |
| 44 | + 1. Here the mesh cuts dim_0 twice, which currently triggers the warning: not supported. |
| 45 | + |
| 46 | + |
| 47 | +```python |
| 48 | +src_len = len(src_placements) |
| 49 | +dst_len = len(placements) |
| 50 | +# Excerpt from the body of _only_reshard_mesh_shape; mesh is the destination mesh. |
| 51 | +if src_len >= dst_len: |
| 52 | +    # Shared leading mesh dims must match; extra src dims must have size 1. |
| 53 | +    for i in range(dst_len): |
| 54 | +        if src_mesh.shape[i] != mesh.shape[i]: |
| 55 | +            return False |
| 56 | +    for i in range(dst_len, src_len): |
| 57 | +        if src_mesh.shape[i] != 1: |
| 58 | +            return False |
| 59 | +else:  # dst has more dims: the extra dst dims must have size 1 |
| 60 | +    for i in range(src_len): |
| 61 | +        if src_mesh.shape[i] != mesh.shape[i]: |
| 62 | +            return False |
| 63 | +    for i in range(src_len, dst_len): |
| 64 | +        if mesh.shape[i] != 1: |
| 65 | +            return False |
| 66 | +``` |
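
The shape check in the excerpt above can be sketched as a standalone function. This is only an illustration under stated assumptions: it works on plain tuples rather than Paddle's ProcessMesh, it assumes the number of placements equals the number of mesh dimensions, and the name `shapes_match_up_to_trailing_ones` is hypothetical (it is not a Paddle API).

```python
from typing import Sequence


def shapes_match_up_to_trailing_ones(
    src_shape: Sequence[int], dst_shape: Sequence[int]
) -> bool:
    """Hypothetical helper, not part of Paddle.

    Returns True when the two mesh shapes agree on their shared leading
    dimensions and every extra trailing dimension has size 1.
    """
    common = min(len(src_shape), len(dst_shape))
    # The shared leading dimensions must be identical.
    if tuple(src_shape[:common]) != tuple(dst_shape[:common]):
        return False
    # Whichever shape is longer may only add dimensions of size 1,
    # because cutting along a size-1 dimension splits nothing.
    longer = src_shape if len(src_shape) > len(dst_shape) else dst_shape
    return all(size == 1 for size in longer[common:])
```

With this sketch, `shapes_match_up_to_trailing_ones((2, 1), (2,))` is true (the 2x1 to [2] case), while `shapes_match_up_to_trailing_ones((2, 1), (1, 2))` is false (the row-wise to column-wise case). It only compares shapes; it does not look at the placements themselves.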
| 67 | + |
| 68 | + |
| 69 | + |
| 70 | + |
| 71 | + |
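| 72 | +### Open Issues |
| 73 | + |
| 74 | +The completeness of _only_reshard_mesh_shape still needs to be confirmed with my mentor. |
| 75 | + |
| 76 | +### Next Week's Work |
| 77 | + |
| 78 | +1. Continue refining the cases handled by _reshard_mesh_shape. |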
| 79 | + |
| 80 | + |
| 81 | + |
| 82 | +### Mentor's Comments |
| 83 | + |
| 84 | +Approved |
| 85 | + |