From 6de71ded06a7967ba1f0176c31e6e822037bc13b Mon Sep 17 00:00:00 2001
From: zeroRains
Date: Sat, 27 Jul 2024 13:46:38 +0800
Subject: [PATCH 1/7] add weekly report

---
 .../[WeeklyReports]2024.05.11~2024.05.24.md | 111 ++++++++++++++++++
 1 file changed, 111 insertions(+)
 create mode 100644 WeeklyReports/Hackathon_7th/04_zeroRains/[WeeklyReports]2024.05.11~2024.05.24.md

diff --git a/WeeklyReports/Hackathon_7th/04_zeroRains/[WeeklyReports]2024.05.11~2024.05.24.md b/WeeklyReports/Hackathon_7th/04_zeroRains/[WeeklyReports]2024.05.11~2024.05.24.md
new file mode 100644
index 00000000..66c65e98
--- /dev/null
+++ b/WeeklyReports/Hackathon_7th/04_zeroRains/[WeeklyReports]2024.05.11~2024.05.24.md
@@ -0,0 +1,111 @@
+### Name
+
+卢林军
+
+### Internship Project
+
+Building out the composite-operator mechanism and extending it to all operators
+
+### This Week's Work
+
+The main goal of this project is to add composite-mechanism support to operators that do not yet have it and to improve the mechanism itself. This week's main work was as follows:
+
+1. Add dynamic-shape support to the backward decomposition of the sum and mean ops
+
+
+Related PRs:
+
+- https://github.com/PaddlePaddle/Paddle/pull/64789
+
+2. Fix how the reduce_as operator is computed
+
+Previously, the reduce_as op computed reduce_dim and then passed it as an input to the reduce_sum kernel. When reduce_dim is an empty array, reduce_sum defaults to a reduce_all computation, whereas reduce_as is expected to leave the input untouched when reduce_dim is empty.
+
+Related PRs:
+
+- https://github.com/PaddlePaddle/Paddle/pull/65002
+
+3. Add dynamic-shape support to the backward decomposition of the add, subtract, multiply, and divide ops
+
+Also fixed a bug in which the earlier implementation degraded operator performance, fixed a bug in the backward decomposition of multiply_grad that occurred when one input has a dynamic shape and the other a static shape, and added the corresponding unit tests for the operators in this family.
+
+Related PRs:
+
+- https://github.com/PaddlePaddle/Paddle/pull/65003
+- https://github.com/PaddlePaddle/Paddle/pull/65005
+- https://github.com/PaddlePaddle/Paddle/pull/65007
+- https://github.com/PaddlePaddle/Paddle/pull/65006
+- https://github.com/PaddlePaddle/Paddle/pull/65357
+- https://github.com/PaddlePaddle/Paddle/pull/65643
+
+5. Add dynamic-shape support to concat_grad and add dynamic-shape unit tests for split_grad
+
+Related PRs:
+
+- https://github.com/PaddlePaddle/Paddle/pull/65148
+
+6. Support dynamic shapes in the backward decomposition of relu_grad, and add dynamic-shape unit tests for relu_grad and sigmoid_grad
+
+Related PRs:
+
+- https://github.com/PaddlePaddle/Paddle/pull/65482
+- https://github.com/PaddlePaddle/Paddle/pull/65832
+
+7.
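Improve the get_reduce_dims_from_out function
+
+We found that get_reduce_dims_from_out also happens to work in some dynamic-shape scenarios, but the function was designed to handle static shapes only, so a dynamic-shape check was added to it.
+
+Related PRs:
+
+- https://github.com/PaddlePaddle/Paddle/pull/65666
+
+8. Add dynamic-shape support to elementwise_pow_grad and add dynamic-shape unit tests for pow_grad
+
+Related PRs:
+
+- https://github.com/PaddlePaddle/Paddle/pull/65692
+
+9. Add dynamic-shape support to softmax_grad
+
+Related PRs:
+
+- https://github.com/PaddlePaddle/Paddle/pull/65961
+
+10. Add a forward decomposition for the lerp op, with dynamic-shape support
+
+Related PRs:
+
+- https://github.com/PaddlePaddle/Paddle/pull/65967
+
+11. Add a forward decomposition for the log_loss op, with dynamic-shape support
+
+Related PRs:
+
+- https://github.com/PaddlePaddle/Paddle/pull/65968
+
+12. Add the GetOutputDimsForDynamicShape function for dynamic-shape scenarios
+
+Related PRs:
+
+- https://github.com/PaddlePaddle/Paddle/pull/66172
+
+13. Fix a bug in the stack op under the old-IR composite mechanism
+
+When gradient computation is not enabled for a Tensor, the old IR appears to initialize its gradient not to an empty value but to a Tensor with shape []. As a result, the gradient enters the prim decomposition path of the backward pass, which applies a reshape to that gradient (shape []), and this caused the test to fail. Gradient computation therefore has to be enabled when the variables are initialized.
+
+Related PRs:
+
+- https://github.com/PaddlePaddle/Paddle/pull/66457
+
+14. Add a forward decomposition for the kldiv_loss op, with dynamic-shape support
+
+Related PRs:
+
+- https://github.com/PaddlePaddle/Paddle/pull/66510
+
+### Next Week's Work
+
+1.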
Collect the operators that still need decomposition and add composite-mechanism support for them
+
+### Mentor Comments
+
+

From 1402b21b07740bdb5bd96914b34a75e48c9fe6a8 Mon Sep 17 00:00:00 2001
From: zeroRains
Date: Sat, 27 Jul 2024 13:51:31 +0800
Subject: [PATCH 2/7] modify the file name

---
 ...5.11~2024.05.24.md => [WeeklyReports]2024.07.15~2024.07.28.md} | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 rename WeeklyReports/Hackathon_7th/04_zeroRains/{[WeeklyReports]2024.05.11~2024.05.24.md => [WeeklyReports]2024.07.15~2024.07.28.md} (100%)

diff --git a/WeeklyReports/Hackathon_7th/04_zeroRains/[WeeklyReports]2024.05.11~2024.05.24.md b/WeeklyReports/Hackathon_7th/04_zeroRains/[WeeklyReports]2024.07.15~2024.07.28.md
similarity index 100%
rename from WeeklyReports/Hackathon_7th/04_zeroRains/[WeeklyReports]2024.05.11~2024.05.24.md
rename to WeeklyReports/Hackathon_7th/04_zeroRains/[WeeklyReports]2024.07.15~2024.07.28.md

From a3c3536e66c87b6fe0aa483e009d52ee5f2cad77 Mon Sep 17 00:00:00 2001
From: cyber-pioneer <116002591+cyber-pioneer@users.noreply.github.com>
Date: Mon, 29 Jul 2024 10:34:45 +0800
Subject: [PATCH 3/7] Update WeeklyReports/Hackathon_7th/04_zeroRains/[WeeklyReports]2024.07.15~2024.07.28.md

---
 .../04_zeroRains/[WeeklyReports]2024.07.15~2024.07.28.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/WeeklyReports/Hackathon_7th/04_zeroRains/[WeeklyReports]2024.07.15~2024.07.28.md b/WeeklyReports/Hackathon_7th/04_zeroRains/[WeeklyReports]2024.07.15~2024.07.28.md
index 66c65e98..2d4fe747 100644
--- a/WeeklyReports/Hackathon_7th/04_zeroRains/[WeeklyReports]2024.07.15~2024.07.28.md
+++ b/WeeklyReports/Hackathon_7th/04_zeroRains/[WeeklyReports]2024.07.15~2024.07.28.md
@@ -108,4 +108,4 @@
 1.
Collect the operators that still need decomposition and add composite-mechanism support for them
 
 ### Mentor Comments
-
+Reliable and efficient

From 03c15999f8c80d59ea7b1ca556d2e15a0e66ef16 Mon Sep 17 00:00:00 2001
From: cyber-pioneer <116002591+cyber-pioneer@users.noreply.github.com>
Date: Mon, 12 Aug 2024 21:37:56 +0800
Subject: [PATCH 4/7] Update WeeklyReports/Hackathon_7th/04_zeroRains/[WeeklyReports]2024.07.29~2024.08.11.md

---
 .../04_zeroRains/[WeeklyReports]2024.07.29~2024.08.11.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/WeeklyReports/Hackathon_7th/04_zeroRains/[WeeklyReports]2024.07.29~2024.08.11.md b/WeeklyReports/Hackathon_7th/04_zeroRains/[WeeklyReports]2024.07.29~2024.08.11.md
index 8cab5c80..56334870 100644
--- a/WeeklyReports/Hackathon_7th/04_zeroRains/[WeeklyReports]2024.07.29~2024.08.11.md
+++ b/WeeklyReports/Hackathon_7th/04_zeroRains/[WeeklyReports]2024.07.29~2024.08.11.md
@@ -41,4 +41,4 @@
 ### Mentor Comments
-
+Nice

From 82a34c2c0d6eb9e25a67309fd993ffddc6b26df4 Mon Sep 17 00:00:00 2001
From: zeroRains
Date: Mon, 26 Aug 2024 16:33:34 +0800
Subject: [PATCH 5/7] add weeklyreports

---
 .../[WeeklyReports]2024.07.29~2024.08.11.md | 4 --
 .../[WeeklyReports]2024.08.12~2024.08.25.md | 44 +++++++++++++++++++
 2 files changed, 44 insertions(+), 4 deletions(-)
 create mode 100644 WeeklyReports/Hackathon_7th/04_zeroRains/[WeeklyReports]2024.08.12~2024.08.25.md

diff --git a/WeeklyReports/Hackathon_7th/04_zeroRains/[WeeklyReports]2024.07.29~2024.08.11.md b/WeeklyReports/Hackathon_7th/04_zeroRains/[WeeklyReports]2024.07.29~2024.08.11.md
index cf0876a3..56334870 100644
--- a/WeeklyReports/Hackathon_7th/04_zeroRains/[WeeklyReports]2024.07.29~2024.08.11.md
+++ b/WeeklyReports/Hackathon_7th/04_zeroRains/[WeeklyReports]2024.07.29~2024.08.11.md
@@ -41,8 +41,4 @@
 ### Mentor Comments
-<<<<<<< HEAD
-
-=======
 Nice
->>>>>>> 5f903d04cee7dd6955f0ed5b1f968da18d2c5e6c
diff --git a/WeeklyReports/Hackathon_7th/04_zeroRains/[WeeklyReports]2024.08.12~2024.08.25.md b/WeeklyReports/Hackathon_7th/04_zeroRains/[WeeklyReports]2024.08.12~2024.08.25.md
new file mode 100644
index
00000000..13a85a20
--- /dev/null
+++ b/WeeklyReports/Hackathon_7th/04_zeroRains/[WeeklyReports]2024.08.12~2024.08.25.md
@@ -0,0 +1,44 @@
+### Name
+
+卢林军
+
+### Internship Project
+
+Building out the composite-operator mechanism and extending it to all operators
+
+### This Week's Work
+
+The main goal of this project is to add composite-mechanism support to operators that do not yet have it and to improve the mechanism itself. This week's main work was as follows:
+
+1. Add dynamic-shape support to the expand_grad op
+
+Related PRs:
+
+- https://github.com/PaddlePaddle/Paddle/pull/67481
+
+2. Add dynamic-shape support to the stack_grad op
+
+Related PRs:
+
+- https://github.com/PaddlePaddle/Paddle/pull/67502
+
+3. Add dynamic-shape support to pad_grad
+
+Related PRs:
+
+- https://github.com/PaddlePaddle/Paddle/pull/67606
+
+4. Add dynamic-shape unit tests for scale_grad, square_grad, transpose_grad, and swiglu_grad, and reorder the unit tests alphabetically
+
+Related PRs:
+
+- https://github.com/PaddlePaddle/Paddle/pull/67590
+
+
+### Next Week's Work
+
+1. Adapt the backward passes of batch_norm, prod, and dropout to dynamic shapes
+
+### Mentor Comments
+
+

From 312b4f67536f4c87d57ade3f598348efb613de6b Mon Sep 17 00:00:00 2001
From: zeroRains
Date: Sun, 29 Sep 2024 15:38:15 +0800
Subject: [PATCH 6/7] add weekly report

---
 .../[WeeklyReports]2024.09.09~2024.09.23.md | 57 +++++++++++++++++++
 1 file changed, 57 insertions(+)
 create mode 100644 WeeklyReports/Hackathon_7th/04_zeroRains/[WeeklyReports]2024.09.09~2024.09.23.md

diff --git a/WeeklyReports/Hackathon_7th/04_zeroRains/[WeeklyReports]2024.09.09~2024.09.23.md b/WeeklyReports/Hackathon_7th/04_zeroRains/[WeeklyReports]2024.09.09~2024.09.23.md
new file mode 100644
index 00000000..2cb54b96
--- /dev/null
+++ b/WeeklyReports/Hackathon_7th/04_zeroRains/[WeeklyReports]2024.09.09~2024.09.23.md
@@ -0,0 +1,57 @@
+### Name
+
+卢林军
+
+### Internship Project
+
+Building out the composite-operator mechanism and extending it to all operators
+
+### This Week's Work
+
+The main goal of this project is to add composite-mechanism support to operators that do not yet have it and to improve the mechanism itself. This week's main work was as follows:
+
+1. Add dynamic-shape support to the roll_grad op
+
+Related PRs:
+
+- https://github.com/PaddlePaddle/Paddle/pull/68272
+
+2. Add dynamic-shape support to the one_hot op
+
+Related PRs:
+
+- https://github.com/PaddlePaddle/Paddle/pull/68295
+
+3. Add dynamic-shape support to the batch_norm and batch_norm_ ops
+
+Related PRs:
+
+- https://github.com/PaddlePaddle/Paddle/pull/68353
+
+4. Add dynamic-shape support to the bmm op
+
+Related PRs:
+
+- https://github.com/PaddlePaddle/Paddle/pull/68357
+
+5.
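Add dynamic-shape support to the layer_norm_grad op
+
+Related PRs:
+
+- https://github.com/PaddlePaddle/Paddle/pull/68274
+
+6. Add dynamic-shape support to the group_norm_grad op
+
+Related PRs:
+
+- https://github.com/PaddlePaddle/Paddle/pull/68271
+
+
+### Next Week's Work
+
+1. Fix the bugs in the group_norm_grad and layer_norm_grad PRs
+
+### Mentor Comments
+
+
+

From 21ad337524e6cb4807f0d2ea94d42a9a511a8d9b Mon Sep 17 00:00:00 2001
From: zeroRains
Date: Sun, 18 May 2025 19:31:10 +0800
Subject: [PATCH 7/7] add weekly report

---
 .../[WeeklyReports]2025.4.28~2025.5.16.md | 44 +++++++++++++++++++
 1 file changed, 44 insertions(+)
 create mode 100644 WeeklyReports/Hackathon_8th/07_zeroRains/[WeeklyReports]2025.4.28~2025.5.16.md

diff --git a/WeeklyReports/Hackathon_8th/07_zeroRains/[WeeklyReports]2025.4.28~2025.5.16.md b/WeeklyReports/Hackathon_8th/07_zeroRains/[WeeklyReports]2025.4.28~2025.5.16.md
new file mode 100644
index 00000000..c7aff73d
--- /dev/null
+++ b/WeeklyReports/Hackathon_8th/07_zeroRains/[WeeklyReports]2025.4.28~2025.5.16.md
@@ -0,0 +1,44 @@
+### Name
+
+卢林军
+
+### Internship Project
+
+Special project on improving the usability of large language model inference and serving
+
+### This Week's Work
+
+The main goal of this project is to improve how PaddleNLP's large-model inference service is invoked. This week's main work was as follows:
+
+
+1. Analyze why the Expert computation in the MoE architecture shows a large accuracy gap between the quantized and non-quantized paths
+
+The MoE architecture currently uses a DeGEMM kernel written with Cutlass to handle the dequantized expert computation. The main problem is that it uses the basic DefaultScaleIterators to traverse the quantization scales. Following fpA_intB_GEMM and the approach taken in TRT-LLM, a scale-iterator setting was added so that FineGrainedScaleZeroIterator handles the group_size != -1 case.
+
+Judging from DeepSeekV2 inference results, this largely resolves the earlier problem in which passing the group_size parameter directly left the model unable to emit an end-of-sequence token, but some cases still produce garbled output.
+
+Numerically, the int8-quantized results are fairly close to the non-quantized results, while the int4-quantized results still differ from the non-quantized results.
+
+Related PRs:
+
+- https://github.com/PaddlePaddle/PaddleNLP/pull/10174
+
+2. Study fastsafetensors and make it compatible with Paddle
+
+Learned the basic workflow of fastsafetensors. Paddle Tensor loading is now supported in single mode, and distributed (parallel) loading already works with the cpu+gloo backend. Loading with the gpu+nccl backend still has some problems, which will be analyzed further.
+
+
+Related repository:
+
+- https://github.com/zeroRains/fastsafetensors (paddle branch)
+
+
+### Next Week's Work
+
+1. Fix the gpu+nccl backend loading problem in the Paddle-compatible fastsafetensors
+2. Verify fastsafetensors compatibility with additional cases
+
+### Mentor Comments
+
+
+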