ADD Triron doc #534
Open
chang-wenbin wants to merge 6 commits into PaddlePaddle:master from chang-wenbin:Add_Paddle_triton
Commits:
7d63c38 uodate Triron doc (chang-wenbin)
7bc0654 uodate Triron doc (chang-wenbin)
f53764a uodate Triron doc (chang-wenbin)
2061d75 update Triron doc (chang-wenbin)
a7580a5 update Triron doc (chang-wenbin)
8f661d6 update Triron doc (chang-wenbin)
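@@ -0,0 +1,81 @@

# Using Triton Custom Operators with Paddle Inference

## 1. Background

* Triton supports custom operators: users can develop their own operators, register them in Triton, and call them at inference time.
* Triton provides only a Python API and uses runtime JIT compilation: a kernel is actually compiled only when the execution flow of the user's code reaches it.
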
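The compile-on-first-use behaviour described above can be illustrated with a toy sketch in plain Python. This is only an analogy, not Triton's implementation: `lazy_jit`, `add_kernel`, and `compile_count` are hypothetical names introduced here, and the "compilation" is just a function call that is deferred until the first launch.

```python
import functools


def lazy_jit(build):
    """Toy stand-in for a JIT decorator: `build` runs once, on the first call."""
    cache = {}

    @functools.wraps(build)
    def launcher(*args):
        if "kernel" not in cache:
            # The "compilation" happens here, not when the decorator is applied.
            launcher.compile_count += 1
            cache["kernel"] = build()
        return cache["kernel"](*args)

    launcher.compile_count = 0
    return launcher


@lazy_jit
def add_kernel():
    # Pretend this is an expensive compilation step that returns a callable.
    return lambda a, b: a + b
```

Decorating `add_kernel` compiles nothing; `add_kernel.compile_count` only becomes 1 after the first call and stays at 1 on later calls. In practice this means the first invocation of a JIT-compiled kernel can be noticeably slower than subsequent ones.
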
## 2. How to Use Triton Custom Operators with Paddle Inference

Paddle Inference provides several fused operators of the `Norm` and `data copy` types. They are integrated in the paddlemix library, and users can call them as needed.
To use the `PaddleMix Triton` custom operators with Paddle Inference:
* Step 1. Download the `PaddleMIX` repository.
```bash
git clone https://github.com/PaddlePaddle/PaddleMIX.git
```
* Step 2. Install Triton and adapt it to Paddle.
```bash
python -m pip install triton
python -m pip install git+https://github.com/zhoutianzi666/UseTritonInPaddle.git
python -c "import use_triton_in_paddle; use_triton_in_paddle.make_triton_compatible_with_paddle()"
```
* Step 3. `import paddlemix` in the Python file that uses the Triton operators.
* Step 4. Call the corresponding Triton operator API in PaddleMix to accelerate the operator.

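After the steps above, a quick way to confirm the environment is ready is to ask Python's import system whether each package can be found. The sketch below is a minimal check, not part of PaddleMIX; `missing_modules` and `REQUIRED` are hypothetical names, and the module names are the ones used in the steps above.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that the import system cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Top-level modules referenced in the setup steps.
REQUIRED = ["paddle", "triton", "use_triton_in_paddle", "paddlemix"]
```

Calling `missing_modules(REQUIRED)` returns an empty list when everything is importable. Note that `paddlemix` is only found if the cloned PaddleMIX directory is on the module search path, for example through `PYTHONPATH`.
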
## 3. Example: Using `PaddleMixTriton` Custom Operators with Paddle Inference
Code after optimization with the Triton operator:
```py
# Import PaddleMIX (sys.path entries must be strings, not lists)
import sys
sys.path.append("/path/to/PaddleMIX")
import paddle
import paddlemix

# Prepare the parameters (fragment from inside a layer's forward method)
emb = self.linear(self.silu(conditioning_embedding).cast(x.dtype))
scale_msa, shift_msa = paddle.chunk(emb, 2, axis=1)

# Triton API: adaptive_layer_norm
x = paddlemix.triton_ops.adaptive_layer_norm(
    x, scale_msa, shift_msa, self.norm.weight, self.norm.bias, epsilon=1e-06)
```

Code before optimization:
```py
import paddle
from paddle import nn

# Prepare the parameters (fragment from inside a layer)
emb = self.linear(self.silu(conditioning_embedding).cast(x.dtype))
scale_msa, shift_msa = paddle.chunk(emb, 2, axis=1)
norm_elementwise_affine_kwargs = dict(weight_attr=False, bias_attr=False)

# Original, less efficient implementation: LayerNorm followed by separate elementwise ops
self.norm = nn.LayerNorm(embedding_dim, epsilon=1e-6, **norm_elementwise_affine_kwargs)
x = self.norm(x) * (1 + scale_msa[:, None]) + shift_msa[:, None]
```

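To make the arithmetic of the unoptimized expression explicit, here is a pure-Python reference of `norm(x) * (1 + scale[:, None]) + shift[:, None]` for the case without learnable weight and bias. It is a sketch for checking results on small inputs, not the Triton kernel; `layer_norm` and `adaptive_layer_norm_ref` are hypothetical helper names, and the shapes (`x` as `[batch][seq][dim]`, `scale` and `shift` as `[batch][dim]`) are assumptions inferred from the indexing in the snippet.

```python
import math


def layer_norm(row, epsilon=1e-6):
    """Normalize one feature vector to zero mean and unit variance (no affine)."""
    mean = sum(row) / len(row)
    var = sum((v - mean) ** 2 for v in row) / len(row)
    inv_std = 1.0 / math.sqrt(var + epsilon)
    return [(v - mean) * inv_std for v in row]


def adaptive_layer_norm_ref(x, scale, shift, epsilon=1e-6):
    """x: [batch][seq][dim]; scale, shift: [batch][dim] (assumed shapes)."""
    out = []
    for b, tokens in enumerate(x):
        # scale[b] and shift[b] are broadcast over every token of sample b.
        out.append([
            [n * (1.0 + s) + t
             for n, s, t in zip(layer_norm(row, epsilon), scale[b], shift[b])]
            for row in tokens
        ])
    return out
```

The unoptimized version runs the normalization, the multiplication, and the addition as separate operators, whereas a fused operator can produce the same values in a single kernel launch.
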
## 4. Notes
* 1. Check which parameters each Triton API requires and the order in which they must be passed.
* 2. Check whether the default values of the Triton API parameters match the arguments you need to pass, for example whether the weight and bias default to None.
The Triton API signatures, with their default values, are:

```py
adaptive_layer_norm(x, scale, shift, weight=None, bias=None, epsilon=1e-05)

fused_adaLN_scale_residual(x, mha_out, gate_msa, scale_mlp, shift_mlp, weight=None, bias=None, epsilon=1e-05)

split_concat(x, y)

triton_split(x, num_or_sections=[-1, -1], axis=1)
```

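One way to guard against both pitfalls is to inspect a function's defaults and to pass optional arguments by keyword. The sketch below uses a hypothetical stand-in with the same signature as `adaptive_layer_norm` listed above; it is not the real kernel, and whether `inspect.signature` works on the actual PaddleMIX functions is an assumption that should be checked.

```python
import inspect


def adaptive_layer_norm(x, scale, shift, weight=None, bias=None, epsilon=1e-05):
    """Hypothetical stand-in with the listed signature; not the real kernel."""
    return {"weight": weight, "bias": bias, "epsilon": epsilon}


def defaults_of(func):
    """Map each parameter that has a default value to that default."""
    return {
        name: param.default
        for name, param in inspect.signature(func).parameters.items()
        if param.default is not inspect.Parameter.empty
    }


# Passing optional arguments by keyword avoids depending on their position.
result = adaptive_layer_norm("x", "scale", "shift", epsilon=1e-06)
```

Here `defaults_of(adaptive_layer_norm)` shows that `epsilon` defaults to `1e-05`, while the example in section 3 passes `epsilon=1e-06`; a mismatch like this is exactly what note 2 asks users to check.
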
## Q&A
* `triton Error [CUDA]: device kernel image is invalid\no exit`
If you hit the error above while using Triton, try the following fix:

```bash
# Replace the ptxas bundled in the Triton package with /your_cuda_path/bin/ptxas
cp /your_cuda_path/bin/ptxas /your_python_path/site-packages/triton/backends/nvidia/bin/
# The ptxas path may differ between Triton versions; to find its exact location in your Triton installation, run:
# find /your_triton_package_path -name ptxas
# This fix works with Triton 2.3.0 and 3.0.0; other Triton versions have not been verified.
```

Review comment: the exact location of ptxas in the user's Triton installation can be found with `find /your_triton_package_path -name ptxas`.

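The `find` command above can also be run from Python, which is convenient when the Triton package path is not known in advance. This is a minimal sketch using only the standard library; `find_ptxas` is a hypothetical helper name, and unlike `find -name ptxas` it returns regular files only.

```python
from pathlib import Path


def find_ptxas(root):
    """Return all files named `ptxas` under `root`, sorted by path."""
    return sorted(p for p in Path(root).rglob("ptxas") if p.is_file())
```

Assuming Triton is installed, the package directory can be obtained with `Path(triton.__file__).parent` after `import triton`, and each returned path is a candidate destination for the `cp` command.
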
export PYTHONPATH=/path/to/PaddleMIX/:$PYTHONPATH