Replies: 6 comments 1 reply
-
Run log:
(base) root@lf:~/ktransformers# cat start.sh
(kt) root@lf:~/ktransformers# bash start.sh
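The contents of start.sh were not captured above. Purely as an illustration, a launch script for this kind of setup could look like the sketch below; the model paths are placeholders and the flags are assumptions based on the ktransformers local_chat entry point, not the author's actual script.

```bash
#!/usr/bin/env bash
# Hypothetical launch script -- paths and flag values are placeholders.
set -e

# Restrict the run to the single DCU card (HIP honors the ROCm-style variable).
export HIP_VISIBLE_DEVICES=0

python -m ktransformers.local_chat \
  --model_path /models/DeepSeek-R1 \
  --gguf_path /models/DeepSeek-R1-Q4_K_M \
  --cpu_infer 32
```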
-
Is this a DEBUG build you compiled?
-
Hello, when building kt on the Hygon DCU platform, did you ever hit the hipcc error saying that neither the NVIDIA nor the AMD platform was specified? Looking at the compile command, the flags do include "-D__HIP_PLATFORM_AMD__=1", yet the error claims that macro was not defined. Relevant log:
/opt/dtk/bin/hipcc -I/root/miniconda3/envs/kt/lib/python3.11/site-packages/torch/include -I/root/miniconda3/envs/kt/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/root/miniconda3/envs/kt/lib/python3.11/site-packages/torch/include/TH -I/root/miniconda3/envs/kt/lib/python3.11/site-packages/torch/include/THC -I/root/miniconda3/envs/kt/lib/python3.11/site-packages/torch/include/THH -I/opt/dtk/include -I/root/miniconda3/envs/kt/include/python3.11 -c ktransformers/ktransformers_ext/hip/custom_gguf/dequant.hip -o /tmp/tmpaup26x2z.build-temp/ktransformers/ktransformers_ext/hip/custom_gguf/dequant.o -fPIC -D__HIP_PLATFORM_AMD__=1 -DUSE_ROCM=1 -DHIPBLAS_V2 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -O3 -use_fast_math -Xcompiler -fPIC -DKTRANSFORMERS_USE_CUDA -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1014" -DTORCH_EXTENSION_NAME=KTransformersOps -D_GLIBCXX_USE_CXX11_ABI=1 --offload-arch=gfx906 --offload-arch=gfx926 --offload-arch=gfx928 --offload-arch=gfx936 -fno-gpu-rdc -std=c++17
In file included from ktransformers/ktransformers_ext/hip/custom_gguf/dequant.hip:12:
/opt/dtk/include/hip/hip_runtime.h:66:2: error: ("Must define exactly one of HIP_PLATFORM_AMD or HIP_PLATFORM_NVIDIA");
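One generic way to check what this hipcc actually predefines is to dump its preprocessor macros; this is a diagnostic sketch that assumes the DTK hipcc forwards clang's usual -dM/-E options, not something taken from the thread:

```bash
# Create an empty HIP source file and dump the macros hipcc predefines for it.
# If neither __HIP_PLATFORM_AMD__ nor __HIP_PLATFORM_NVIDIA__ appears in the
# output, hip_runtime.h will fail with the same error as in the log above.
echo '' > /tmp/probe.hip
/opt/dtk/bin/hipcc -dM -E /tmp/probe.hip | grep -i HIP_PLATFORM
```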
-
I haven't run into anything like that, but I built version 0.23post1; is the version you are building newer? After finding the performance unsatisfactory, I did not follow up on this hardware platform any further.
-
In reply to: "Did you build the code from the official repo or from the south-ocean repo? And what exactly are your machine's hardware configuration and large-model performance?"
7975WX, 32 cores
8x DDR5-5200 64GB
1x DCU K100AI
Performance ~2 tps (for comparison, a 4090D can reach 10+)
-
NVIDIA 30-series and later GPUs have Marlin kernel support; ROCm/HIP and DTK have no Marlin yet, so a direct comparison is not possible.
-
AMD Ryzen Threadripper PRO 7975WX 32-Cores
DDR5-5200 64GB*8
Hygon DCU K100AI 64GB @ 400W
It runs, but very slowly; looking forward to optimizations.
GPU utilization stays at 100%.
Here is my installation (tinkering) process.
It is based overall on doc/en/rocm.md.
Do a fresh install of Ubuntu 22.04 Server.
Install cuda-tool-kit 11.7; this step cannot be skipped, and some guides seem to get it wrong.
Install DTK.
Install the vendor's DCU-prebuilt packages for pytorch/torchvision/torchaudio, etc.
Manually go through requirements.txt and prefer the vendor's prebuilt packages where they exist (see the sketch below).
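A minimal sketch of that step, with the wheel filenames as placeholders for whatever the vendor actually ships:

```bash
# Install the vendor-provided DCU wheels first so that the later
# requirements.txt pass does not replace them with upstream CUDA builds.
pip install ./torch-*.whl ./torchvision-*.whl ./torchaudio-*.whl
pip install -r requirements.txt
```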
If the build errors out, it is because google gflags and google glog are missing; try installing them (see the sketch below).
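A sketch of that fix, assuming the stock Ubuntu 22.04 package names for the gflags and glog development headers:

```bash
# Pull in the gflags/glog development packages the build was missing.
sudo apt-get update
sudo apt-get install -y libgflags-dev libgoogle-glog-dev
```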
The build finally completes.
pip show ktransformers
Try running the DeepSeek-R1-Q4 model.
It complains that some packages are missing:
pip install openai pytest
For Hygon DCU, use --optimize_config_path ktransformers/optimize/optimize_rules/rocm/DeepSeek-V3-Chat.yaml (a full launch sketch follows below).
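Put together, the launch command might look like the following sketch; the model paths are placeholders and the remaining flags are assumptions based on the local_chat entry point, not the author's exact invocation:

```bash
python -m ktransformers.local_chat \
  --model_path /models/DeepSeek-R1 \
  --gguf_path /models/DeepSeek-R1-Q4_K_M \
  --optimize_config_path ktransformers/optimize/optimize_rules/rocm/DeepSeek-V3-Chat.yaml \
  --cpu_infer 32
```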
You may hit this bug: https://github.com/kvcache-ai/ktransformers/issues/983; comment out that line.
Then it finally runs.