Commit 805f29a

[Feature] refactor metax_gpu attention and moe and remove some useless code (PaddlePaddle#3688)
Co-authored-by: yongqiangma <xing.wo@163.com>
1 parent cab7a63 commit 805f29a

5 files changed, +399 -299 lines changed

fastdeploy/config.py

Lines changed: 1 addition & 1 deletion
@@ -894,7 +894,7 @@ def __init__(self, args):
             self.kv_cache_ratio = 1.0
         else:
             self.kv_cache_ratio = 0.75
-        self.enc_dec_block_num = 0 if current_platform.is_iluvatar() else 2
+        self.enc_dec_block_num = 0 if current_platform.is_iluvatar() or current_platform.is_maca() else 2
         self.prealloc_dec_block_slot_num_threshold = 12
         self.cache_dtype = "bfloat16"
         self.model_cfg = None
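
Note on the changed line: in Python, `or` binds more tightly than the conditional expression, so the new default reads as `0 if (is_iluvatar() or is_maca()) else 2`. The sketch below illustrates that selection in isolation. `StubPlatform` and `default_enc_dec_block_num` are hypothetical names introduced only for this example; they are not FastDeploy's `current_platform` or any function in `fastdeploy/config.py`.

```python
from dataclasses import dataclass


@dataclass
class StubPlatform:
    """Hypothetical stand-in for current_platform (assumption, not the real class)."""

    name: str

    def is_iluvatar(self) -> bool:
        return self.name == "iluvatar"

    def is_maca(self) -> bool:
        return self.name == "maca"


def default_enc_dec_block_num(platform: StubPlatform) -> int:
    # Same expression shape as the changed line in the diff:
    # parsed as 0 if (is_iluvatar() or is_maca()) else 2.
    return 0 if platform.is_iluvatar() or platform.is_maca() else 2


if __name__ == "__main__":
    for name in ("iluvatar", "maca", "cuda"):
        print(name, default_enc_dec_block_num(StubPlatform(name)))
```

Under these assumptions, both the Iluvatar and MACA stubs get 0 and any other platform keeps the previous default of 2.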
