
Commit d2ba1bc

Use causal_padding instead of padding

Signed-off-by: Reese Wang <[email protected]>
1 parent e3e785c

File tree

1 file changed: +1 −1 lines changed


praxis/contrib/gpu/scripts_gpu/te_helper.py

Lines changed: 1 addition & 1 deletion

@@ -208,7 +208,7 @@ def update_attn_te_tpl(te_tpl, attn_tpl):
       assert (transformer_layer_tpl.tr_fflayer_tpl.has_bias ==
               transformer_layer_tpl.tr_atten_tpl.use_bias), "TE only allows same bias settings."
       te_transformer_tpl.use_bias = transformer_layer_tpl.tr_fflayer_tpl.has_bias
-      te_transformer_tpl.self_attn_mask_type = 'causal' \
+      te_transformer_tpl.self_attn_mask_type = 'causal_padding' \
          if stacked_transformer_obj.mask_self_attention else 'padding'

      te_transformer_tpl.logical_axes_rules = te_flax.extend_logical_axis_rules(tuple())
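The change above swaps the mask type chosen when self-attention masking is enabled from `'causal'` to `'causal_padding'`, so padded tokens are masked out in addition to future tokens. A minimal sketch of the selection logic (the helper function name is hypothetical; only the string values and the condition come from the diff):

```python
# Hypothetical standalone helper mirroring the template logic in the patch.
# 'causal_padding' masks both future positions and padded positions;
# plain 'padding' masks only padded positions.

def select_self_attn_mask_type(mask_self_attention: bool) -> str:
    """Return the TE self-attention mask type for a stacked transformer.

    mask_self_attention: True when the model uses causal (autoregressive)
    self-attention, as in stacked_transformer_obj.mask_self_attention.
    """
    return 'causal_padding' if mask_self_attention else 'padding'


# Decoder-style (causal) stacks now also respect padding:
print(select_self_attn_mask_type(True))   # causal_padding
# Encoder-style stacks are unchanged:
print(select_self_attn_mask_type(False))  # padding
```

Before this commit the first branch returned `'causal'`, which applied only the causal mask; `'causal_padding'` keeps the causal behavior while also excluding padded tokens from attention.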
