
Commit a3346ea

candle-flash-attn: remove duplicate softcap: f32 in run_mha FFI decl
The Rust extern declaration of `run_mha` in `candle-flash-attn/src/ffi.rs` listed `softcap: f32` twice: once between `softmax_scale` and `seqlen_q`, and again at the end of the parameter list (after `window_size_right`). The C definition in `candle-flash-attn/kernels/flash_api.cu` only has a single `softcap` (at the end), so the Rust-side signature declared 38 parameters while the call sites in `src/lib.rs` pass 37:

    error[E0061]: this function takes 38 arguments but 37 arguments were supplied
      --> candle-flash-attn/src/lib.rs:169:13

Drop the spurious copy between `softmax_scale` and `seqlen_q` so the Rust FFI matches the C ABI and compilation succeeds.

Reproduced while building spiced (spiceai/spiceai #10278) with `--features cuda` on CUDA 12.6; the build now completes once this fix is applied.
1 parent 8315b19 commit a3346ea

1 file changed

Lines changed: 0 additions & 1 deletion

candle-flash-attn/src/ffi.rs

@@ -34,7 +34,6 @@ extern "C" {
     d: u32,
     d_rounded: u32,
     softmax_scale: f32,
-    softcap: f32,
 
     seqlen_q: u32,
     seqlen_k: u32,
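
The arity mismatch described in the commit message can be illustrated with a much smaller signature. The sketch below is only an assumption-laden stand-in: `run_mha_sketch` is a hypothetical four-parameter function defined in Rust (so it runs without a CUDA toolchain), not the real `run_mha`, and its body is a dummy computation chosen so that argument positions are observable.

```rust
// Hypothetical, heavily shortened stand-in for the `run_mha` signature.
// It is defined in Rust here (not in a .cu file) so the sketch is self-contained.
extern "C" fn run_mha_sketch(
    softmax_scale: f32,
    // A second `softcap: f32` at this position would raise the declared
    // parameter count to five, mirroring the duplicate removed by this commit.
    seqlen_q: u32,
    seqlen_k: u32,
    softcap: f32, // declared exactly once, at the end
) -> f32 {
    // Dummy body: combines the arguments so a misplaced value would change the result.
    softmax_scale * (seqlen_q + seqlen_k) as f32 + softcap
}

fn main() {
    // The call site passes four arguments, matching the four declared parameters.
    // If the declaration had five parameters, rustc would reject this call
    // with E0061 (wrong number of arguments supplied).
    let out = run_mha_sketch(0.5, 4, 4, 1.0);
    println!("{out}");
}
```

Note that rustc checks call sites only against the Rust-side declaration; it does not compare an `extern "C"` declaration with the C definition. Here the call sites happened to pass the correct count, so the stale declaration surfaced as a compile error rather than as a silent ABI mismatch.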
