Pull requests: pytorch/FBGEMM
Pull requests list
Build time optimize (part2)
cla signed
fb-exported
meta-exported
#5000
opened Oct 13, 2025 by
gchalump
Add Paged Attention to FMHA Cutlass Blackwell Forward kernel for fixed length
cla signed
fb-exported
meta-exported
#4999
opened Oct 13, 2025 by
sarithad-meta
add monitoring metrics for dram cache perf -- metadata read & write
cla signed
fb-exported
meta-exported
#4996
opened Oct 10, 2025 by
kathyxuyy
FP8 Convolution Kernel
cla signed
fb-exported
meta-exported
#4994
opened Oct 10, 2025 by
jwfromm
support loading libraries when installing multiple FBGEMM targets
cla signed
fb-exported
meta-exported
#4993
opened Oct 10, 2025 by
q10
Try adding device sync before attention. (#2008)
cla signed
fb-exported
meta-exported
#4992
opened Oct 9, 2025 by
jwfromm
Adding python api to support sync trigger evict
cla signed
fb-exported
meta-exported
#4984
opened Oct 7, 2025 by
EddyLXJ
Add double type to be supported by permute_1D_sparse_data
cla signed
fb-exported
meta-exported
#4968
opened Oct 2, 2025 by
Shuchangd
Build time optimization
cla signed
fb-exported
meta-exported
#4954
opened Sep 30, 2025 by
gchalump
Switch back to NVIDIA/cutlass, and upgrade to v4.2.1
cla signed
#4949
opened Sep 30, 2025 by
jasl
Gate invalid triton autotune configs in AOTInductor for GFX95+
cla signed
fb-exported
meta-exported
#4940
opened Sep 26, 2025 by
JChunX
Back out "Update to use Python 3.9 syntax"
cla signed
fb-exported
meta-exported
#4928
opened Sep 24, 2025 by
q10
forward performance tuning for MI350
cla signed
module: rocm
#4925
opened Sep 24, 2025 by
liligwu
more hipify v2 fixes (#4854)
cla signed
fb-exported
meta-exported
module: rocm
#4921
opened Sep 23, 2025 by
q10
Support bf16 in blackwell cutlass decode attention kernel
cla signed
fb-exported
meta-exported
#4916
opened Sep 23, 2025 by
Aya-ZIbra
Resolve wgrad grouped gemm relocation issue in fbgemm
cla signed
fb-exported
meta-exported
#4915
opened Sep 23, 2025 by
jiawenliu64
Clean torch.check
cla signed
fb-exported
meta-exported
#4871
opened Sep 12, 2025 by
flaviotruzzi
dequantize_fp8_cache_kernel: Move D=128 device-side-assertion check to host
cla signed
fb-exported
meta-exported
#4869
opened Sep 12, 2025 by
ColinPeppler
symmetric quantization to FBGEMM prefill token-wise FP8 (fixed)
cla signed
fb-exported
meta-exported
#4868
opened Sep 12, 2025 by
ColinPeppler
Reland D75563906
ci-no-td
cla signed
fb-exported
meta-exported
#4865
opened Sep 11, 2025 by
flaviotruzzi
Migrate GenAI quantize kernels to FBGEMM_LAUNCH_KERNEL, pt 4
cla signed
fb-exported
#4863
opened Sep 11, 2025 by
q10
Add cutlass decode kernel to TritonBench
cla signed
fb-exported
meta-exported
#4853
opened Sep 10, 2025 by
Aya-ZIbra