Pull requests: pytorch/FBGEMM
Pull requests list
update the sorting kernel for bf16 ck fmoe kernel
cla signed
fb-exported
#3817
opened Mar 14, 2025 by
sijiac
[CUTLASS] Roll cutlass version back a bit to hopefully fix compilation errors.
cla signed
#3816
opened Mar 14, 2025 by
jwfromm
Cleanups for the EEG-based TBE benchmark CLI, pt 2
cla signed
fb-exported
#3815
opened Mar 13, 2025 by
q10
Fuse cumsum into stacked FP8 Grouped Gemm
cla signed
fb-exported
#3814
opened Mar 13, 2025 by
jwfromm
Add stacked version of grouped gemm to quantize bench
cla signed
fb-exported
#3813
opened Mar 13, 2025 by
jwfromm
Fuse cumulative sum into FP8xINT4 Grouped Gemm
cla signed
fb-exported
#3812
opened Mar 13, 2025 by
jwfromm
test fp8fp8bf16/bf16fp8bf16_fast_gemv is torch compileable
cla signed
fb-exported
#3809
opened Mar 13, 2025 by
YUNQIUGUO
Reduce bulk init time and fix OOM (#892)
cla signed
fb-exported
#3808
opened Mar 13, 2025 by
peterfu0
Add Preshuffled FP8 x INT4 Grouped Gemm Kernel
cla signed
fb-exported
#3800
opened Mar 11, 2025 by
jwfromm
Fast BF16 Reduction for AllReduce
cla signed
fb-exported
#3793
opened Mar 10, 2025 by
zjing14
Performance Optimization: Improved TileShape Configuration for Large Llama Shapes
cla signed
#3790
opened Mar 10, 2025 by
MatrixAssembler
Enable FP8 Triton dequantized block-wise kernel
cla signed
fb-exported
#3788
opened Mar 10, 2025 by
jiawenliu64
Fix TBE inference per-sample weight
cla signed
fb-exported
#3787
opened Mar 10, 2025 by
sryap
Handle zero inputs in F8I4 GEMM
cla signed
fb-exported
#3777
opened Mar 6, 2025 by
jiawenliu64
Provide helper functions for int4 quantization
cla signed
fb-exported
#3775
opened Mar 6, 2025 by
jwfromm
FBGEMM Add Columnwise Weight Scaling to F8I4 GEMM
cla signed
fb-exported
#3766
opened Mar 5, 2025 by
jwfromm
fix: topology_utils.h compilation issue
cla signed
#3761
opened Mar 4, 2025 by
DevashishLal-CB
Move float conversion functions from Types.h into new FloatConversion.h
cla signed
fb-exported
#3760
opened Mar 4, 2025 by
MatzeB
Invoke CPU schema op for unweighted version TBE on MTIA
cla signed
fb-exported
#3757
opened Mar 3, 2025 by
egienvalue