Pull requests: ROCm/FBGEMM


Pull requests list

fwd optimizations
#125 opened Sep 23, 2025 by shbiswas834
Remove fwd and warmup from benchmark profiling
#124 opened Sep 17, 2025 by huizzhan Draft
1 task
added malloc pitch on merged pool embedding
#123 opened Sep 11, 2025 by kudomcho
added optimized merged pool embedding script
#122 opened Sep 10, 2025 by kudomcho
apply unroll and prefetch optimization
#119 opened Aug 27, 2025 by zhiding512
apply Vec4T on vbe forward
#118 opened Aug 27, 2025 by JaxChen29
1 task
tuned grid size by reducing num_warps_per_threadblock to 4
#117 opened Aug 26, 2025 by kudomcho
1 task
apply Vec4T on vbe forward
#115 opened Aug 21, 2025 by JaxChen29
1 task
warpReduction DPP version
#114 opened Aug 19, 2025 by Bernard-Liu
Meta28 optimization upstream
#113 opened Aug 18, 2025 by shbiswas834
Upstream rocm guards
#111 opened Aug 15, 2025 by shbiswas834
Abokovoi/mi350 bwd opt exp
#108 opened Aug 1, 2025 by kudomcho
added mi350 guards
#106 opened Jul 17, 2025 by shbiswas834
Inf packed bag l
#104 opened Apr 21, 2025 by kudomcho
Inf opt packed bag l
#102 opened Apr 15, 2025 by kudomcho
added bag packing L optimization
#100 opened Mar 19, 2025 by kudomcho
Abokovoi/inf packed bag tuning
#99 opened Mar 18, 2025 by kudomcho
Fix dense backward test
#92 opened Feb 18, 2025 by avbokovoy
added more warps per row on D dimension by 2X
#83 opened Jan 14, 2025 by kudomcho
Add warmup-ms argument to benchmark_requests
#78 opened Dec 17, 2024 by avbokovoy