Commit 082aa9a

Boost performance of MXFP4 quantization with inline PTX (#4694)

Summary:
Pull Request resolved: #4694

X-link: facebookresearch/FBGEMM#1720

- Add inline PTX to boost MXFP4 quantization kernel performance
- Fix MXFP4 scaling factor in grouped GEMM

Differential Revision: D80182398
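For context on what the quantization kernel computes: in the OCP Microscaling (MX) format, MXFP4 groups 32 values under one shared power-of-two scale and stores each value as an FP4 E2M1 element. The sketch below is a plain-Python reference of that per-block step, not the Triton kernel or the inline PTX from this commit. The names `quantize_block` and `quantize` are hypothetical, and ties are broken toward the smaller magnitude for simplicity rather than round-to-nearest-even.

```python
import math

# Magnitudes representable by an FP4 E2M1 element (OCP Microscaling spec).
E2M1_VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
BLOCK_SIZE = 32  # number of elements that share one scale in MXFP4
E2M1_EMAX = 2    # exponent of the largest E2M1 value (6 = 1.5 * 2**2)


def quantize_block(block):
    """Hypothetical reference helper, not the FBGEMM API.

    Returns (shared_exponent, elements), where each element is an E2M1
    value and element * 2**shared_exponent approximates the input.
    """
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return 0, [0.0] * len(block)
    # Shared scale: align the block maximum with the largest E2M1 exponent.
    shared_exp = math.floor(math.log2(amax)) - E2M1_EMAX
    shared_exp = max(-127, min(127, shared_exp))  # E8M0 exponent range
    scale = 2.0 ** shared_exp
    elements = []
    for x in block:
        mag = min(abs(x) / scale, E2M1_VALUES[-1])  # saturate at 6.0
        # Simplification: on a tie, min() keeps the smaller magnitude.
        nearest = min(E2M1_VALUES, key=lambda v: abs(v - mag))
        elements.append(math.copysign(nearest, x))
    return shared_exp, elements


def quantize(values):
    """Split a flat list into blocks of BLOCK_SIZE and quantize each one."""
    return [
        quantize_block(values[i:i + BLOCK_SIZE])
        for i in range(0, len(values), BLOCK_SIZE)
    ]
```

For example, `quantize_block([12.0, -6.0, 2.0])` picks a shared exponent of 1 (scale 2), so the stored elements are 6.0, -3.0 and 1.0. A production kernel does the same arithmetic on packed 4-bit codes, which is where hardware conversion instructions can help.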

1 parent 08a4c45 commit 082aa9a

3 files changed: +161 -184 lines changed

fbgemm_gpu/experimental
- gemm/triton_gemm
  - fp4_quantize.py
- gen_ai
  - bench
    - quantize_ops.py
  - src/quantize/cutlass_extensions/f4f4bf16_grouped
    - f4f4bf16_grouped_common.cuh
