[CUTLASS] Add FP8 gemm kernels #17408

MasterJH5574 · 2024-09-23T20:44:59Z

This PR introduces the sm90a FP8 GEMM kernels from CUTLASS. These kernels help when `M` is small, a case where cuBLAS performance is not well optimized.
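
For context on the small `M` regime, here is a rough back-of-the-envelope sketch of GEMM arithmetic intensity (FLOPs per byte moved). It is not taken from this PR: the function name, the 4096 x 4096 problem size, and the 2-byte output type are illustrative assumptions, and the PR does not state why cuBLAS underperforms here. The sketch only shows that small-`M` GEMMs do little compute per byte of weight traffic, which is the regime this PR targets.

```python
def gemm_arithmetic_intensity(m, n, k, in_bytes=1, out_bytes=2):
    """FLOPs per byte for C[m, n] = A[m, k] @ B[n, k]^T.

    Assumes each operand is read or written exactly once.
    in_bytes=1 models FP8 inputs; out_bytes=2 is an assumed 16-bit output.
    """
    flops = 2 * m * n * k  # one multiply and one add per (m, n, k) triple
    bytes_moved = in_bytes * (m * k + n * k) + out_bytes * m * n
    return flops / bytes_moved


if __name__ == "__main__":
    # Hypothetical N = K = 4096 layer, sweeping M from decode-like to prefill-like sizes.
    for m in (1, 16, 256, 4096):
        print(f"M={m:5d}  FLOPs/byte={gemm_arithmetic_intensity(m, 4096, 4096):.1f}")
```

With these assumed sizes the intensity grows roughly in proportion to `M` while `M` is small, so kernel choices that suit large, compute-bound GEMMs are not necessarily the right ones here.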

tqchen approved these changes Sep 23, 2024

[CUTLASS] Add FP8 gemm kernels

6972e95

MasterJH5574 force-pushed the tvm-dev/2024-09-23-cutlass-fp8-gemm branch from eae6c96 to 6972e95 Compare September 24, 2024 19:25

Hzfengsy merged commit 4e70e4a into apache:main Sep 25, 2024
4 of 5 checks passed

ysh329 mentioned this pull request Oct 16, 2024

[Release] v0.18.0 Release Candidate Notes #17468

Closed

kurisu6912 mentioned this pull request Sep 5, 2025

kurisu add assume attr patch 1 tile-ai/tvm#8

Closed

3 participants
