[CUBLAS] Set fp32 compute and scale dtypes in fp16 matmul #16892

ibsidorenko · 2024-04-16T15:35:14Z

This commit replaces the fp16 compute dtype and scale dtype with fp32 in cuBLAS matmul.

According to the cuBLAS docs, there are two possible options for the compute/scale dtype when the input/output dtype is fp16:

1) compute dtype is fp16 and scale dtype is fp16
2) compute dtype is fp32 and scale dtype is fp32

By default, apache/tvm uses option 1) and octoml/tvm uses option 2). This commit aligns the two behaviours and sets fp32 as the default.
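For reviewers who want a feel for why the compute dtype matters, here is a minimal NumPy sketch. It is not the cuBLAS code path and not part of this change: the helper names `dot_fp16_accumulate` and `dot_fp32_accumulate` are hypothetical, and the strictly sequential accumulation is an assumption made for illustration (real GPU kernels accumulate in a different order). It only emulates keeping the running sum in fp16 versus fp32 for fp16 inputs and an fp16 output.

```python
import numpy as np


def dot_fp16_accumulate(a, b):
    """Emulates an fp16 compute dtype: products and the running sum stay in fp16."""
    acc = np.float16(0.0)
    for x, y in zip(a, b):
        acc = np.float16(acc + np.float16(x * y))
    return acc


def dot_fp32_accumulate(a, b):
    """Emulates an fp32 compute dtype: accumulate in fp32, round once to fp16 at the end."""
    acc = np.float32(0.0)
    for x, y in zip(a, b):
        acc = np.float32(acc + np.float32(x) * np.float32(y))
    return np.float16(acc)


# Reduction over K = 4096 with all-ones fp16 inputs (illustrative values only).
k = 4096
a = np.ones(k, dtype=np.float16)
b = np.ones(k, dtype=np.float16)

print(dot_fp16_accumulate(a, b))  # 2048.0: the fp16 sum stops growing once 2048 + 1 rounds back to 2048
print(dot_fp32_accumulate(a, b))  # 4096.0: the fp32 sum is exact and 4096 is representable in fp16
```

The gap in this toy case comes purely from the accumulator precision, since inputs and output are fp16 in both variants. How large the difference is in a real matmul depends on the data and on the kernel's accumulation order.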

cc @vinx13 @masahi

[CUBLAS] Set fp32 compute and scale dtypes in fp16 matmul

d845ba6

github-actions bot requested review from masahi and vinx13 April 16, 2024 15:35

vinx13 approved these changes Apr 16, 2024

vinx13 merged commit 08965f0 into apache:main Apr 16, 2024

ibsidorenko deleted the cublas-fp32-compute-dtype branch April 17, 2024 08:05

ysh329 mentioned this pull request Jul 20, 2024

[Release] v0.17.0 Release Candidate Notes #17178

Closed

ibsidorenko commented Apr 16, 2024 • edited by tqchen
