[Runtime][Builtin] Using float32 accumulation in attention kernel #16667

MasterJH5574 · 2024-03-02T21:59:11Z

Prior to this PR, the TIR attention kernels did not cast matmul operands to fp32 before multiplying.
For models like Phi-2, whose Q/K/V values can be large (on the order of a few hundred), the fp16 products exceed the range of fp16, which sometimes leads to NaN in the attention result.

This PR fixes this issue.
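The failure mode can be reproduced outside TVM with a small NumPy sketch. This is only an illustration of the numerics, not the TIR kernel changed by this PR: the function `attention_weights`, the head dimension of 64, and the input values 300 and 290 are all assumptions chosen to mirror the "few hundred" magnitude described above. With fp16 operands, each product (for example 300 × 300 = 90000) exceeds the fp16 maximum of 65504 and becomes `inf`, and the softmax then computes `inf - inf`, which is NaN. Casting the operands to fp32 before multiplying keeps everything finite.

```python
import numpy as np


def attention_weights(q, keys, compute_dtype):
    """Softmax attention weights of one query against several keys.

    Hypothetical helper for illustration only. The operands are cast to
    `compute_dtype` before the multiply, so the products and the
    accumulation both happen in that dtype.
    """
    q = q.astype(compute_dtype)
    keys = keys.astype(compute_dtype)
    scale = compute_dtype(1.0 / np.sqrt(q.shape[0]))
    # Silence the overflow / invalid-value warnings that the fp16 path triggers.
    with np.errstate(over="ignore", invalid="ignore"):
        scores = (keys * q).sum(axis=1, dtype=compute_dtype) * scale
        shifted = scores - scores.max()  # inf - inf -> NaN in the fp16 path
        weights = np.exp(shifted)
        return weights / weights.sum()


head_dim = 64  # assumed head dimension
q = np.full(head_dim, 300.0, dtype=np.float16)
keys = np.stack([
    np.full(head_dim, 300.0, dtype=np.float16),
    np.full(head_dim, 290.0, dtype=np.float16),
])

w16 = attention_weights(q, keys, np.float16)  # multiply and accumulate in fp16
w32 = attention_weights(q, keys, np.float32)  # cast operands to fp32 first

print("fp16:", w16)
print("fp32:", w32)
```

In this sketch the fp16 path yields NaN for every weight, while the fp32 path yields finite weights that sum to 1.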

MasterJH5574 · 2024-03-03T15:19:57Z

@tvm-bot rerun

yongwww approved these changes Mar 4, 2024

yongwww merged commit ad1da4e into apache:main Mar 4, 2024

ysh329 mentioned this pull request Apr 21, 2024

[Release] v0.16.0 Release Candidate Notes #16911

Closed
