Hzfengsy (Member) commented Mar 3, 2025

Fix the compilation error (mlc-ai/mlc-llm#3143) for Qwen2-1.5 models in the tree attention implementation for the Vulkan backend.

cc @spectrometerHBH @vinx13

Hzfengsy (Member, Author) commented Mar 3, 2025

One additional note: this PR provides an immediate fix for the issue, but it does not address the underlying problem, which is that the simplifier can cause integer overflow. Here is a minimal reproducible example:

import tvm

x = tvm.tir.Var("x", "int32")
# Creating an expression that triggers integer overflow during simplification
expr = (tvm.tir.Div(x + 1073741826, 3) - 357913942) * 1536
ana = tvm.arith.Analyzer()
print(ana.simplify(expr))
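
To make the overflow concrete, here is a plain-Python sketch of the arithmetic, independent of TVM. It assumes (this is not confirmed by the reproducer itself) that the simplifier distributes the multiplication by 1536 over the subtraction and folds the constants; the helper names `fits_int32` and `wrap_int32` are hypothetical and exist only for this illustration.

```python
# Plain-Python illustration of why constant folding can overflow int32 here.
# Assumption: the simplifier distributes "* 1536" over the subtraction,
# which requires computing the constant 357913942 * 1536.

INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def fits_int32(value: int) -> bool:
    """Return True if value is representable as a signed 32-bit integer."""
    return INT32_MIN <= value <= INT32_MAX


def wrap_int32(value: int) -> int:
    """Reduce value modulo 2**32 and reinterpret it as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value >= 2**31 else value


offset = 1073741826      # added to x before the division
subtrahend = 357913942   # subtracted after the division
scale = 1536             # final multiplier

# The constants are related: the offset is exactly three times the subtrahend,
# and every constant written in the expression fits in int32.
assert offset == 3 * subtrahend
assert fits_int32(offset) and fits_int32(subtrahend) and fits_int32(scale)

# The folded constant does not fit in int32.
folded = subtrahend * scale
print(folded)              # 549755814912
print(fits_int32(folded))  # False
print(wrap_int32(folded))  # 1024
```

Every constant in the original expression is a valid int32, so the overflow can only come from an intermediate constant produced during rewriting.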

cc @tqchen

MasterJH5574 (Contributor) left a comment


Thanks for the fix!

MasterJH5574 merged commit c286638 into apache:main on Mar 3, 2025
16 checks passed
ShiboXing pushed a commit to ShiboXing/tvm that referenced this pull request Aug 10, 2025