Conversation

Shixiaowei02
Collaborator

No description provided.

@juney-nvidia juney-nvidia merged this pull request into release/0.5.0 Oct 18, 2023
@sjtu-cz sjtu-cz mentioned this pull request Nov 7, 2023
@poweiw poweiw added invalid and removed invalid labels Jun 3, 2025
greg-kwasniewski1 pushed a commit to greg-kwasniewski1/TensorRT-LLM that referenced this pull request Jun 10, 2025
…DIA#7)

* example of inductor pattern matcher for RoPE with explicit cos/sin matcher

Signed-off-by: Frida Hou <[email protected]>

* move to utils

Signed-off-by: Frida Hou <[email protected]>

* add usage of scalar_workaround, support op_ignore_type

Signed-off-by: Ubuntu <[email protected]>

* minor

Signed-off-by: Ubuntu <[email protected]>

* update all 3 types of RoPE matcher to use inductor pattern matcher

Signed-off-by: Frida Hou <[email protected]>

* address feedback and refine code/doc

Signed-off-by: Frida Hou <[email protected]>

* minor

Signed-off-by: Ubuntu <[email protected]>

* fix e2e for llama4 and ds rope, remove legalize_graph in canonicalize_graph, update ds rope impl to match the exported graph

Signed-off-by: Frida Hou <[email protected]>

* deprecate previous rope matcher

Signed-off-by: Ubuntu <[email protected]>

---------

Signed-off-by: Frida Hou <[email protected]>
Signed-off-by: Ubuntu <[email protected]>
danielafrimi added a commit to danielafrimi/TensorRT-LLM that referenced this pull request Jun 30, 2025
# This is the 1st commit message:

kernel

Signed-off-by: Ubuntu <[email protected]>

wip

Signed-off-by: Ubuntu <[email protected]>

remove prints

Signed-off-by: Ubuntu <[email protected]>

test pass

Signed-off-by: Ubuntu <[email protected]>

test refactor with more use cases

Signed-off-by: Ubuntu <[email protected]>

refactor

Signed-off-by: Ubuntu <[email protected]>

refactor_2

Signed-off-by: Ubuntu <[email protected]>

add tuner wip

Signed-off-by: Ubuntu <[email protected]>

autotuner works

Signed-off-by: Ubuntu <[email protected]>

bfloat16 works. more changes to the thop file

Signed-off-by: Ubuntu <[email protected]>

is tune for autotuner is True --> gets real tactics configs

Signed-off-by: Ubuntu <[email protected]>

wip

Signed-off-by: Ubuntu <[email protected]>

wip

Signed-off-by: Ubuntu <[email protected]>

zeros + quant mode works

Signed-off-by: Ubuntu <[email protected]>

act int8

Signed-off-by: Ubuntu <[email protected]>

removed fp8 for now

Signed-off-by: Ubuntu <[email protected]>

wip

Signed-off-by: Ubuntu <[email protected]>

w4a16 linear module

Signed-off-by: Ubuntu <[email protected]>

wip

Signed-off-by: Ubuntu <[email protected]>

changed cutlass for sm==89

Signed-off-by: Ubuntu <[email protected]>

wip

Signed-off-by: Ubuntu <[email protected]>

test linear work

Signed-off-by: Ubuntu <[email protected]>

add license

Signed-off-by: Ubuntu <[email protected]>

works!

Signed-off-by: Ubuntu <[email protected]>

refactor + linear test pass

Signed-off-by: Ubuntu <[email protected]>

preprocess in load weights

Signed-off-by: Ubuntu <[email protected]>

wip

Signed-off-by: Ubuntu <[email protected]>

wip

Signed-off-by: Ubuntu <[email protected]>

wip

Signed-off-by: Ubuntu <[email protected]>

wip

Signed-off-by: Ubuntu <[email protected]>

refactor + rebase

Signed-off-by: Ubuntu <[email protected]>

wip

Signed-off-by: Ubuntu <[email protected]>

wip

Signed-off-by: Ubuntu <[email protected]>

Blackwell not supported

Signed-off-by: Daniel Afrimi <[email protected]>

wip

Signed-off-by: Daniel Afrimi <[email protected]>

skip blackwell

Signed-off-by: Daniel Afrimi <[email protected]>

wip

Signed-off-by: Daniel Afrimi <[email protected]>

works

Signed-off-by: Ubuntu <[email protected]>

# This is the commit message NVIDIA#2:

rebased

Signed-off-by: Ubuntu <[email protected]>

# This is the commit message NVIDIA#3:

align with my old working version of linear

Signed-off-by: Ubuntu <[email protected]>

# This is the commit message NVIDIA#4:

wip

Signed-off-by: Ubuntu <[email protected]>

# This is the commit message NVIDIA#5:

refactor

Signed-off-by: Daniel Afrimi <[email protected]>

# This is the commit message NVIDIA#6:

refactor

Signed-off-by: Daniel Afrimi <[email protected]>

# This is the commit message NVIDIA#7:

refactor

Signed-off-by: Daniel Afrimi <[email protected]>

# This is the commit message NVIDIA#8:

refactor

Signed-off-by: Daniel Afrimi <[email protected]>

# This is the commit message NVIDIA#9:

sys path

Signed-off-by: Daniel Afrimi <[email protected]>

# This is the commit message NVIDIA#10:

sys path

Signed-off-by: Daniel Afrimi <[email protected]>
litaotju pushed a commit to litaotju/TensorRT-LLM that referenced this pull request Jul 19, 2025
litaotju pushed a commit to litaotju/TensorRT-LLM that referenced this pull request Jul 24, 2025
Signed-off-by: Yuxian Qiu <[email protected]>
Signed-off-by: Fanrong Li <[email protected]>
yuxianq added a commit to yuxianq/TensorRT-LLM that referenced this pull request Jul 28, 2025
Signed-off-by: Yuxian Qiu <[email protected]>
Signed-off-by: Fanrong Li <[email protected]>
zongfeijing pushed a commit to zongfeijing/TensorRT-LLM that referenced this pull request Jul 31, 2025
Signed-off-by: Yuxian Qiu <[email protected]>
Signed-off-by: Fanrong Li <[email protected]>
3 participants