Conversation

kaiyux
Member

@kaiyux kaiyux commented Sep 21, 2023

No description provided.

@kaiyux kaiyux self-assigned this Sep 21, 2023
@juney-nvidia
Collaborator

LGTM, thanks for the quick fix.

@juney-nvidia juney-nvidia merged commit 9b563ba into main Sep 21, 2023
@kaiyux kaiyux deleted the kaiyu/add_static_libraries branch September 21, 2023 03:52
liuyhwangyh pushed a commit to liuyhwangyh/TensorRT-LLM that referenced this pull request Mar 21, 2024
# This is the 1st commit message:

add download models from www.modelscope.cn

# This is the commit message NVIDIA#2:

debug

# This is the commit message NVIDIA#3:

debug
yingcanw added a commit that referenced this pull request Jan 2, 2025
* Fix model name mapping (#2)
nv-guomingz pushed a commit that referenced this pull request Jan 24, 2025
* Add README

* Add unified converter (#1)

* init v3 lite feat

* fix moe topk method

* fix noaux_tc logic

* fix deepseek v3 normal rope

* refactor

* wo conversion ok debugging build

* add quantize for attn.dense

* add unified converter support

* testing unified converter

* add convert checkpoint and update docs

---------

Co-authored-by: Zeyu Wang <[email protected]>

* update README

* add FP8 notes

* Update run.py result

* Update V3 README

* Update usages of FP8 to BF16 instruction

* fix model name mapping (#2)

* Update HF ckpt BF16 conversion.

* fix config of deepseek kv cache

* Remove source code

* Deepseek V3 FP8 Support

---------

Co-authored-by: jershi425 <[email protected]>
Co-authored-by: Zeyu Wang <[email protected]>
Co-authored-by: Hanyue He <[email protected]>
Co-authored-by: root <[email protected]>
dongxuy04 added a commit to dongxuy04/TensorRT-LLM that referenced this pull request Apr 25, 2025
dongxuy04 added a commit that referenced this pull request Apr 25, 2025
* add MNNVL memory mapping support

Signed-off-by: Dongxu Yang <[email protected]>

* add more MPI environment for trtllm-llmapi-launch

Signed-off-by: Dongxu Yang <[email protected]>

* add MoE communication and prepare kernels

Signed-off-by: Dongxu Yang <[email protected]>

* add MNNVL AlltoAll support for DeepSeekV3

Signed-off-by: Dongxu Yang <[email protected]>

* add output dump for throughput benchmark

Signed-off-by: Dongxu Yang <[email protected]>

* support dynamic kernel launch grid

Signed-off-by: Dongxu Yang <[email protected]>

* address review comments

Signed-off-by: Dongxu Yang <[email protected]>

* address review comments #2

Signed-off-by: Dongxu Yang <[email protected]>

---------

Signed-off-by: Dongxu Yang <[email protected]>
wu1du2 pushed a commit to wu1du2/TensorRT-LLM that referenced this pull request May 11, 2025
danielafrimi added a commit to danielafrimi/TensorRT-LLM that referenced this pull request Jun 30, 2025
# This is the 1st commit message:

kernel

Signed-off-by: Ubuntu <[email protected]>

wip

Signed-off-by: Ubuntu <[email protected]>

remove prints

Signed-off-by: Ubuntu <[email protected]>

test pass

Signed-off-by: Ubuntu <[email protected]>

test refactor with more use cases

Signed-off-by: Ubuntu <[email protected]>

refactor

Signed-off-by: Ubuntu <[email protected]>

refactor_2

Signed-off-by: Ubuntu <[email protected]>

add tuner wip

Signed-off-by: Ubuntu <[email protected]>

autotuner works

Signed-off-by: Ubuntu <[email protected]>

bfloat16 works. more changes to the thop file

Signed-off-by: Ubuntu <[email protected]>

is tune for autotuner is True --> gets real tactics configs

Signed-off-by: Ubuntu <[email protected]>

wip

Signed-off-by: Ubuntu <[email protected]>

wip

Signed-off-by: Ubuntu <[email protected]>

zeros + quant mode works

Signed-off-by: Ubuntu <[email protected]>

act int8

Signed-off-by: Ubuntu <[email protected]>

removed fp8 for now

Signed-off-by: Ubuntu <[email protected]>

wip

Signed-off-by: Ubuntu <[email protected]>

w4a16 linear module

Signed-off-by: Ubuntu <[email protected]>

wip

Signed-off-by: Ubuntu <[email protected]>

changed cutlass for sm==89

Signed-off-by: Ubuntu <[email protected]>

wip

Signed-off-by: Ubuntu <[email protected]>

test linear work

Signed-off-by: Ubuntu <[email protected]>

add license

Signed-off-by: Ubuntu <[email protected]>

works!

Signed-off-by: Ubuntu <[email protected]>

refactor + linear test pass

Signed-off-by: Ubuntu <[email protected]>

preprocess in load weights

Signed-off-by: Ubuntu <[email protected]>

wip

Signed-off-by: Ubuntu <[email protected]>

wip

Signed-off-by: Ubuntu <[email protected]>

wip

Signed-off-by: Ubuntu <[email protected]>

wip

Signed-off-by: Ubuntu <[email protected]>

refactor + rebase

Signed-off-by: Ubuntu <[email protected]>

wip

Signed-off-by: Ubuntu <[email protected]>

wip

Signed-off-by: Ubuntu <[email protected]>

Blackwell not supported

Signed-off-by: Daniel Afrimi <[email protected]>

wip

Signed-off-by: Daniel Afrimi <[email protected]>

skip blackwell

Signed-off-by: Daniel Afrimi <[email protected]>

wip

Signed-off-by: Daniel Afrimi <[email protected]>

works

Signed-off-by: Ubuntu <[email protected]>

# This is the commit message NVIDIA#2:

rebased

Signed-off-by: Ubuntu <[email protected]>

# This is the commit message NVIDIA#3:

align with my pld worked version of linear

Signed-off-by: Ubuntu <[email protected]>

# This is the commit message NVIDIA#4:

wip

Signed-off-by: Ubuntu <[email protected]>

# This is the commit message NVIDIA#5:

refactor

Signed-off-by: Daniel Afrimi <[email protected]>

# This is the commit message NVIDIA#6:

refactor

Signed-off-by: Daniel Afrimi <[email protected]>

# This is the commit message NVIDIA#7:

refactor

Signed-off-by: Daniel Afrimi <[email protected]>

# This is the commit message NVIDIA#8:

refactor

Signed-off-by: Daniel Afrimi <[email protected]>

# This is the commit message NVIDIA#9:

sys path

Signed-off-by: Daniel Afrimi <[email protected]>

# This is the commit message NVIDIA#10:

sys path

Signed-off-by: Daniel Afrimi <[email protected]>
yuxianq added a commit to yuxianq/TensorRT-LLM that referenced this pull request Jul 17, 2025
litaotju pushed a commit to litaotju/TensorRT-LLM that referenced this pull request Jul 24, 2025
yuxianq added a commit to yuxianq/TensorRT-LLM that referenced this pull request Jul 28, 2025
zongfeijing pushed a commit to zongfeijing/TensorRT-LLM that referenced this pull request Jul 31, 2025
2 participants