Conversation

@lfr-0531
Collaborator

Description

During warmup, every request has self.is_dummy_request=True, which in turn sets self.is_dummy=True. As a result, the input_ids/draft_tokens updates are skipped for those generation requests, and whatever random values happen to be in self.input_ids_cuda and self.draft_tokens_cuda are used as inputs. If those values are large enough to fall outside the embedding table, the embedding layer fails with "index out of bounds" errors.
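
The failure mode described above can be illustrated with a minimal pure-Python sketch. It is not the code changed in this PR: `VOCAB_SIZE`, `embedding_lookup`, `make_stale_buffer`, and `prepare_inputs` are hypothetical names, plain lists stand in for the CUDA buffers, and filling the buffer with token id 0 for dummy requests is only one assumed way to keep the lookup in range.

```python
import random

VOCAB_SIZE = 8  # hypothetical vocabulary size, chosen only for illustration


def embedding_lookup(table, token_ids):
    """Mimic an embedding layer: each token id indexes one row of the table."""
    return [table[t] for t in token_ids]


def make_stale_buffer(size, seed=0):
    """Stand-in for a preallocated buffer that still holds arbitrary values.

    Every value is >= VOCAB_SIZE so the out-of-range case is deterministic.
    """
    rng = random.Random(seed)
    return [rng.randrange(VOCAB_SIZE, 10 * VOCAB_SIZE) for _ in range(size)]


def prepare_inputs(buffer, new_ids, is_dummy):
    """Write token ids into the persistent buffer before the forward pass."""
    if is_dummy:
        # Assumed mitigation: overwrite with a valid id (0) instead of
        # skipping the update and leaving stale values in place.
        buffer[:] = [0] * len(buffer)
    else:
        buffer[:len(new_ids)] = new_ids
    return buffer


if __name__ == "__main__":
    table = [[float(i)] * 2 for i in range(VOCAB_SIZE)]
    stale = make_stale_buffer(4)
    try:
        embedding_lookup(table, stale)
    except IndexError as err:
        print("stale buffer fails:", err)
    safe = prepare_inputs(stale, [], is_dummy=True)
    print("safe lookup rows:", len(embedding_lookup(table, safe)))
```

On a GPU the equivalent out-of-range lookup typically surfaces as a device-side assertion rather than a Python `IndexError`, which is why the sketch only models the indexing logic.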

lfr-0531 requested a review from QiJune on July 11, 2025 08:29
lfr-0531 requested a review from a team as a code owner on July 11, 2025 08:29
@lfr-0531
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #11648 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #11648 [ run ] completed with state FAILURE

@lfr-0531
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #11663 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #11663 [ run ] completed with state FAILURE
/LLM/release-0.21/L0_MergeRequest_PR pipeline #232 completed with status: 'FAILURE'

@lfr-0531
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #11703 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #11703 [ run ] completed with state FAILURE
/LLM/release-0.21/L0_MergeRequest_PR pipeline #238 completed with status: 'FAILURE'

lfr-0531 force-pushed the user/fanrongl/fix_spec_dec_index branch from 7d6518e to 937b34e on July 13, 2025 09:59
@lfr-0531
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #11723 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #11723 [ run ] completed with state SUCCESS
/LLM/release-0.21/L0_MergeRequest_PR pipeline #239 completed with status: 'FAILURE'

@lfr-0531
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #11734 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #11734 [ run ] completed with state SUCCESS
/LLM/release-0.21/L0_MergeRequest_PR pipeline #240 completed with status: 'SUCCESS'

lfr-0531 merged commit bed78a2 into NVIDIA:release/0.21 on Jul 14, 2025
3 checks passed
Wanli-Jiang pushed a commit to Wanli-Jiang/TensorRT-LLM that referenced this pull request Jul 17, 2025
dc3671 pushed a commit to dc3671/TensorRT-LLM that referenced this pull request Jul 21, 2025
dc3671 pushed a commit to dc3671/TensorRT-LLM that referenced this pull request Jul 22, 2025
NVShreyas pushed a commit to NVShreyas/TensorRT-LLM that referenced this pull request Jul 28, 2025
lfr-0531 deleted the user/fanrongl/fix_spec_dec_index branch on July 29, 2025 01:27
Ransiki pushed a commit to Ransiki/TensorRT-LLM that referenced this pull request Jul 29, 2025

3 participants