fix: fix index out of bounds error in spec decoding #5954
/bot run
PR_Github #11648 [ run ] triggered by Bot
PR_Github #11648 [ run ] completed with state
/bot run
PR_Github #11663 [ run ] triggered by Bot
PR_Github #11663 [ run ] completed with state
/bot run
PR_Github #11703 [ run ] triggered by Bot
PR_Github #11703 [ run ] completed with state
Signed-off-by: Fanrong Li <[email protected]>
Signed-off-by: Fanrong Li <[email protected]>
7d6518e to 937b34e
/bot run
PR_Github #11723 [ run ] triggered by Bot
PR_Github #11723 [ run ] completed with state
/bot run
PR_Github #11734 [ run ] triggered by Bot
PR_Github #11734 [ run ] completed with state
Signed-off-by: Wanli Jiang <[email protected]>
Signed-off-by: Shreyas Misra <[email protected]>
Signed-off-by: Ransiki Zhang <[email protected]>
Description
During warmup, every request has self.is_dummy_request=True, which in turn sets self.is_dummy=True. As a result, we skip the input_ids/draft_tokens updates for those generation requests and use whatever random values are already in self.input_ids_cuda and self.draft_tokens_cuda as inputs. If those random values are large (larger than the embedding table can index), the embedding layer raises "index out of bounds" errors.
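
The failure mode can be reproduced in isolation. The sketch below is a pure-Python illustration, not the code from this PR: VOCAB_SIZE, make_stale_buffer, embed, and prepare_inputs are hypothetical names, and overwriting the dummy inputs with a known-valid token id is only one possible mitigation, assumed here for illustration.

```python
import random

VOCAB_SIZE = 8    # hypothetical vocabulary size
HIDDEN_SIZE = 4   # hypothetical embedding width

# Toy embedding table: row i is filled with the value float(i).
embedding_table = [[float(row)] * HIDDEN_SIZE for row in range(VOCAB_SIZE)]


def make_stale_buffer(length, seed=0):
    """Simulate a persistent input buffer holding leftover, out-of-range values."""
    rng = random.Random(seed)
    return [rng.randrange(VOCAB_SIZE, 10**6) for _ in range(length)]


def embed(token_ids):
    """Return one embedding row per token id, rejecting ids outside the table."""
    for token_id in token_ids:
        if not 0 <= token_id < VOCAB_SIZE:
            raise IndexError(
                f"token id {token_id} out of bounds for vocab size {VOCAB_SIZE}"
            )
    return [embedding_table[token_id] for token_id in token_ids]


def prepare_inputs(buffer, num_tokens, token_ids=None, is_dummy=False):
    """Fill the first num_tokens slots of the buffer and return them.

    Assumption: for dummy (warmup) requests we write a known-valid id (0)
    instead of skipping the update and reusing whatever the buffer holds.
    """
    if is_dummy:
        buffer[:num_tokens] = [0] * num_tokens
    else:
        buffer[:num_tokens] = list(token_ids)[:num_tokens]
    return buffer[:num_tokens]


def warmup_without_update(buffer, num_tokens):
    """Mimic the buggy path: skip the update and embed the stale values."""
    return embed(buffer[:num_tokens])
```

Calling warmup_without_update(make_stale_buffer(6), 3) raises IndexError, mirroring the reported error, while embed(prepare_inputs(make_stale_buffer(6), 3, is_dummy=True)) succeeds because every id written for the dummy request is inside the table.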