Fix vocab_size inconsistency for sampler #2398
Conversation
```python
            revision=model_config.revision)
        model_config.hf_config.sampler_vocab_size = min(
            len(self.tokenizer), model_config.hf_config.vocab_size)
        self.cache_config = cache_config
```
These lines don't need to move right?
This could work, because `self.model_config = model_config` is a reference assignment.
Maybe it's better to move lines 89-96 before `self.model_config = model_config`?
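The reference-assignment point above can be sketched as follows. This is a minimal illustration, not vLLM code: `DummyConfig` and `Engine` are hypothetical stand-ins, and the `min()` line mirrors the diff quoted earlier.

```python
# Sketch: in Python, assigning an object to an attribute stores a
# reference, so a later mutation through either name is visible via both.
# DummyConfig and Engine are hypothetical stand-ins, not vLLM classes.

class DummyConfig:
    def __init__(self, vocab_size):
        self.vocab_size = vocab_size


class Engine:
    def __init__(self, model_config):
        # Reference assignment: self.model_config and model_config now
        # point to the same object.
        self.model_config = model_config
        # Mutating the parameter afterwards still shows up through
        # self.model_config, which is why the ordering "could work".
        model_config.sampler_vocab_size = min(151648, model_config.vocab_size)


cfg = DummyConfig(vocab_size=151936)
engine = Engine(cfg)
print(engine.model_config is cfg)                   # True
print(engine.model_config.sampler_vocab_size)       # 151648
```

Moving the mutation before the assignment (as suggested) would give the same result; the suggestion is about readability, not correctness.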
@tongyx361 What exact bug is this PR fixing?
cf. #340 (comment)
@zhuohan123 @simon-mo Does this PR need more work? We are facing this issue in production.
I see that the added
@zhuohan123 Are there any updates on a solution for this issue? Thanks
I ended up adding extra tokens to the tokenizer to make its vocabulary a multiple of 32, and it works.
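The workaround described above can be sketched as: compute how many dummy tokens are needed to pad the vocabulary to the next multiple of 32, then register that many placeholder tokens. This is a hedged sketch; `pad_to_multiple` and the `<extra_pad_i>` naming scheme are illustrative, not a vLLM or tokenizer API.

```python
# Sketch of the "pad the vocab to a multiple of 32" workaround.
# pad_to_multiple and the placeholder-token names are illustrative.

def pad_to_multiple(vocab_len, multiple=32):
    """Number of dummy tokens needed to reach the next multiple."""
    return (-vocab_len) % multiple


vocab_len = 151643                      # size of vocab.json in the report below
n_extra = pad_to_multiple(vocab_len)
extra_tokens = [f"<extra_pad_{i}>" for i in range(n_extra)]

print(n_extra)                          # 5
print((vocab_len + n_extra) % 32)       # 0 -> 151648 is a multiple of 32
```

With a real Hugging Face tokenizer, the `extra_tokens` list would typically be passed to `tokenizer.add_tokens(...)`, after which `len(tokenizer)` reflects the padded size.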
Sorry for the delay. This PR is a little bit stalled. I will try to fix this in another PR with some other changes in the
@zhuohan123 Okay, thanks! What is the number of that tracking PR?
@creatorrr Okay, thanks. In my case I have a Qwen tokenizer with 151648 tokens (151643 + 5 added tokens), which is a multiple of 32, a vocab.json of size 151643, and a `vocab_size` in config.json of 151936. I tried adding the 5 added tokens to vocab.json and to the tokenizer vocab, but I still get the same error.
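The size mismatch quoted above can be worked through with the PR's `min()` clamp. A minimal sketch with those exact numbers; `sampler_vocab_size` mirrors the diff at the top of this thread, and the variable names are illustrative:

```python
# Sketch of the vocab-size mismatch with the Qwen numbers quoted above.
# The min() clamp mirrors the PR's fix; names are illustrative.

tokenizer_len = 151648       # 151643 vocab.json entries + 5 added tokens
config_vocab_size = 151936   # vocab_size in config.json (padded by the model)

# The sampler is clamped to the smaller of the two sizes, so logits for
# the padded ids beyond the tokenizer length are never sampled.
sampler_vocab_size = min(tokenizer_len, config_vocab_size)

print(sampler_vocab_size)                  # 151648
print(config_vocab_size - tokenizer_len)   # 288 padded ids left unsampled
```

This shows why padding the tokenizer alone may not resolve the error: even after padding, config.json still advertises 288 more ids than the tokenizer can decode, so the two sizes must be reconciled rather than merely both made multiples of 32.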
### What this PR does / why we need it?
vLLM-Ascend's rope implementation includes several header files that are not supposed to be included by outside users. The current implementation may break when the CANN toolkits update; this PR removes those incompatible includes to guarantee that upgrading the CANN toolkits is safe.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Tested by the rope unittest.

Signed-off-by: ganyi <[email protected]>
Fixing #340