-
-
Notifications
You must be signed in to change notification settings - Fork 10.7k
[Bugfix]: do not shutdown server if skip_special_use=False
for MistralTokenizer
#14094
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bugfix]: do not shutdown server if skip_special_use=False
for MistralTokenizer
#14094
Conversation
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add 🚀 |
40442a9
to
771b2ea
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, would be great if @patrickvonplaten could check too.
The alternative I guess is to add request validation for this (e.g. using something like a supports_skip_special_tokens()
method on the tokenizers), to fail individual requests.
skip_special_use=False
for MistralTokenizewrskip_special_use=False
for MistralTokenizer
skip_special_use=False
for MistralTokenizerskip_special_use=False
for MistralTokenizer
@njhill would you prefer the |
Signed-off-by: Guillaume Calmettes <[email protected]>
d3295ab
to
e6cf7a9
Compare
@njhill I actually made the change, as it results in a better user experience: the individual requests are failed and the user received a proper error (we already catch the Since we already perform some custom actions in the input preprocessing steps when the MistralTokenizer is used (tool serialization, tool call id truncation), I just added another method to validate the request params. |
cc @DarkLight1337 , another mistral-related fix. Note that this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems reasonable to me as well!
CI tests errors (unrelated)
|
…ralTokenizer (vllm-project#14094) Signed-off-by: Guillaume Calmettes <[email protected]> Signed-off-by: Yang Wang <[email protected]>
…ralTokenizer (vllm-project#14094) Signed-off-by: Guillaume Calmettes <[email protected]>
…ralTokenizer (vllm-project#14094) Signed-off-by: Guillaume Calmettes <[email protected]>
…ralTokenizer (vllm-project#14094) Signed-off-by: Guillaume Calmettes <[email protected]> Signed-off-by: Mu Huai <[email protected]>
The
MistralTokenizer
does not currently support theskip_special_tokens=False
argument.Because an
assert
is used to check for theskip_special_tokens
value, if one makes a requests passing the"skip_special_tokens": false
parameter in the payload, the assert will be raised, triggering the shutdown of the vllm engine.We should probably not crash the full engine due to a single client's argument.
This PR validates the request when the
MistralTokenizer
is used to ensure that theskip_special_tokens
parameter is not set toFalse
in the request, otherwise an error is sent back to the client.cc: @patrickvonplaten
Reproduce the error:
skip_special_tokens
parameter tofalse
Which leads to a shutdown of the server: