fix: kimi_k2 return empty tool call list #22149
Code Review
The code changes modify the regex used to parse tool call IDs to allow hyphens. The original regex [\w\.] did not match hyphens, which caused issues when tool IDs contained them. The new regex .+ is very broad and could potentially match more than intended, especially with the .*? in the function_arguments group. It might be beneficial to have a more specific character class or a negative character class to avoid unintended matches.
The regex .+ is very broad and could potentially match more than intended, especially with the .*? in the function_arguments group. It might be beneficial to have a more specific character class or a negative character class to avoid unintended matches. Consider what characters are actually expected in the tool_call_id and refine the regex accordingly. This could prevent unexpected behavior if the input string deviates from the expected format. Also, consider adding a check to ensure that the tool_call_id does not contain any whitespace characters, as this could lead to parsing errors. If whitespace is allowed, ensure that it is handled correctly in subsequent processing steps.
For example, if the tool ID is expected to be alphanumeric with hyphens and underscores, the regex could be refined to [a-zA-Z0-9\-_]+.
r"<\|tool_call_begin\|>\s*(?P<tool_call_id>[a-zA-Z0-9\-_]+:\d+)\s*<\|tool_call_argument_begin\|>\s*(?P<function_arguments>.*?)\s*<\|tool_call_end\|>"
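To make the concern about the broad `.+` concrete, here is a minimal sketch comparing three candidate ID classes on a response that holds two tool calls on one line. The sample IDs (`functions.get-weather:0`, `functions.get-time:1`) and the `[\w.\-]+` class are assumptions for illustration, not taken from the PR. The dotted `functions.` prefix is assumed only because the original class `[\w\.]` allowed dots.

```python
import re

# Shared pattern shape, as quoted in the review; only the ID class varies.
TEMPLATE = (
    r"<\|tool_call_begin\|>\s*(?P<tool_call_id>{id}:\d+)\s*"
    r"<\|tool_call_argument_begin\|>\s*(?P<function_arguments>.*?)\s*"
    r"<\|tool_call_end\|>"
)

def build(id_class: str) -> re.Pattern:
    """Compile the tool-call pattern with the given ID character class."""
    return re.compile(TEMPLATE.format(id=id_class))

broad = build(r".+")                   # the class discussed as "very broad"
reviewer = build(r"[a-zA-Z0-9\-_]+")   # the class suggested in the review
narrow = build(r"[\w.\-]+")            # hypothetical: original class plus '-'

# Two tool calls back to back (sample data, assumed ID format).
text = (
    '<|tool_call_begin|>functions.get-weather:0'
    '<|tool_call_argument_begin|>{"city": "Paris"}<|tool_call_end|>'
    '<|tool_call_begin|>functions.get-time:1'
    '<|tool_call_argument_begin|>{"tz": "UTC"}<|tool_call_end|>'
)

def ids(pattern: re.Pattern) -> list[str]:
    """Return every tool_call_id the pattern finds in the sample text."""
    return [m.group("tool_call_id") for m in pattern.finditer(text)]

broad_ids = ids(broad)        # greedy `.+` swallows the first call into the ID
reviewer_ids = ids(reviewer)  # no '.' in the class, so dotted IDs never match
narrow_ids = ids(narrow)      # both calls are recovered separately

print(len(broad_ids), reviewer_ids, narrow_ids)
```

Under these assumptions, the greedy `.+` yields a single match whose ID spans the whole first call, while the suggested `[a-zA-Z0-9\-_]+` matches nothing because it drops the dot that `[\w\.]` permitted. If dotted IDs are indeed expected, a class that keeps the dot and adds the hyphen may be the safer refinement.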
Similar to the previous comment, the regex .+ in stream_tool_call_portion_regex is very broad. Refining this regex to match only expected characters in tool_call_id would improve robustness and prevent unintended matches during streaming. Consider the expected format and characters for the tool_call_id and adjust the regex accordingly. Also, consider adding a check to ensure that the tool_call_id does not contain any whitespace characters, as this could lead to parsing errors. If whitespace is allowed, ensure that it is handled correctly in subsequent processing steps.
For example, if the tool ID is expected to be alphanumeric with hyphens and underscores, the regex could be refined to [a-zA-Z0-9\-_]+.
r"(?P<tool_call_id>[a-zA-Z0-9\-_]+:\d+)\s*<\|tool_call_argument_begin\|>\s*(?P<function_arguments>.*)"
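The streaming case can be sketched as follows: a buffer grows chunk by chunk, and the parser first tries the portion pattern (ID plus partial arguments), then falls back to a name-only pattern. This is an illustration under assumptions, not the parser's actual code: `parse_partial`, the chunk boundaries, the `[\w.\-]+` ID class, and the use of `re.DOTALL` are all hypothetical.

```python
import re

# Hypothetical ID class: the original [\w\.] plus '-'.
ID = r"[\w.\-]+:\d+"

# Name-only pattern: matches once the ID has arrived.
name_re = re.compile(rf"(?P<tool_call_id>{ID})\s*")

# Portion pattern: ID, argument marker, and whatever arguments exist so far.
portion_re = re.compile(
    rf"(?P<tool_call_id>{ID})\s*<\|tool_call_argument_begin\|>\s*"
    r"(?P<function_arguments>.*)",
    re.DOTALL,  # assumed, so partial JSON containing newlines still matches
)

def parse_partial(buffer: str):
    """Return (tool_call_id, partial_arguments) for the text seen so far."""
    m = portion_re.match(buffer)
    if m:
        return m.group("tool_call_id"), m.group("function_arguments")
    m = name_re.match(buffer)
    if m:
        return m.group("tool_call_id"), None  # ID known, no arguments yet
    return None, None

# Simulated stream of the text that follows <|tool_call_begin|>.
chunks = [
    "functions.get-weather:0",
    "<|tool_call_argument_begin|>",
    '{"city":',
    ' "Paris"}',
]

buffer = ""
states = []
for chunk in chunks:
    buffer += chunk
    states.append(parse_partial(buffer))

for state in states:
    print(state)
```

Because the ID class excludes `<` and `|`, the ID group cannot run past the argument marker, whichever chunk boundary the stream happens to produce.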
The regex .+ in stream_tool_call_name_regex is very broad. Refining this regex to match only expected characters in tool_call_id would improve robustness and prevent unintended matches during streaming. Consider the expected format and characters for the tool_call_id and adjust the regex accordingly. Also, consider adding a check to ensure that the tool_call_id does not contain any whitespace characters, as this could lead to parsing errors. If whitespace is allowed, ensure that it is handled correctly in subsequent processing steps.
For example, if the tool ID is expected to be alphanumeric with hyphens and underscores, the regex could be refined to [a-zA-Z0-9\-_]+.
r"(?P<tool_call_id>[a-zA-Z0-9\-_]+:\d+)\s*"
Signed-off-by: tlipoca9 <[email protected]>
Head branch was pushed to by a user without write access
@aarnphm please merge; I fixed the DCO checks in CI.
Signed-off-by: tlipoca9 <[email protected]>
Signed-off-by: tlipoca9 <[email protected]> Signed-off-by: Jinzhen Lin <[email protected]>
Signed-off-by: tlipoca9 <[email protected]> Signed-off-by: Noam Gat <[email protected]>
Signed-off-by: tlipoca9 <[email protected]> Signed-off-by: Paul Pak <[email protected]>
Signed-off-by: tlipoca9 <[email protected]> Signed-off-by: Diego-Castan <[email protected]>
Signed-off-by: tlipoca9 <[email protected]> Signed-off-by: Xiao Yu <[email protected]>
Purpose
If the tool ID contains '-', the original regex '[\w\.]' cannot match it, so the parser returns an empty tool call list.
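A minimal reproduction of the failure, as a sketch: it assumes the original pattern had the same shape as the one quoted in the review, with `[\w\.]+` as the ID class. The `wrap` helper and the sample IDs are hypothetical.

```python
import re

# Assumed shape of the original pattern, with the [\w\.] ID class.
original = re.compile(
    r"<\|tool_call_begin\|>\s*(?P<tool_call_id>[\w\.]+:\d+)\s*"
    r"<\|tool_call_argument_begin\|>\s*(?P<function_arguments>.*?)\s*"
    r"<\|tool_call_end\|>"
)

def wrap(tool_call_id: str, args: str) -> str:
    """Build a single tool-call section around the given ID and arguments."""
    return (
        f"<|tool_call_begin|>{tool_call_id}"
        f"<|tool_call_argument_begin|>{args}<|tool_call_end|>"
    )

args = '{"city": "Paris"}'
underscore_calls = original.findall(wrap("functions.get_weather:0", args))
hyphen_calls = original.findall(wrap("functions.get-weather:0", args))

print(underscore_calls)  # one (id, arguments) pair
print(hyphen_calls)      # empty: '-' is matched by neither \w nor '.'
```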