[Frontend] [gpt-oss] Chat format GD for tool calling with gptoss #28148
base: main
Conversation
Signed-off-by: Alec Solder <[email protected]>
Code Review
This pull request significantly enhances the tool-calling capabilities for gpt-oss models by introducing a robust, schema-driven guided decoding mechanism for the chat format. The changes are well-structured, separating concerns effectively across different modules. The replacement of the mock-based ToolServer interaction with a more direct tool_names approach in the reasoning parser is a clean refactoring. The addition of comprehensive unit tests and, especially, the new end-to-end tests, provides strong confidence in the correctness and robustness of this new implementation. I have one high-severity comment regarding a mismatch between the production logic and unit tests for the python tool, which should be addressed to ensure full test coverage.
Code Review
This pull request significantly enhances the tool calling capabilities for gpt-oss models by introducing a more robust structural tag schema for guided decoding. The refactoring to decouple the reasoning parser from the ToolServer and instead rely on a list of tool names is a solid architectural improvement, making the system more modular and easier to test. The new unit and end-to-end tests, especially those covering edge cases with tool names, are comprehensive and greatly increase confidence in the changes. I've identified one critical issue in how tool names are generated, which I've detailed in a specific comment.
Code Review
This pull request significantly refactors the gpt-oss chat format generation for tool calling by introducing a more robust structural tags schema. The changes decouple the reasoning parser from the ToolServer and instead derive tool information directly from the request messages, which is a great design improvement. The addition of comprehensive unit and end-to-end tests, especially for edge cases, greatly increases confidence in this new implementation.
I have found one high-severity issue in the new get_tool_names_from_messages utility function. The logic for constructing tool names appears to be incorrect for tools that do not have a sub-name within their namespace (e.g., the built-in python tool), which could lead to incorrect guided decoding and tool call failures. A code suggestion has been provided to fix this.
```
logger = init_logger(__name__)

no_func_reaonsing_tag = {
TRIGGERS = ["<|channel|>", "<|start|>assistant"]
```
Wonder if we can define these as some sort of YAML or JSON file, with default values provided for these tags. That would allow people to modify their template without changing vLLM's binary.
Agreed. Maybe it is best that we have a default_template and load it in here?
I think this could be the default template, but only if it is at least neutral on general eval tests. Otherwise, people may question it.
Also, seconding Charlotte's suggestion, it would be good to have the flexibility of passing in a JSON file.
Could you also summarize the main behavior change? Were we missing structural tag following in final messages?

As I understand it now, this does not require a function tag to be supplied from an outside file; users only need to choose whether to turn the feature on. If so, the code looks good to me.
Purpose
Building on top of the initial gpt-oss reasoning parser in #25515, this PR fleshes out the full structural tags schema to guide the chat format for gpt-oss.
It is only added to the Responses API path for now, but it should be able to fix issues like the ones mentioned in #24954 without needing to make adjustments to Harmony.
Guided decoding for the gpt-oss chat format should only be enabled if `structured_outputs_config.enable_in_reasoning` is True, since the chat format is technically in the reasoning section of model output. It also lets us continue to use structured outputs for the content of the final message. `structured_outputs_config.reasoning_parser` is set by default for gpt-oss.

TODO: test that structured output works alongside this after #28000 is merged, and adjust for tool calling.
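The gating described above can be summarized in a short sketch. The key names `enable_in_reasoning` and `reasoning_parser` come from the PR description; the plain-dict shape and the parser name used below are assumptions for illustration, not vLLM's actual config API:

```python
# Sketch of the gating described in the PR. Key names come from the
# PR text; the dict shape and the parser name "openai_gptoss" are
# assumptions, not vLLM's real configuration objects.
structured_outputs_config = {
    # The chat format lives in the reasoning section of model output,
    # so chat-format guided decoding only applies when this is True.
    "enable_in_reasoning": True,
    # Set by default for gpt-oss models.
    "reasoning_parser": "openai_gptoss",
}


def chat_format_gd_enabled(cfg: dict) -> bool:
    """Hypothetical predicate mirroring the gating described above."""
    return bool(cfg.get("enable_in_reasoning")) and bool(cfg.get("reasoning_parser"))
```

Keeping the gate on `enable_in_reasoning` also means structured outputs remain available for the final message content, since the two mechanisms apply to different sections of the output.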
Test Plan
New unit tests and e2e tests with difficult tool names.
To enable GD of the chat format:

Tested with a curl like:

And confirming it still works with built-in tools, with a curl like:
Test Result
Valid output for both cases
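The curl commands themselves are not reproduced above. For illustration, a request body of the kind exercised here — a Responses API call with a "difficult" tool name of the sort the new e2e tests cover — might be built like the following sketch (the model name, tool name, and field values are hypothetical):

```python
import json

# Hypothetical Responses API payload with a deliberately awkward tool
# name (dots and dashes), the kind of edge case the new e2e tests for
# guided decoding of the chat format are meant to cover.
payload = {
    "model": "gpt-oss",
    "input": "What is the weather in SF?",
    "tools": [
        {
            "type": "function",
            "name": "get-weather.v2",  # name chosen to stress the grammar
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        }
    ],
}

body = json.dumps(payload)
```

A body like this would then be POSTed to the server's Responses endpoint; the point of the PR is that the emitted tool call conforms to the chat format even for names like `get-weather.v2`.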