
Conversation

@JasonZhu1313 (Contributor) commented Mar 27, 2025

PR Description:

This PR refactors the error handling code to catch multiple exceptions (ValueError, TypeError, jinja2.TemplateError) in a single except block, reducing code repetition and improving readability. The same logging and error response logic is applied to all three exception types.
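
As an illustration only, here is a minimal, self-contained sketch of the consolidated pattern (the render_chat_template helper and its print-based logging are hypothetical stand-ins; the actual change lives in the chat preprocessing path of serving_chat.py / serving_engine.py):

```python
import jinja2


def render_chat_template(template_str: str, **kwargs) -> str | None:
    """Hypothetical helper illustrating the single multi-exception block."""
    try:
        template = jinja2.Template(template_str)
        return template.render(**kwargs)
    except (ValueError, TypeError, jinja2.TemplateError) as e:
        # One except block covers all three exception types with the same
        # logging and error-response logic, instead of three duplicated blocks.
        print(f"Error in preprocessing prompt inputs: {e}")
        return None


# An unknown filter raises jinja2.TemplateAssertionError (a TemplateError
# subclass), so it is caught by the same block as ValueError / TypeError.
render_chat_template("{{ messages | badfilter }}", messages=[])
```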

Benefits:

  • Cleaner and more maintainable code.
  • Easier to extend by adding additional exception types in the future.
  • Improved readability without changing the existing behavior.

Test:

The error trace is caught properly and shown in the live server log when I passed a model that didn't match --tool-call-parser hermes:

vllm serve llama_model_path --enable-auto-tool-choice --tool-call-parser hermes

INFO:     Application startup complete.
INFO 03-27 22:51:56 chat_utils.py:332] Detected the chat template content format to be 'string'. You can set `--chat-template-content-format` to override this.
ERROR 03-27 22:51:56 serving_chat.py:195] Error in preprocessing prompt inputs
ERROR 03-27 22:51:56 serving_chat.py:195] Traceback (most recent call last):
ERROR 03-27 22:51:56 serving_chat.py:195]   File "/miniconda3/lib/python3.10/site-packages/vllm/entrypoints/openai/serving_chat.py", line 179, in create_chat_completion
ERROR 03-27 22:51:56 serving_chat.py:195]     ) = await self._preprocess_chat(
ERROR 03-27 22:51:56 serving_chat.py:195]   File "/miniconda3//lib/python3.10/site-packages/vllm/entrypoints/openai/serving_engine.py", line 434, in _preprocess_chat
ERROR 03-27 22:51:56 serving_chat.py:195]     request = tool_parser(tokenizer).adjust_request(  # type: ignore
ERROR 03-27 22:51:56 serving_chat.py:195]   File "/miniconda3/lib/python3.10/site-packages/vllm/entrypoints/openai/tool_parsers/hermes_tool_parser.py", line 61, in __init__
ERROR 03-27 22:51:56 serving_chat.py:195]     raise RuntimeError(
ERROR 03-27 22:51:56 serving_chat.py:195] RuntimeError: Hermes 2 Pro Tool parser could not locate tool call start/end tokens in the tokenizer!
INFO:     127.0.0.1:57814 - "POST /v1/chat/completions HTTP/1.1" 400 Bad Request

@github-actions

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, they only run fastcheck CI, which runs a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@mergify mergify bot added the frontend label Mar 27, 2025
Signed-off-by: JasonZhu1313 <[email protected]>
@ProExpertProg (Collaborator) left a comment


Nice cleanup!

@russellb (Member) left a comment


The code looks good. It's clearly reducing duplication.

This touches on a larger cleanup needed here. As you can see from the duplication you removed, we map everything to the same HTTP response (400). That's not the best choice in all cases; we should be mapping different failures to different HTTP responses.

For example, I think the RuntimeError that's caught in one part of this change should be a 500 response instead of the current 400.

I'm just sharing in case you'd be interested in further work in this area! Thanks for the patch.
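
As a purely hypothetical sketch of that direction (status_for_exception is not existing vLLM code), the exception type could decide which status is used when building the error response:

```python
from http import HTTPStatus

import jinja2


def status_for_exception(exc: Exception) -> HTTPStatus:
    """Hypothetical mapping from preprocessing failures to HTTP statuses."""
    if isinstance(exc, (ValueError, TypeError, jinja2.TemplateError)):
        # Problems with the client's request: bad arguments or a broken
        # chat template should stay a 400.
        return HTTPStatus.BAD_REQUEST
    # Server-side issues, e.g. the Hermes parser failing to find its
    # tool-call tokens in the tokenizer (the RuntimeError in the trace above),
    # arguably deserve a 500.
    return HTTPStatus.INTERNAL_SERVER_ERROR
```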

@russellb russellb added the ready ONLY add when PR is ready to merge/full CI is needed label Mar 28, 2025
@russellb russellb enabled auto-merge (squash) March 28, 2025 00:40
@russellb russellb merged commit cec8c7d into vllm-project:main Mar 28, 2025
40 of 41 checks passed
Alex4210987 pushed a commit to LeiWang1999/vllm-bitblas that referenced this pull request Apr 5, 2025
lulmer pushed a commit to lulmer/vllm that referenced this pull request Apr 7, 2025
lk-chen pushed a commit to lk-chen/vllm that referenced this pull request Apr 29, 2025
shreyankg pushed a commit to shreyankg/vllm that referenced this pull request May 3, 2025
RichardoMrMu pushed a commit to RichardoMrMu/vllm that referenced this pull request May 12, 2025