
Conversation


@DarkLight1337 DarkLight1337 commented Oct 8, 2025

Purpose

FIX #26320

According to the definition of TextPrompt/TokensPrompt, multi_modal_data is marked as NotRequired[MultiModalDataDict], which means that if the key exists in the dictionary, its value must be a MultiModalDataDict, not None. However, in vllm bench throughput the data is passed as None, which results in this error.
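
For reference, here is a minimal sketch of the typing contract described above. The field names follow vllm.inputs, but the classes are simplified stand-ins rather than the actual vLLM definitions:

```python
from typing import NotRequired, TypedDict  # Python 3.11+; use typing_extensions on older versions

# Simplified stand-in for vLLM's MultiModalDataDict (e.g. {"image": ...}).
MultiModalDataDict = dict

class TokensPrompt(TypedDict):
    prompt_token_ids: list[int]
    # NotRequired: the key may be absent, but if present its value must be a
    # MultiModalDataDict; a value of None violates the contract.
    multi_modal_data: NotRequired[MultiModalDataDict]

# Violates the contract: the key exists but maps to None.
# A type checker flags this; at runtime it yields {"multi_modal_data": None}.
bad = TokensPrompt(prompt_token_ids=[1, 2, 3], multi_modal_data=None)

# Respects the contract: only attach the key when there is real data.
good = TokensPrompt(prompt_token_ids=[1, 2, 3])
mm_data = None
if mm_data is not None:
    good["multi_modal_data"] = mm_data
```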

cc @huydhn

Test Plan

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

@DarkLight1337 DarkLight1337 requested a review from mgoin October 8, 2025 03:21
@DarkLight1337 DarkLight1337 added the ready (ONLY add when PR is ready to merge/full CI is needed) label Oct 8, 2025
@mergify mergify bot added the performance (Performance-related issues) label Oct 8, 2025

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.


Comment on lines 203 to 206
```diff
 for request in requests:
-    prompts.append(
-        TokensPrompt(
-            prompt_token_ids=request.prompt["prompt_token_ids"],
-            multi_modal_data=request.multi_modal_data,
-        )
+    prompt = (
+        TokensPrompt(prompt_token_ids=request.prompt["prompt_token_ids"])
+        if "prompt_token_ids" in request.prompt
```


P0: Async throughput benchmark never sends any prompts

Inside the request loop a prompt object is built but never appended to prompts. The later loop for i, (prompt, sp, lr) in enumerate(zip(prompts, sampling_params, lora_requests)): therefore iterates zero times, so no calls to llm.generate are issued and the benchmark completes without exercising the engine, returning a meaningless near‑zero latency. This regresses all vllm bench throughput runs that rely on run_vllm_async.
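
One way the loop could be completed so that each built prompt is actually submitted (a sketch based on the diff above, not the exact code that was merged; the shape of request.prompt and request.multi_modal_data is assumed from that snippet):

```python
from vllm.inputs import TextPrompt, TokensPrompt

prompts: list[TextPrompt | TokensPrompt] = []
for request in requests:  # requests: the benchmark's list of sampled requests
    # Build a token prompt when token IDs are available, otherwise a text prompt.
    prompt = (
        TokensPrompt(prompt_token_ids=request.prompt["prompt_token_ids"])
        if "prompt_token_ids" in request.prompt
        else TextPrompt(prompt=request.prompt["prompt"])
    )
    # Only attach multi_modal_data when it is present; never set the key to None.
    if request.multi_modal_data is not None:
        prompt["multi_modal_data"] = request.multi_modal_data
    # Without this append, the later zip(prompts, sampling_params, lora_requests)
    # loop iterates zero times and no requests ever reach the engine.
    prompts.append(prompt)
```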



@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request fixes a bug where multi_modal_data could be incorrectly passed as None in vllm bench throughput, causing an error. The change in run_vllm_async correctly avoids this by only setting multi_modal_data if it has a value. However, the added assertion to check the type of multi_modal_data is too restrictive and could cause crashes with certain datasets. A similar bug of passing None also appears to exist in the run_vllm function, which might be worth addressing for consistency.
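
To illustrate the reviewer's point about the assertion (hypothetical code, not the exact check added in the PR): a hard type check on the multi-modal payload fails for datasets that carry no such data, whereas a plain None guard degrades gracefully.

```python
# Too restrictive: text-only datasets leave multi_modal_data as None and crash here.
assert isinstance(request.multi_modal_data, dict)

# More tolerant: skip the key entirely when there is nothing to attach.
if request.multi_modal_data is not None:
    prompt["multi_modal_data"] = request.multi_modal_data
```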

@DarkLight1337 DarkLight1337 merged commit 0d4f48f into vllm-project:main Oct 8, 2025
50 of 51 checks passed
@DarkLight1337 DarkLight1337 deleted the fix-benchmark-mm branch October 8, 2025 05:52
mrasquinha-g pushed a commit to mrasquinha-g/vllm that referenced this pull request Oct 9, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 10, 2025
Dhruvilbhatt pushed a commit to Dhruvilbhatt/vllm that referenced this pull request Oct 14, 2025
sducouedic pushed a commit to sducouedic/vllm that referenced this pull request Oct 16, 2025
lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025
alhridoy pushed a commit to alhridoy/vllm that referenced this pull request Oct 24, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025
rtourgeman pushed a commit to rtourgeman/vllm that referenced this pull request Nov 10, 2025

Labels

  • performance: Performance-related issues
  • ready: ONLY add when PR is ready to merge/full CI is needed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 vllm bench throughput regression on 2.9 RC on B200

2 participants