ekagra-ranjan
Contributor

@ekagra-ranjan ekagra-ranjan commented Aug 25, 2025

  • Adds Spec Bench Dataset https://github.com/hemingkx/Spec-Bench to vLLM benchmark suite.
  • Can benchmark on a specific category only, e.g., --spec-bench-category "summarization".
  • Reuses CustomDataset sampling to avoid duplicating code.
  • Setting --num-prompts to a value <= 0 in any dataset that inherits from CustomDataset loads all of the data.
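
The category filter and the "load everything when the prompt count is <= 0" behavior listed above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper name load_spec_bench_prompts is hypothetical, and it assumes each line of question.jsonl is a JSON object with a "category" string and a "turns" list whose first entry is the prompt.

```python
import json
from typing import Optional


def load_spec_bench_prompts(
    path: str,
    category: Optional[str] = None,
    num_prompts: int = -1,
) -> list[str]:
    """Hypothetical helper: read prompts from a Spec Bench style JSONL file.

    Assumes every record has a "category" field and a "turns" list
    (an assumption about the file layout, not taken from the PR).
    """
    prompts = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            # Keep only the requested category, if one was given.
            if category is not None and record.get("category") != category:
                continue
            prompts.append(record["turns"][0])
    # A non-positive count means "use every matching prompt".
    if num_prompts <= 0:
        return prompts
    return prompts[:num_prompts]
```

Calling it with category="summarization" and num_prompts=-1 would mirror the second test command below, under the same assumptions about the file layout.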

Test

cmd
time VLLM_USE_HYBRID_MEM=0 VLLM_USE_V1=1 python3 examples/offline_inference/spec_decode.py --method eagle --num_spec_tokens 3 --tp 1 --dataset-name spec_bench --dataset-path "/host/vllm-cohere/data/spec_bench/question.jsonl" --num-prompts -1 --print-output

Output

--------------------------------------------------
--------------------------------------------------
total_num_output_tokens: 68355
num_drafts: 31214
num_draft_tokens: 93642
num_accepted_tokens: 37294
mean acceptance length: 2.19
--------------------------------------------------
acceptance at token 0: 0.66
acceptance at token 1: 0.36
acceptance at token 2: 0.17

cmd
time VLLM_USE_HYBRID_MEM=0 VLLM_USE_V1=1 python3 examples/offline_inference/spec_decode.py --method eagle --num_spec_tokens 3 --tp 1 --dataset-name spec_bench --dataset-path "/host/vllm-cohere/data/spec_bench/question.jsonl" --num-prompts -1 --print-output --spec-bench-category "summarization"

Output

--------------------------------------------------
--------------------------------------------------
total_num_output_tokens: 17626
num_drafts: 8433
num_draft_tokens: 25299
num_accepted_tokens: 9193
mean acceptance length: 2.09
--------------------------------------------------
acceptance at token 0: 0.65
acceptance at token 1: 0.31
acceptance at token 2: 0.13

cmd
time VLLM_USE_HYBRID_MEM=0 VLLM_USE_V1=1 python3 examples/offline_inference/spec_decode.py --method eagle --num_spec_tokens 3 --tp 1 --dataset-name spec_bench --dataset-path "/host/vllm-cohere/data/spec_bench/question.jsonl" --num-prompts -1 --print-output --spec-bench-category "math_reasoning"

Output

--------------------------------------------------
--------------------------------------------------
total_num_output_tokens: 12641
num_drafts: 4963
num_draft_tokens: 14889
num_accepted_tokens: 7701
mean acceptance length: 2.55
--------------------------------------------------
acceptance at token 0: 0.78
acceptance at token 1: 0.50
acceptance at token 2: 0.27
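
As a sanity check on the three outputs above, the reported figures are consistent with mean acceptance length being 1 + num_accepted_tokens / num_drafts, and with num_draft_tokens being num_drafts times the 3 speculative tokens requested. That formula is inferred from the numbers rather than stated in this PR, so treat the sketch below as an assumption that happens to reproduce them.

```python
def mean_acceptance_length(num_drafts: int, num_accepted_tokens: int) -> float:
    # Assumed formula: one token from the target model per draft step,
    # plus the average number of accepted draft tokens per step.
    return 1 + num_accepted_tokens / num_drafts


# Numbers copied from the three outputs above.
runs = {
    "all": {"num_drafts": 31214, "num_draft_tokens": 93642,
            "num_accepted_tokens": 37294, "per_position": [0.66, 0.36, 0.17]},
    "summarization": {"num_drafts": 8433, "num_draft_tokens": 25299,
                      "num_accepted_tokens": 9193, "per_position": [0.65, 0.31, 0.13]},
    "math_reasoning": {"num_drafts": 4963, "num_draft_tokens": 14889,
                       "num_accepted_tokens": 7701, "per_position": [0.78, 0.50, 0.27]},
}

NUM_SPEC_TOKENS = 3  # matches --num_spec_tokens 3 in the commands

results = {}
for name, r in runs.items():
    # Each draft step proposes NUM_SPEC_TOKENS tokens.
    assert r["num_draft_tokens"] == NUM_SPEC_TOKENS * r["num_drafts"]
    results[name] = mean_acceptance_length(r["num_drafts"], r["num_accepted_tokens"])
    # The per-position acceptance rates should sum to roughly the same value.
    from_positions = 1 + sum(r["per_position"])
    print(f"{name}: {results[name]:.2f} (from per-position rates: {from_positions:.2f})")
```

The per-position rates are rounded to two decimals in the output, so the two estimates agree only up to that rounding.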

@mergify mergify bot added the performance label (Performance-related issues) Aug 25, 2025

mergify bot commented Aug 25, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @ekagra-ranjan.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Aug 25, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds support for the Spec Bench dataset, which is a valuable addition for benchmarking. The implementation leverages the existing CustomDataset class effectively. However, I've identified two high-severity issues that should be addressed. First, a variable name for an argument group is reused, which is confusing and could lead to bugs. Second, the load_data method is called twice when initializing a SpecBench object, leading to unnecessary overhead. The provided code suggestions aim to fix these issues.
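
The double-load concern raised in this review can be illustrated with a generic sketch. The classes below are hypothetical stand-ins and are not the PR's code: they assume a base class whose constructor already calls load_data, so a subclass that calls it again pays for the load twice.

```python
class BaseDataset:
    """Hypothetical base class that loads its data during construction."""

    def __init__(self, path: str):
        self.path = path
        self.load_calls = 0
        self.load_data()

    def load_data(self) -> None:
        self.load_calls += 1
        self.data = [f"row from {self.path}"]


class DoubleLoad(BaseDataset):
    """Pattern of the kind the review flags: load_data runs twice."""

    def __init__(self, path: str, category=None):
        super().__init__(path)  # first load happens here
        self.category = category
        self.load_data()  # redundant second load


class SingleLoad(BaseDataset):
    """One possible fix: set subclass state first, then load once."""

    def __init__(self, path: str, category=None):
        self.category = category  # available before the base class loads
        super().__init__(path)
```

Setting the subclass attributes before calling the base constructor lets a single load_data call see them, which is one way to avoid the extra work; whether the merged change took this route is not shown here.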

@mergify mergify bot added the needs-rebase label Aug 26, 2025
@keyboardAnt

@RoyNissim, you might find this PR relevant to your recent efforts in advancing and standardizing benchmarking.

@mergify mergify bot removed the needs-rebase label Sep 3, 2025
@ekagra-ranjan ekagra-ranjan changed the title from "[Spec Dec][Benchmark] Add Spec Bench Dataset for benchmarking" to "[Spec Decode][Benchmark] Add Spec Bench Dataset for benchmarking" Sep 3, 2025
Member

@ywang96 ywang96 left a comment

Thanks for the contribution. Could you also create a section in https://github.com/vllm-project/vllm/blob/main/benchmarks/README.md specifically for spec decode, since we may end up with multiple benchmark datasets in this category? (This can be done in a follow-up PR.)

@ywang96 ywang96 added the ready label (ONLY add when PR is ready to merge/full CI is needed) Sep 3, 2025
@ywang96 ywang96 enabled auto-merge (squash) September 5, 2025 16:52
Signed-off-by: Ekagra Ranjan <[email protected]>
auto-merge was automatically disabled September 5, 2025 19:10

Head branch was pushed to by a user without write access

@mergify mergify bot added the needs-rebase label Sep 5, 2025
@mergify mergify bot removed the needs-rebase label Sep 5, 2025
@ywang96 ywang96 merged commit 3feeeb9 into vllm-project:main Sep 8, 2025
38 checks passed
eicherseiji pushed a commit to eicherseiji/vllm that referenced this pull request Sep 9, 2025
skyloevil pushed a commit to skyloevil/vllm that referenced this pull request Sep 13, 2025
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 10, 2025
sducouedic pushed a commit to sducouedic/vllm that referenced this pull request Oct 16, 2025