From 2c2f1aeb254ecf97233525a6d7b64f385733a39b Mon Sep 17 00:00:00 2001
From: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>
Date: Mon, 8 Sep 2025 16:35:22 +0000
Subject: [PATCH 1/3] update bench doc with mtbench, blazedit, spec bench
Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>
---
benchmarks/README.md | 70 ++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 70 insertions(+)
diff --git a/benchmarks/README.md b/benchmarks/README.md
index 98b3600d1363..5ec8cf5c9bbb 100644
--- a/benchmarks/README.md
+++ b/benchmarks/README.md
@@ -95,6 +95,30 @@ become available.
 ✅ | lmms-lab/LLaVA-OneVision-Data, Aeala/ShareGPT_Vicuna_unfiltered |
+| HuggingFace-MTBench | ✅ | ✅ | philschmid/mt-bench |
+| HuggingFace-MTBench | ✅ | ✅ | philschmid/mt-bench |
+| HuggingFace-Blazedit | ✅ | ✅ | vdaita/edit_5k_char, vdaita/edit_10k_char |
+| Spec Bench | ✅ | ✅ | wget https://raw.githubusercontent.com/hemingkx/Spec-Bench/refs/heads/main/data/spec_bench/question.jsonl |
 | Custom | ✅ |
@@ -239,6 +263,40 @@ vllm bench serve \
--num-prompts 2048
```
+### Spec Bench Benchmark with Speculative Decoding
+
+``` bash
+VLLM_USE_V1=1 vllm serve meta-llama/Meta-Llama-3-8B-Instruct \
+ --speculative-config $'{"method": "ngram",
+ "num_speculative_tokens": 5, "prompt_lookup_max": 5,
+ "prompt_lookup_min": 2}'
+```
+
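+The `$'…'` quoting is only there to split the JSON config across lines for readability; the same ngram settings can equivalently be passed on a single line:
+
+``` bash
+VLLM_USE_V1=1 vllm serve meta-llama/Meta-Llama-3-8B-Instruct \
+    --speculative-config '{"method": "ngram", "num_speculative_tokens": 5, "prompt_lookup_max": 5, "prompt_lookup_min": 2}'
+```
+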
+SpecBench dataset: https://github.com/hemingkx/Spec-Bench
+Download the dataset using:
+ wget https://raw.githubusercontent.com/hemingkx/Spec-Bench/refs/heads/main/data/spec_bench/question.jsonl
+
+Run all categories
+``` bash
+vllm bench serve \
+ --model meta-llama/Meta-Llama-3-8B-Instruct \
+ --dataset-name spec_bench \
+ --dataset-path "/data/spec_bench/question.jsonl" \
+ --num-prompts -1
+```
+
+Available categories include [writing, roleplay, reasoning, math, coding, extraction, stem, humanities, translation, summarization, qa, math_reasoning, rag].
+Run only a specific category like "summarization".
+
+``` bash
+vllm bench serve \
+ --model meta-llama/Meta-Llama-3-8B-Instruct \
+ --dataset-name spec_bench \
+ --dataset-path "/data/spec_bench/question.jsonl" \
+ --num-prompts -1 \
+ --spec-bench-category "summarization"
+```
+
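+To compare categories side by side, it can help to write each run's metrics to a JSON file. A sketch assuming the serving benchmark's result-saving flags (`--save-result`, `--result-filename`) are available in your vLLM version (check `vllm bench serve --help`):
+
+``` bash
+vllm bench serve \
+ --model meta-llama/Meta-Llama-3-8B-Instruct \
+ --dataset-name spec_bench \
+ --dataset-path "/data/spec_bench/question.jsonl" \
+ --num-prompts -1 \
+ --spec-bench-category "summarization" \
+ --save-result \
+ --result-filename spec_bench_summarization.json
+```
+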
### Other HuggingFaceDataset Examples
```bash
@@ -295,6 +353,18 @@ vllm bench serve \
--num-prompts 80
```
+`vdaita/edit_5k_char` or `vdaita/edit_10k_char`:
+
+``` bash
+vllm bench serve \
+ --model Qwen/QwQ-32B \
+ --dataset-name hf \
+ --dataset-path vdaita/edit_5k_char \
+ --num-prompts 90 \
+ --blazedit-min-distance 0.01 \
+ --blazedit-max-distance 0.99
+```
+
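+The `--blazedit-min-distance`/`--blazedit-max-distance` pair presumably bounds the normalized character edit distance between the original and edited text (0 = near-identical, 1 = fully rewritten). The same invocation works for the longer split:
+
+``` bash
+vllm bench serve \
+ --model Qwen/QwQ-32B \
+ --dataset-name hf \
+ --dataset-path vdaita/edit_10k_char \
+ --num-prompts 90 \
+ --blazedit-min-distance 0.01 \
+ --blazedit-max-distance 0.99
+```
+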
### Running With Sampling Parameters
When using OpenAI-compatible backends such as `vllm`, optional sampling
From 54eed195aa1601471c3b1280eec31a347584b3c2 Mon Sep 17 00:00:00 2001
From: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>
Date: Mon, 8 Sep 2025 16:37:15 +0000
Subject: [PATCH 2/3] fix
Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>
---
benchmarks/README.md | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/benchmarks/README.md b/benchmarks/README.md
index 5ec8cf5c9bbb..a75382a87a67 100644
--- a/benchmarks/README.md
+++ b/benchmarks/README.md
@@ -272,11 +272,12 @@ VLLM_USE_V1=1 vllm serve meta-llama/Meta-Llama-3-8B-Instruct \
"prompt_lookup_min": 2}'
```
-SpecBench dataset: https://github.com/hemingkx/Spec-Bench
+SpecBench dataset: https://github.com/hemingkx/Spec-Bench .
+
Download the dataset using:
wget https://raw.githubusercontent.com/hemingkx/Spec-Bench/refs/heads/main/data/spec_bench/question.jsonl
-Run all categories
+Run all categories:
``` bash
vllm bench serve \
--model meta-llama/Meta-Llama-3-8B-Instruct \
@@ -285,8 +286,9 @@ vllm bench serve \
--num-prompts -1
```
-Available categories include [writing, roleplay, reasoning, math, coding, extraction, stem, humanities, translation, summarization, qa, math_reasoning, rag].
-Run only a specific category like "summarization".
+Available categories include `[writing, roleplay, reasoning, math, coding, extraction, stem, humanities, translation, summarization, qa, math_reasoning, rag]`.
+
+Run only a specific category like "summarization":
``` bash
vllm bench serve \
From 3b5b7d811ec5bf3395b0caec2744ec3b00f99086 Mon Sep 17 00:00:00 2001
From: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>
Date: Mon, 8 Sep 2025 17:13:57 +0000
Subject: [PATCH 3/3] fix lint
Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>
---
benchmarks/README.md | 15 +++++----------
1 file changed, 5 insertions(+), 10 deletions(-)
diff --git a/benchmarks/README.md b/benchmarks/README.md
index a75382a87a67..9cf9f6ab302b 100644
--- a/benchmarks/README.md
+++ b/benchmarks/README.md
@@ -101,12 +101,6 @@ become available.
 | HuggingFace-MTBench | ✅ | ✅ | philschmid/mt-bench |
-| HuggingFace-MTBench | ✅ | ✅ | philschmid/mt-bench |
 | HuggingFace-Blazedit | ✅ | ✅ | vdaita/edit_5k_char, vdaita/edit_10k_char |
@@ -272,13 +266,14 @@ VLLM_USE_V1=1 vllm serve meta-llama/Meta-Llama-3-8B-Instruct \
"prompt_lookup_min": 2}'
```
-SpecBench dataset: https://github.com/hemingkx/Spec-Bench .
-
-Download the dataset using:
- wget https://raw.githubusercontent.com/hemingkx/Spec-Bench/refs/heads/main/data/spec_bench/question.jsonl
+[SpecBench dataset](https://github.com/hemingkx/Spec-Bench)
+
 Run all categories:
+
``` bash
+# Download the dataset using:
+# wget https://raw.githubusercontent.com/hemingkx/Spec-Bench/refs/heads/main/data/spec_bench/question.jsonl
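+#
+# The command below expects the file at /data/spec_bench/question.jsonl;
+# one way to put it there:
+# wget -O /data/spec_bench/question.jsonl https://raw.githubusercontent.com/hemingkx/Spec-Bench/refs/heads/main/data/spec_bench/question.jsonl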
+
vllm bench serve \
--model meta-llama/Meta-Llama-3-8B-Instruct \
--dataset-name spec_bench \