From 2c2f1aeb254ecf97233525a6d7b64f385733a39b Mon Sep 17 00:00:00 2001
From: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>
Date: Mon, 8 Sep 2025 16:35:22 +0000
Subject: [PATCH 1/3] update bench doc with mtbench, blazedit, spec bench

Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>
---
 benchmarks/README.md | 70 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)

diff --git a/benchmarks/README.md b/benchmarks/README.md
index 98b3600d1363..5ec8cf5c9bbb 100644
--- a/benchmarks/README.md
+++ b/benchmarks/README.md
@@ -95,6 +95,30 @@ become available.
       ✅
       lmms-lab/LLaVA-OneVision-Data, Aeala/ShareGPT_Vicuna_unfiltered
 
+
+      HuggingFace-MTBench
+      ✅
+      ✅
+      philschmid/mt-bench
+
+
+      HuggingFace-MTBench
+      ✅
+      ✅
+      philschmid/mt-bench
+
+
+      HuggingFace-Blazedit
+      ✅
+      ✅
+      vdaita/edit_5k_char, vdaita/edit_10k_char
+
+
+      Spec Bench
+      ✅
+      ✅
+      wget https://raw.githubusercontent.com/hemingkx/Spec-Bench/refs/heads/main/data/spec_bench/question.jsonl
+
 
       Custom
       ✅
@@ -239,6 +263,40 @@ vllm bench serve \
   --num-prompts 2048
 ```
 
+### Spec Bench Benchmark with Speculative Decoding
+
+``` bash
+VLLM_USE_V1=1 vllm serve meta-llama/Meta-Llama-3-8B-Instruct \
+    --speculative-config $'{"method": "ngram",
+    "num_speculative_tokens": 5, "prompt_lookup_max": 5,
+    "prompt_lookup_min": 2}'
+```
+
+SpecBench dataset: https://github.com/hemingkx/Spec-Bench
+Download the dataset using:
+    wget https://raw.githubusercontent.com/hemingkx/Spec-Bench/refs/heads/main/data/spec_bench/question.jsonl
+
+Run all categories
+``` bash
+vllm bench serve \
+  --model meta-llama/Meta-Llama-3-8B-Instruct \
+  --dataset-name spec_bench \
+  --dataset-path "/data/spec_bench/question.jsonl" \
+  --num-prompts -1
+```
+
+Available categories include [writing, roleplay, reasoning, math, coding, extraction, stem, humanities, translation, summarization, qa, math_reasoning, rag].
+Run only a specific category like "summarization".
+
+``` bash
+vllm bench serve \
+  --model meta-llama/Meta-Llama-3-8B-Instruct \
+  --dataset-name spec_bench \
+  --dataset-path "/data/spec_bench/question.jsonl" \
+  --num-prompts -1 \
+  --spec-bench-category "summarization"
+```
+
 ### Other HuggingFaceDataset Examples
 
 ```bash
@@ -295,6 +353,18 @@ vllm bench serve \
   --num-prompts 80
 ```
 
+`vdaita/edit_5k_char` or `vdaita/edit_10k_char`:
+
+``` bash
+vllm bench serve \
+  --model Qwen/QwQ-32B \
+  --dataset-name hf \
+  --dataset-path vdaita/edit_5k_char \
+  --num-prompts 90 \
+  --blazedit-min-distance 0.01 \
+  --blazedit-max-distance 0.99
+```
+
 ### Running With Sampling Parameters
 
 When using OpenAI-compatible backends such as `vllm`, optional sampling

From 54eed195aa1601471c3b1280eec31a347584b3c2 Mon Sep 17 00:00:00 2001
From: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>
Date: Mon, 8 Sep 2025 16:37:15 +0000
Subject: [PATCH 2/3] fix

Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>
---
 benchmarks/README.md | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/benchmarks/README.md b/benchmarks/README.md
index 5ec8cf5c9bbb..a75382a87a67 100644
--- a/benchmarks/README.md
+++ b/benchmarks/README.md
@@ -272,11 +272,12 @@ VLLM_USE_V1=1 vllm serve meta-llama/Meta-Llama-3-8B-Instruct \
     "prompt_lookup_min": 2}'
 ```
 
-SpecBench dataset: https://github.com/hemingkx/Spec-Bench
+SpecBench dataset: https://github.com/hemingkx/Spec-Bench .
+
 Download the dataset using:
     wget https://raw.githubusercontent.com/hemingkx/Spec-Bench/refs/heads/main/data/spec_bench/question.jsonl
 
-Run all categories
+Run all categories:
 ``` bash
 vllm bench serve \
   --model meta-llama/Meta-Llama-3-8B-Instruct \
   --dataset-name spec_bench \
@@ -285,8 +286,9 @@ vllm bench serve \
   --num-prompts -1
 ```
 
-Available categories include [writing, roleplay, reasoning, math, coding, extraction, stem, humanities, translation, summarization, qa, math_reasoning, rag].
-Run only a specific category like "summarization".
+Available categories include `[writing, roleplay, reasoning, math, coding, extraction, stem, humanities, translation, summarization, qa, math_reasoning, rag]`.
+
+Run only a specific category like "summarization":
 
 ``` bash
 vllm bench serve \

From 3b5b7d811ec5bf3395b0caec2744ec3b00f99086 Mon Sep 17 00:00:00 2001
From: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>
Date: Mon, 8 Sep 2025 17:13:57 +0000
Subject: [PATCH 3/3] fix lint

Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>
---
 benchmarks/README.md | 15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/benchmarks/README.md b/benchmarks/README.md
index a75382a87a67..9cf9f6ab302b 100644
--- a/benchmarks/README.md
+++ b/benchmarks/README.md
@@ -101,12 +101,6 @@ become available.
       ✅
       philschmid/mt-bench
 
-
-      HuggingFace-MTBench
-      ✅
-      ✅
-      philschmid/mt-bench
-
 
       HuggingFace-Blazedit
       ✅
       ✅
       vdaita/edit_5k_char, vdaita/edit_10k_char
@@ -272,13 +266,14 @@ VLLM_USE_V1=1 vllm serve meta-llama/Meta-Llama-3-8B-Instruct \
     "prompt_lookup_min": 2}'
 ```
 
-SpecBench dataset: https://github.com/hemingkx/Spec-Bench .
-
-Download the dataset using:
-    wget https://raw.githubusercontent.com/hemingkx/Spec-Bench/refs/heads/main/data/spec_bench/question.jsonl
+[SpecBench dataset](https://github.com/hemingkx/Spec-Bench)
 
 Run all categories:
+
 ``` bash
+# Download the dataset using:
+# wget https://raw.githubusercontent.com/hemingkx/Spec-Bench/refs/heads/main/data/spec_bench/question.jsonl
+
 vllm bench serve \
   --model meta-llama/Meta-Llama-3-8B-Instruct \
   --dataset-name spec_bench \
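A note on the `--spec-bench-category` flag used in the patches above: it filters on the per-question category labels carried by `question.jsonl`, so before benchmarking a single category it can help to check how many prompts each label actually covers. A minimal Python sketch (not part of the patch series; it assumes the MT-bench-style `category` field that Spec-Bench's `question.jsonl` uses, and the sample rows below are made up for illustration):

```python
import json
from collections import Counter


def category_counts(path):
    """Count prompts per category in a Spec Bench question.jsonl file.

    Assumes each JSON line carries a "category" field (MT-bench-style
    schema) -- verify against your downloaded copy.
    """
    counts = Counter()
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line:
                counts[json.loads(line)["category"]] += 1
    return counts


# Tiny made-up sample standing in for the downloaded question.jsonl
sample = [
    {"question_id": 1, "category": "summarization", "turns": ["..."]},
    {"question_id": 2, "category": "coding", "turns": ["..."]},
    {"question_id": 3, "category": "summarization", "turns": ["..."]},
]
with open("question_sample.jsonl", "w") as f:
    for row in sample:
        f.write(json.dumps(row) + "\n")

print(category_counts("question_sample.jsonl"))
```

A category whose count is small relative to the full set will produce noisier latency numbers than a `--num-prompts -1` run over all categories.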