🚀 The feature, motivation and pitch
vLLM has two main benchmark scripts (benchmark_throughput.py and benchmark_serving.py) for measuring its performance.
However, the dataset sampling functions are defined inside each script, so over time it will be hard to maintain them and to add new datasets to both scripts, given that we want the flexibility to run benchmarks on different datasets.
Alternatives
Ideally, dataset sampling should be defined in a separate file (e.g., benchmark_dataset.py) that contains the sampling functions for the different datasets (ShareGPT, sonnet, random, Vision Arena, etc.), so that the benchmark scripts can simply import from benchmark_dataset depending on which dataset is specified on the command line. A rough sketch of what such a module could look like is given below.
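To make this concrete, here is a minimal, hedged sketch of one possible shape for benchmark_dataset.py. Every name in it (SampleRequest, SAMPLERS, the sample_* functions) is an illustrative assumption rather than the actual vLLM API, and word counts stand in for token counts to keep the example self-contained:

```python
# benchmark_dataset.py -- illustrative sketch only; all names are hypothetical.
import json
import random
from dataclasses import dataclass


@dataclass
class SampleRequest:
    """One benchmark request: the prompt plus its length budgets."""
    prompt: str
    prompt_len: int
    expected_output_len: int


def sample_sharegpt_requests(dataset_path: str, num_requests: int,
                             seed: int = 0) -> list[SampleRequest]:
    """Sample (prompt, completion) pairs from a ShareGPT-style JSON dump."""
    with open(dataset_path) as f:
        dataset = json.load(f)
    # Keep entries that have at least one user turn and one assistant reply.
    pairs = [(d["conversations"][0]["value"], d["conversations"][1]["value"])
             for d in dataset if len(d.get("conversations", [])) >= 2]
    random.Random(seed).shuffle(pairs)
    # Word counts approximate token counts in this sketch; a real
    # implementation would use the model's tokenizer.
    return [SampleRequest(prompt=p, prompt_len=len(p.split()),
                          expected_output_len=len(c.split()))
            for p, c in pairs[:num_requests]]


def sample_random_requests(num_requests: int, input_len: int,
                           output_len: int,
                           seed: int = 0) -> list[SampleRequest]:
    """Generate synthetic prompts with fixed input/output lengths."""
    rng = random.Random(seed)
    return [SampleRequest(prompt=" ".join(str(rng.randint(0, 9))
                                          for _ in range(input_len)),
                          prompt_len=input_len,
                          expected_output_len=output_len)
            for _ in range(num_requests)]


# Dispatch table that both benchmark scripts import from, keyed by the
# value of a --dataset-name style CLI flag.
SAMPLERS = {
    "sharegpt": sample_sharegpt_requests,
    "random": sample_random_requests,
}
```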
This modularization brings us a number of benefits:
- Ensure dataset sampling stays aligned between the two benchmarks, so that online serving and offline inference performance can be compared directly.
- Ease the process of adding new benchmark datasets.
- Open up the opportunity to support user-defined custom datasets, as long as they conform to a format that we pre-define (see the wiring sketch after this list).
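As one way the pieces could fit together (again, the flags and registry below are assumptions, not the scripts' actual interfaces), both benchmark_throughput.py and benchmark_serving.py could resolve the sampler from the same dispatch table:

```python
# Hypothetical wiring shared by benchmark_throughput.py / benchmark_serving.py.
import argparse

from benchmark_dataset import SAMPLERS  # hypothetical module sketched above

parser = argparse.ArgumentParser()
parser.add_argument("--dataset-name", choices=sorted(SAMPLERS),
                    default="sharegpt")
parser.add_argument("--dataset-path", type=str, default=None)
parser.add_argument("--num-prompts", type=int, default=1000)
args = parser.parse_args()

# Because both scripts resolve the sampler the same way, online and
# offline runs draw from an identical request distribution.
if args.dataset_name == "random":
    requests = SAMPLERS["random"](num_requests=args.num_prompts,
                                  input_len=1024, output_len=128)
else:
    requests = SAMPLERS[args.dataset_name](args.dataset_path,
                                           args.num_prompts)
```

A real design would also normalize the per-dataset keyword arguments (the random sampler's signature differs here), but the key point is that a user-defined dataset only has to produce SampleRequest objects and register itself in SAMPLERS to become available to both scripts.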
Additional context
No response