-
-
Notifications
You must be signed in to change notification settings - Fork 10.6k
[Benchmark][Doc] Update throughput benchmark and README #15998
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Benchmark][Doc] Update throughput benchmark and README #15998
Conversation
Signed-off-by: StevenShi-23 <[email protected]>
Signed-off-by: StevenShi-23 <[email protected]>
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add 🚀 |
benchmarks/benchmark_throughput.py
Outdated
|
||
elif args.dataset_path in AIMODataset.SUPPORTED_DATASET_PATHS: | ||
dataset_cls = AIMODataset | ||
common_kwargs['dataset_subset'] = args.hf_subset |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if dataset_subset
is not needed by this dataset you can put it as None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, thanks for the comments. I leave it because hf_subset
by default is None. I'll explicit set it to None to avoid error.
--backend vllm \ | ||
--dataset-name hf \ | ||
--dataset-path AI-MO/aimo-validation-aime \ | ||
--hf-split train \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can remove this line --hf-split train \
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Thank you! As a precaution, could you run both online and offline with the same seed to see if the token counts match? |
I ran the offline and online benchmark with the same seed, and the input token count did not match (1204 vs 1229). But the token count is reproducible within offline or online benchmark itself using the same seed. I think it may deserve a separate PR to investigate it. |
Signed-off-by: StevenShi-23 <[email protected]>
benchmarks/benchmark_throughput.py
Outdated
elif args.dataset_path in AIMODataset.SUPPORTED_DATASET_PATHS: | ||
assert args.backend == "vllm", "AIMODataset needs to use vllm as the backend." #noqa: E501 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm this if elif
is growing quite long so let me push a change to make it cleaner
Signed-off-by: Roger Wang <[email protected]>
…#15998) Signed-off-by: StevenShi-23 <[email protected]> Signed-off-by: Roger Wang <[email protected]> Co-authored-by: Roger Wang <[email protected]> Signed-off-by: xinyuxiao <[email protected]>
…#15998) Signed-off-by: StevenShi-23 <[email protected]> Signed-off-by: Roger Wang <[email protected]> Co-authored-by: Roger Wang <[email protected]> Signed-off-by: Louis Ulmer <[email protected]>
…#15998) Signed-off-by: StevenShi-23 <[email protected]> Signed-off-by: Roger Wang <[email protected]> Co-authored-by: Roger Wang <[email protected]>
…#15998) Signed-off-by: StevenShi-23 <[email protected]> Signed-off-by: Roger Wang <[email protected]> Co-authored-by: Roger Wang <[email protected]>
…#15998) Signed-off-by: StevenShi-23 <[email protected]> Signed-off-by: Roger Wang <[email protected]> Co-authored-by: Roger Wang <[email protected]> Signed-off-by: Mu Huai <[email protected]>
This PR updates benchmark README for the latest change from #15955 . Also, it adds AIMODataset to
benchmark_throughput.py
@JenZhao @ywang96 Thank you for the comments!