Description
Describe the bug
When running a benchmark with synthetic data, the number of samples generated defaults to 1000. If a user specifies a small number of total requests (e.g., --max-requests=10), the request loader still generates the full 1000 samples. This results in a very long and unexpected delay at the "Creating request loader..." step, even for small benchmark runs.
Expected behavior
When using synthetic data, if the samples parameter is not explicitly set in the --data configuration, the number of samples generated should intelligently default to the value of --max-requests if it is provided.
For example, if a user runs a benchmark with --max-requests=10, the request loader should only generate 10 synthetic samples, not 1000. This would make the startup time for small tests significantly faster and the tool's behavior more intuitive.
Environment
Include all relevant environment information:
- OS [e.g. Ubuntu 20.04]:
- Python version [e.g. 3.12.2]:
To Reproduce
guidellm benchmark run --max-requests 50 --data='{"prompt_tokens":100, "output_tokens":100}'
Before this PR: Observe the long delay and the log message: Created loader with 1000 unique requests...
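Until the default changes, a possible workaround (assuming the loader honors an explicit "samples" key in the --data JSON, as described above) would be to cap the sample count manually:

```shell
# Explicitly limit synthetic samples to match the request count
guidellm benchmark run --max-requests 50 \
  --data='{"prompt_tokens":100, "output_tokens":100, "samples":50}'
```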