Description
Describe the bug
When a concurrent benchmark is initiated (e.g., with --rate-type=concurrent), the main process correctly creates and validates the backend.
However, when the worker processes are subsequently spawned for the test, each worker process calls the backend.validate() method again.
This results in multiple, redundant "Test connection" network requests being sent to the target endpoint: one for the main process, and one for each worker process.
Expected behavior
The backend validation should only be performed once by the main process before the workers are created. Worker processes, which receive a copy of the already-validated backend object, should not perform this validation again.
This would prevent unnecessary network requests, reduce benchmark startup latency, and make the logs cleaner.
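One possible shape for this behavior is sketched below. It is a minimal illustration under stated assumptions, not guidellm's actual implementation: `SketchBackend`, `ensure_validated`, and `worker_entry` are hypothetical names, and `copy.deepcopy` stands in for the copy of the backend object that a worker process would receive. The idea is that a "validated" flag travels with the object, so a worker can skip the test request when the main process has already made it.

```python
import copy


class SketchBackend:
    """Hypothetical stand-in for a benchmark backend (not guidellm's real class)."""

    def __init__(self, target: str) -> None:
        self.target = target
        self.validated = False   # copied along with the object
        self.test_requests = 0   # counts simulated "Test connection" requests

    def validate(self) -> None:
        # A real backend would send a test request to self.target here.
        self.test_requests += 1
        self.validated = True

    def ensure_validated(self) -> None:
        # Only validate if no earlier call has already succeeded.
        if not self.validated:
            self.validate()


def worker_entry(backend: SketchBackend) -> int:
    """Simulate a worker starting up; return how many new test requests it sent."""
    before = backend.test_requests
    backend.ensure_validated()
    return backend.test_requests - before


# Main process: validate once, before any workers exist.
main_backend = SketchBackend("http://your-api-endpoint/v1")
main_backend.ensure_validated()

# Each simulated worker gets its own copy of the already-validated backend.
worker_extra = [worker_entry(copy.deepcopy(main_backend)) for _ in range(2)]
```

With this arrangement, the only test request comes from the main process, and each worker's copy sees the flag already set. An alternative with the same effect would be to have the worker startup path not call `validate()` at all.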
Environment
Include all relevant environment information:
- OS [e.g. Ubuntu 20.04]:
- Python version [e.g. 3.12.2]:
To Reproduce
guidellm benchmark run \
  --target "http://your-api-endpoint/v1" \
  --rate-type=concurrent \
  --rate=2 \
  --max-requests 2 \
  --data='{"prompt_tokens":32,"output_tokens":32}'
Observe the console output. The "validating backend" log message and the associated "Test connection" request logs appear multiple times: once for the main process and once for each of the 2 worker processes.