Using the API server and submitting multiple prompts in a single request, to take advantage of the batching speed benefit, returns the following error:
"multiple prompts in a batch is not currently supported"
What's the point of vLLM without being able to send batches to the API?
Of course, I can send multiple separate requests (see the sketch below), but those are handled sequentially and do not benefit from the speed improvements.
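The sequential fallback looks something like this. Again a sketch under the same assumptions (endpoint, port, model name); each call blocks until the previous one finishes, so there is no batching speed-up.

```python
import requests

prompts = ["Hello, my name is", "The capital of France is"]

# Workaround sketch: one request per prompt, issued one after another.
for prompt in prompts:
    resp = requests.post(
        "http://localhost:8000/v1/completions",
        json={"model": "facebook/opt-125m", "prompt": prompt, "max_tokens": 32},
    )
    print(resp.json()["choices"][0]["text"])
```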
Correct me if I'm wrong...