Your current environment
Output of `python collect_env.py` not provided.
🐛 Describe the bug
@robertgshaw2-neuralmagic, @njhill
I am running vLLM at commit 6653040, which includes #7394.
Reproducer:
# vllm serve meta-llama/Meta-Llama-3-8B-Instruct --disable-log-requests
import openai
import asyncio

N = 800

client = openai.AsyncOpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

async def generate_streaming(prompt: str):
    # Stream completion chunks for a single prompt.
    async for req_output in await client.completions.create(
        model="meta-llama/Meta-Llama-3-8B-Instruct",
        prompt=prompt,
        stream=True,
    ):
        yield req_output.choices[0].text

async def generate_output(prompt: str):
    final_output = None
    async for output in generate_streaming(prompt):
        final_output = output
    return final_output

async def main():
    # Fire N concurrent streaming requests.
    prompts = [str(i) for i in range(N)]
    async with asyncio.TaskGroup() as tg:
        tasks = [tg.create_task(generate_output(prompt)) for prompt in prompts]

asyncio.run(main())
Error message:
| Traceback (most recent call last):
| File ".venv/lib/python3.11/site-packages/starlette/responses.py", line 261, in wrap
| await func()
| File ".venv/lib/python3.11/site-packages/starlette/responses.py", line 250, in stream_response
| async for chunk in self.body_iterator:
| File "vllm/vllm/entrypoints/openai/serving_completion.py", line 231, in completion_stream_generator
| async for prompt_idx, res in result_generator:
| File "vllm/vllm/utils.py", line 468, in merge_async_iterators
| item = await d
| ^^^^^^^
| File "vllm/vllm/entrypoints/openai/rpc/client.py", line 424, in generate
| await self.abort(request_id)
| File "vllm/vllm/entrypoints/openai/rpc/client.py", line 350, in abort
| await self._send_one_way_rpc_request(
| File "vllm/vllm/entrypoints/openai/rpc/client.py", line 256, in _send_one_way_rpc_request
| with self.to_proxy_socket() as socket:
| File "/usr/lib/python3.11/contextlib.py", line 137, in __enter__
| return next(self.gen)
| ^^^^^^^^^^^^^^
| File "vllm/vllm/entrypoints/openai/rpc/client.py", line 195, in to_proxy_socket
| socket = self.context.socket(zmq.constants.DEALER)
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| File ".venv/lib/python3.11/site-packages/zmq/sugar/context.py", line 354, in socket
| socket_class( # set PYTHONTRACEMALLOC=2 to get the calling frame
| File ".venv/lib/python3.11/site-packages/zmq/_future.py", line 218, in __init__
| super().__init__(context, socket_type, **kwargs) # type: ignore
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| File ".venv/lib/python3.11/site-packages/zmq/sugar/socket.py", line 156, in __init__
| super().__init__(
| File "_zmq.py", line 690, in zmq.backend.cython._zmq.Socket.__init__
| zmq.error.ZMQError: Too many open files
This is arguably not normal online-serving traffic; that said, with --disable-frontend-multiprocessing the server handles N=8192 without issue.
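As a stop-gap (it does not address the per-request socket churn itself), raising the soft RLIMIT_NOFILE of the server process pushes the failure point out. A minimal sketch, assuming the hard limit is already high enough:

import resource

# Raise the soft file-descriptor limit up to the hard limit
# (no extra privileges required as long as soft <= hard).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
print(f"RLIMIT_NOFILE soft limit raised from {soft} to {hard}")

The equivalent shell-level workaround is `ulimit -n <limit>` before launching `vllm serve`.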
strace shows a large number of eventfd2 calls, which might be related to https://www.mail-archive.com/[email protected]/msg31244.html:
730059 eventfd2(0, EFD_CLOEXEC) = 976
730059 eventfd2(0, EFD_CLOEXEC) = 977
730059 eventfd2(0, EFD_CLOEXEC) = 978
730059 eventfd2(0, EFD_CLOEXEC) = 979
730059 eventfd2(0, EFD_CLOEXEC) = 980
730059 eventfd2(0, EFD_CLOEXEC) = 981
730059 eventfd2(0, EFD_CLOEXEC) = 982
730059 eventfd2(0, EFD_CLOEXEC) = 983
730059 eventfd2(0, EFD_CLOEXEC) = 984
730059 eventfd2(0, EFD_CLOEXEC) = 985
730059 eventfd2(0, EFD_CLOEXEC) = 986
730059 eventfd2(0, EFD_CLOEXEC) = 987
730059 eventfd2(0, EFD_CLOEXEC) = 988
730059 eventfd2(0, EFD_CLOEXEC) = 989
730059 eventfd2(0, EFD_CLOEXEC) = 990
730059 eventfd2(0, EFD_CLOEXEC) = 991
730059 eventfd2(0, EFD_CLOEXEC) = 992
730059 eventfd2(0, EFD_CLOEXEC) = 993
730059 eventfd2(0, EFD_CLOEXEC) = 994
730059 eventfd2(0, EFD_CLOEXEC) = 995
730059 eventfd2(0, EFD_CLOEXEC) = 996
730059 eventfd2(0, EFD_CLOEXEC) = 997
730059 eventfd2(0, EFD_CLOEXEC) = 998
730059 eventfd2(0, EFD_CLOEXEC) = 999
730059 eventfd2(0, EFD_CLOEXEC) = 1000
730059 eventfd2(0, EFD_CLOEXEC) = 1001
730059 eventfd2(0, EFD_CLOEXEC) = 1002
730059 eventfd2(0, EFD_CLOEXEC) = 1003
730059 eventfd2(0, EFD_CLOEXEC <unfinished ...>
730059 <... eventfd2 resumed>) = 1004
730059 eventfd2(0, EFD_CLOEXEC) = 1005
730059 eventfd2(0, EFD_CLOEXEC) = 1006
730059 eventfd2(0, EFD_CLOEXEC) = 1007
730059 eventfd2(0, EFD_CLOEXEC) = 1008
730059 eventfd2(0, EFD_CLOEXEC) = 1009
730059 eventfd2(0, EFD_CLOEXEC) = 1010
730059 eventfd2(0, EFD_CLOEXEC) = 1011
730059 eventfd2(0, EFD_CLOEXEC) = 1012
730059 eventfd2(0, EFD_CLOEXEC) = 1013
730059 eventfd2(0, EFD_CLOEXEC) = 1014
730059 eventfd2(0, EFD_CLOEXEC) = 1015
730059 eventfd2(0, EFD_CLOEXEC) = 1016
730059 eventfd2(0, EFD_CLOEXEC) = 1017
730059 eventfd2(0, EFD_CLOEXEC) = 1018
730059 eventfd2(0, EFD_CLOEXEC) = 1019
730059 eventfd2(0, EFD_CLOEXEC) = 1020
730059 eventfd2(0, EFD_CLOEXEC) = 1021
730059 eventfd2(0, EFD_CLOEXEC) = 1022
730059 eventfd2(0, EFD_CLOEXEC) = 1023
730059 eventfd2(0, EFD_CLOEXEC) = -1 EMFILE (Too many open files)
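Each socket created on a zmq.asyncio context allocates internal file descriptors (including eventfds used for signaling), so opening a fresh DEALER socket per call, as to_proxy_socket() does in the traceback above, makes fd usage scale with the number of in-flight requests. The sketch below is illustrative only (not vLLM code, with a hypothetical ipc endpoint) and contrasts per-call sockets with a single long-lived socket:

import zmq
import zmq.asyncio

ctx = zmq.asyncio.Context()

# Pattern from the traceback: a new DEALER socket per RPC call.
# With hundreds of concurrent requests, fds pile up until EMFILE.
async def rpc_per_call(endpoint: str, msg: bytes) -> bytes:
    sock = ctx.socket(zmq.DEALER)
    try:
        sock.connect(endpoint)
        await sock.send(msg)
        return await sock.recv()
    finally:
        sock.close(linger=0)

# Alternative: one socket reused across calls keeps the fd count constant
# (a real implementation would also need to correlate replies with requests).
shared_sock = ctx.socket(zmq.DEALER)
shared_sock.connect("ipc:///tmp/example_rpc")  # hypothetical endpoint

async def rpc_shared(msg: bytes) -> bytes:
    await shared_sock.send(msg)
    return await shared_sock.recv()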