Your current environment
vllm version: '0.5.0.post1'
🐛 Describe the bug
When I set tensor_parallel_size=1, everything works well.
But if I set tensor_parallel_size>1, the following error occurs:
RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method.
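For reference, a minimal sketch of the kind of script that triggers this for me (the model name and prompt are placeholders, not the exact ones I am using):

```python
from vllm import LLM, SamplingParams

# Works with tensor_parallel_size=1; raises the RuntimeError above when set to 2
llm = LLM(model="facebook/opt-1.3b", tensor_parallel_size=2)

sampling_params = SamplingParams(temperature=0.8, max_tokens=64)
outputs = llm.generate(["Hello, my name is"], sampling_params)
print(outputs[0].outputs[0].text)
```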
After adding the following at the top of my script:

```python
import torch
import multiprocessing

torch.multiprocessing.set_start_method('spawn')
```

the same RuntimeError still occurs.
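In case it is relevant, the workaround I am planning to try next (not verified yet, and based on my possibly incorrect reading of vLLM's environment variables) is forcing the worker start method via VLLM_WORKER_MULTIPROC_METHOD instead of calling set_start_method myself:

```python
import os

# Assumption: vLLM reads VLLM_WORKER_MULTIPROC_METHOD to choose how tensor-parallel
# workers are started; set it before creating the engine so they use 'spawn'.
os.environ["VLLM_WORKER_MULTIPROC_METHOD"] = "spawn"

from vllm import LLM

llm = LLM(model="facebook/opt-1.3b", tensor_parallel_size=2)  # placeholder model name
```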