Motivation.
There is an increasing need to customize vLLM, including:
- out-of-tree model registration, where users want to register their model outside of the vLLM repo. This is partially fulfilled by [Core] enable out-of-tree model register #3871, but users later found that it does not work in a distributed setting with ray: [Bug]: Ray distributed backend does not support out-of-tree models via ModelRegistry APIs #5657
- custom executor class, already added in [Core] Allow specifying custom Executor #6557
- custom scheduler, requested in [RFC]: Replaceable Scheduler #7123
- custom tensor parallel implementation, requested in [RFC]: Model architecture plugins #7124
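To make the first use case concrete, here is a rough sketch of the plugin pattern: a standalone package whose import has the side effect of registering a custom model class. The `ModelRegistry` below is a minimal stand-in for illustration, not vLLM's actual registry API, and `vllm_my_model` / `MyCustomForCausalLM` are hypothetical names.

```python
from typing import Dict, Type


class ModelRegistry:
    """Minimal stand-in for an out-of-tree model registration point."""

    _models: Dict[str, Type] = {}

    @classmethod
    def register_model(cls, arch: str, model_cls: Type) -> None:
        # Map an architecture name to a user-provided model class.
        cls._models[arch] = model_cls

    @classmethod
    def get(cls, arch: str) -> Type:
        return cls._models[arch]


# --- contents of a hypothetical plugin module `vllm_my_model` ---
class MyCustomForCausalLM:
    """Placeholder for a user-defined model implementation."""


def register() -> None:
    # Runs when the plugin module is imported, so registration happens
    # before vLLM resolves the model architecture.
    ModelRegistry.register_model("MyCustomForCausalLM", MyCustomForCausalLM)


register()
```

The key property is that all customization happens as an import-time side effect, so the plugin works the same way in single-process and distributed settings as long as every worker imports it.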
Usually, the request is to swap out some functions or classes in vLLM, or to call some functions before vLLM runs the model. While implementing these in vLLM is not difficult, the maintenance burden grows.
To satisfy the growing need for customization, I propose introducing a vLLM plugin system.
It is inspired by the pytest community, where a plugin is a standalone pypi package, e.g. https://pypi.org/project/pytest-forked/ .
#7130 is a draft implementation, where I added a new env var VLLM_PLUGINS. It works similarly to the operating system's LD_PRELOAD: a colon-separated list of Python modules to import.
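The loading mechanism could be sketched roughly as follows (`load_plugins` is a hypothetical helper name for illustration, not the draft's actual code; see #7130 for the real implementation):

```python
import importlib
import logging
import os

logger = logging.getLogger(__name__)


def load_plugins() -> None:
    """Import every module listed in VLLM_PLUGINS.

    The variable is a colon-separated list of Python module names,
    mirroring how LD_PRELOAD consumes a colon-separated list of
    shared objects.
    """
    plugins = os.environ.get("VLLM_PLUGINS", "")
    # filter(None, ...) drops empty entries from stray/trailing colons.
    for name in filter(None, plugins.split(":")):
        logger.info("Loading plugin module: %s", name)
        importlib.import_module(name)
```

A user would then run e.g. `VLLM_PLUGINS=vllm_my_model python -m vllm.entrypoints.openai.api_server ...`, and each listed module is imported before the model is loaded.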
One of the most important concerns is guarding against the risk of arbitrary code execution. When a user serves a model using vLLM, endpoint users cannot activate plugins, so this does not suffer from code-injection risk. However, there is indeed a risk if the user runs vLLM in an untrusted environment. In this case:
- we require the plugin package name to start with vllm_, so that vLLM users do not accidentally add irrelevant modules to execute
- we explicitly log the plugin modules vLLM is using, so that vLLM users can easily see if any unexpected code is executed
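These two safeguards can be sketched as follows (`validate_plugin_names` is a hypothetical helper name for illustration, not the draft's actual code):

```python
import logging
import os

logger = logging.getLogger(__name__)


def validate_plugin_names() -> list:
    """Enforce the two safeguards before any plugin is imported:
    require the vllm_ prefix, and log every module that will run."""
    names = [n for n in os.environ.get("VLLM_PLUGINS", "").split(":") if n]
    for name in names:
        if not name.startswith("vllm_"):
            # Refuse anything outside the vllm_ namespace so an
            # unrelated module cannot be pulled in by accident.
            raise ValueError(
                f"Refusing to load plugin {name!r}: plugin modules "
                "must start with the 'vllm_' prefix")
        # Make the executed code visible in the server logs.
        logger.info("vLLM will import plugin module: %s", name)
    return names
```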
With these efforts, the security level should be the same as LD_PRELOAD's. And since LD_PRELOAD has existed for so many years, I think VLLM_PLUGINS should be acceptable in terms of security risk.
Proposed Change.
See #7130 for the draft implementation.
Feedback Period.
No response
CC List.
No response
Any Other Things.
No response