[Feature]: Expose option to load new model weights from disk #12774

### 🚀 The feature, motivation and pitch

In an async RL setting, we often want to run fast generation with a vLLM endpoint on a separate node and occasionally sync model weights from disk. It would be good if the vLLM endpoint exposed this option.

### Alternatives

SGLang already exposes this option: https://docs.sglang.ai/backend/native_api.html#Update-Weights-From-Disk
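
For reference, here is a minimal sketch of how a trainer process might call that SGLang route after writing a checkpoint. The `/update_weights_from_disk` path and the `model_path` field follow the linked SGLang page; the host, port, checkpoint path, timeout, and helper names are assumptions for illustration only, and nothing here is an existing vLLM API.

```python
import json
import urllib.request


def build_update_request(base_url: str, model_path: str) -> urllib.request.Request:
    """Build a POST request for SGLang's /update_weights_from_disk route.

    `base_url` and `model_path` are supplied by the caller; the values used
    in the usage note below are placeholders, not real deployments.
    """
    payload = json.dumps({"model_path": model_path}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/update_weights_from_disk",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def update_weights_from_disk(base_url: str, model_path: str, timeout: float = 600.0) -> dict:
    """Send the request and return the server's JSON reply as a dict.

    The timeout is an assumed generous default, since reloading weights
    from disk can take a while for large models.
    """
    req = build_update_request(base_url, model_path)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A trainer would call something like `update_weights_from_disk("http://localhost:30000", "/checkpoints/step_100")` after each sync point, with both arguments being placeholders here. An equivalent route on the vLLM server is what this issue asks for.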


### Additional context

_No response_

### Before submitting a new issue...

- [x] Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the [documentation page](https://docs.vllm.ai/en/latest/), which can answer lots of frequently asked questions.
