🚀 The feature, motivation and pitch
#9160 first introduced `AutoWeightsLoader` to recursively call `load_weights` on sub-modules. This lets composite models (most notably multi-modal models) use language backbones (`*Model` classes such as `LlamaModel`) without having to repeat their weight loading logic.
Currently, `load_weights` is only implemented in a few language backbones. It would be great to standardize this approach and apply it to all language backbones in vLLM. The steps to do this are pretty straightforward:
- Move the existing `load_weights` function from `*ForCausalLM` to `*Model`.
- Create a new `load_weights` function in `*ForCausalLM` that loads the weights using `AutoWeightsLoader`.
- Move any logic in `*Model.load_weights` that only applies to `*ForCausalLM` back to `*ForCausalLM.load_weights`. Usually, this involves `lm_head`. (A sketch of the resulting structure is shown below.)
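As a rough illustration (not taken from the vLLM source), the end state looks something like the sketch below. `MyModel` and `MyForCausalLM` are hypothetical names, and the backbone's per-parameter loop is heavily simplified compared to the real backbones, which also handle stacked-parameter mapping, quantization, and so on:

```python
# Hedged sketch only; class names are hypothetical and signatures may
# differ slightly between vLLM versions.
from collections.abc import Iterable

import torch
from torch import nn

from vllm.model_executor.models.utils import AutoWeightsLoader


class MyModel(nn.Module):
    """Language backbone (the *Model class, e.g. LlamaModel)."""

    def __init__(self) -> None:
        super().__init__()
        self.embed_tokens = nn.Embedding(32, 16)  # placeholder sizes

    def load_weights(self, weights: Iterable[tuple[str, torch.Tensor]]) -> set[str]:
        # The per-parameter loading logic that used to live in
        # *ForCausalLM.load_weights moves here (simplified: real backbones
        # also handle stacked-parameter mapping, quantization, etc.).
        params = dict(self.named_parameters())
        loaded: set[str] = set()
        for name, weight in weights:
            params[name].data.copy_(weight)
            loaded.add(name)
        return loaded


class MyForCausalLM(nn.Module):
    """Top-level model (the *ForCausalLM class)."""

    def __init__(self) -> None:
        super().__init__()
        self.model = MyModel()
        self.lm_head = nn.Linear(16, 32, bias=False)  # placeholder sizes

    def load_weights(self, weights: Iterable[tuple[str, torch.Tensor]]) -> set[str]:
        # Delegate to the backbone via AutoWeightsLoader, which recursively
        # calls load_weights on sub-modules. Logic that only applies to
        # *ForCausalLM (e.g. handling lm_head) stays here.
        loader = AutoWeightsLoader(self)
        return loader.load_weights(weights)
```

With this in place, a composite model can instantiate the backbone directly and reuse its `load_weights` instead of duplicating the loading logic.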
For reference, you can look at the existing implementations for models such as Llama, Gemma2/3, Qwen2 and ChatGLM.
To avoid scope creep, I suggest opening a PR that updates only a few models at a time.
Alternatives
No response
Additional context
No response
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.