
Conversation

Member

@hmellor hmellor commented Jul 28, 2025

Previously, embedding model checkpoints that had their layers at the root of the checkpoint would not load correctly with the Transformers backend.

This PR enables the loading of Transformers base model classes.

Now, both of the following checkpoint formats will work for pooling tasks:

ModelForCausalLM:

- model
  - layers
    - ...
  - lm_head

Model:

- layers
  - ...
- ...
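To make the two layouts concrete, here is a minimal sketch (not vLLM's actual implementation; the function name and the use of plain dicts in place of real tensors are assumptions) of how a headless `Model` checkpoint's weight names can be remapped to the wrapped `ModelForCausalLM` layout:

```python
# Illustrative sketch only: remap weight names from a headless "Model"
# checkpoint (layers at the root) to the wrapped "ModelForCausalLM"
# layout (layers nested under a "model." prefix).

def remap_headless_weights(weights: dict[str, object]) -> dict[str, object]:
    """Prefix every weight name with "model." unless it already has it."""
    remapped = {}
    for name, tensor in weights.items():
        if not name.startswith("model."):
            name = f"model.{name}"
        remapped[name] = tensor
    return remapped

# A headless checkpoint keeps its layers at the root...
headless = {"layers.0.self_attn.q_proj.weight": "tensor"}
print(remap_headless_weights(headless))

# ...while an already-wrapped checkpoint passes through unchanged.
wrapped = {"model.layers.0.self_attn.q_proj.weight": "tensor"}
print(remap_headless_weights(wrapped))
```

With a remapping like this in place, both checkpoint formats resolve to the same internal names, which is what lets the backend load either one for pooling.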

@github-actions

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, they only run fastcheck CI, a small and essential subset of CI tests that quickly catches errors. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@hmellor hmellor changed the title Enable headless models for embedding in the Transformers backend Enable headless models for pooling in the Transformers backend Jul 28, 2025
@mergify mergify bot added the new-model (Requests to new models) label Jul 28, 2025
@hmellor hmellor requested a review from Isotr0py July 28, 2025 15:08
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request enables loading headless models for embedding with the Transformers backend, which is a great addition. The changes in the configuration and registry look correct, and the new test case covers the intended scenarios.

However, I've found a critical issue in the WeightsMapper implementation for the new TransformersModel. The current logic for prefixing weights is flawed due to incorrect key ordering in the dictionary, which will cause weight loading to fail for one of the model formats this PR aims to support. I've provided a detailed comment and a code suggestion to fix this.
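The ordering pitfall the review describes can be illustrated with a small first-match prefix mapper. This is a hypothetical sketch, not the actual `WeightsMapper` code; the function name and the example mapping entries are assumptions made for illustration:

```python
# Hypothetical first-match prefix mapper. Python dicts preserve insertion
# order, so the order in which prefixes are declared decides which rule
# fires first.

def apply_prefix_map(name: str, prefix_map: dict[str, str]) -> str:
    """Rewrite the first matching prefix of a weight name; first match wins."""
    for old, new in prefix_map.items():
        if name.startswith(old):
            return new + name[len(old):]
    return name

# Buggy ordering: the catch-all "" is declared first, matches every name,
# and shadows the more specific entry that follows it.
buggy = {"": "model.", "lm_head.": "lm_head."}
print(apply_prefix_map("lm_head.weight", buggy))   # model.lm_head.weight (wrong)

# Fixed ordering: specific prefixes before the catch-all.
fixed = {"lm_head.": "lm_head.", "": "model."}
print(apply_prefix_map("lm_head.weight", fixed))   # lm_head.weight
```

In short, with a first-match mapper, a catch-all empty prefix must come last, or it silently rewrites names that have their own dedicated rule.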

Member

@DarkLight1337 DarkLight1337 left a comment


Thanks for extending this support!

@DarkLight1337 DarkLight1337 enabled auto-merge (squash) July 28, 2025 15:25
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Jul 28, 2025
@hmellor hmellor disabled auto-merge July 29, 2025 17:13
@hmellor
Member Author

hmellor commented Jul 29, 2025

The mapper is having issues; I'll disable auto-merge for now.

@hmellor hmellor enabled auto-merge (squash) August 1, 2025 16:54
@vllm-bot vllm-bot merged commit 38c8bce into vllm-project:main Aug 1, 2025
41 of 44 checks passed
@hmellor hmellor deleted the transformers-backend-base-model-loading branch August 2, 2025 08:59
npanpaliya pushed a commit to odh-on-pz/vllm-upstream that referenced this pull request Aug 6, 2025
jinzhen-lin pushed a commit to jinzhen-lin/vllm that referenced this pull request Aug 9, 2025
noamgat pushed a commit to noamgat/vllm that referenced this pull request Aug 9, 2025
paulpak58 pushed a commit to paulpak58/vllm that referenced this pull request Aug 13, 2025
diegocastanibm pushed a commit to diegocastanibm/vllm that referenced this pull request Aug 15, 2025
epwalsh pushed a commit to epwalsh/vllm that referenced this pull request Aug 28, 2025
zhewenl pushed a commit to zhewenl/vllm that referenced this pull request Aug 28, 2025
@hmellor hmellor moved this to Done in Transformers backend Sep 24, 2025