chore: update transformers to support Qwen3 architecture #620
PR Summary
Updates transformers dependency in libs/infinity_emb/pyproject.toml to support Qwen3 architecture by raising minimum version to 4.51.3.
- Updated transformers dependency from >=4.47.0,<=5.0 to >=4.51.3,<=5.0 while maintaining backward compatibility
- Missing test coverage for Qwen3 model support despite being a checklist item
- No documentation updates in docs folder to reflect new model support
- Poetry.lock updated but changes not included in the review context
1 file reviewed, no comments
Can we please merge this, @michaelfeil?
Updating transformers alone does not solve the problem; Qwen3-reranker-8B still cannot be deployed. See #611 (comment).
Try using tomaarsen/Qwen3-Reranker-4B-seq-cls.
What is the difference between tomaarsen/Qwen3-Reranker-8B-seq-cls and the official Qwen/Qwen3-reranker-8B? Thanks.
The difference is mentioned here: https://huggingface.co/Qwen/Qwen3-Reranker-0.6B/discussions/3
Apparently, the original I don't think infinity has a way to support the reranker as of now through
I was able to get it working on Modal: https://gist.github.com/aryasaatvik/6f868d5f064b25e41cf1e20018721dca
vLLM 0.9.2 already supports deploying Qwen3-Reranker-8B and Qwen3-Embedding-8B.
An update of the transformers library will break all
Migrating these libraries is a large undertaking. Migrating BERT from BetterTransformer to, e.g., torch + compile will lead to a 50% performance loss in throughput and latency.
Related Issue
Adds support for Qwen3 model architecture.
Summary
This PR updates the transformers dependency to ensure compatibility with Qwen3 models.
Changes:
>=4.51.3,<=5.0
Why this is needed:
Checklist
Additional Notes
This is a dependency update that maintains backward compatibility.
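For reviewers who want to sanity-check the new bound, here is a minimal sketch of what the constraint >=4.51.3,<=5.0 admits. The helper names `parse_release` and `satisfies` are hypothetical and not part of infinity_emb. The comparison is deliberately simplified: it handles only plain numeric releases such as 4.51.3, not pre-release or dev tags. Real resolvers follow the full PEP 440 rules.

```python
def parse_release(version: str) -> tuple[int, ...]:
    """Turn '4.51.3' into (4, 51, 3). Only plain numeric releases are handled."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version: str, minimum: str = "4.51.3", maximum: str = "5.0") -> bool:
    """Return True if minimum <= version <= maximum (both bounds inclusive)."""
    parts = [parse_release(v) for v in (version, minimum, maximum)]
    width = max(len(p) for p in parts)
    # Pad with zeros so that '5.0' and '5.0.0' compare as equal.
    ver, low, high = (p + (0,) * (width - len(p)) for p in parts)
    return low <= ver <= high


if __name__ == "__main__":
    for candidate in ("4.47.0", "4.51.3", "4.53.0", "5.0.0", "5.0.1"):
        print(candidate, satisfies(candidate))
```

Under these assumptions, an environment pinned to 4.47.0 passes the old lower bound but is rejected by the new one, so anything that was locked below 4.51.3 would need to be re-resolved.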