Conversation

@aryasaatvik

Related Issue

Adds support for the Qwen3 model architecture.

Summary

This PR updates the transformers dependency to ensure compatibility with Qwen3 models.

Changes:

  • Updated transformers version constraint in pyproject.toml to >=4.51.3,<=5.0 (see the sketch below)
  • Regenerated poetry.lock with updated dependencies
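
For reference, the changed constraint in libs/infinity_emb/pyproject.toml looks roughly like this (a sketch; the surrounding file layout and any extras markers may differ):

```toml
# libs/infinity_emb/pyproject.toml (excerpt, sketch)
[tool.poetry.dependencies]
# raised from ">=4.47.0,<=5.0"; Qwen3 support landed in the transformers 4.51 series
transformers = ">=4.51.3,<=5.0"
```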

Why this is needed:

  • Enables users to serve Qwen3 models
  • Keeps dependencies up to date with latest model architectures

Checklist

  • I have read the CONTRIBUTING guidelines.
  • I have added tests to cover my changes.
  • I have updated the documentation (docs folder) accordingly.

Additional Notes

This is a dependency update that maintains backward compatibility.


@greptile-apps greptile-apps bot left a comment


PR Summary

Updates transformers dependency in libs/infinity_emb/pyproject.toml to support Qwen3 architecture by raising minimum version to 4.51.3.

  • Updated transformers dependency from >=4.47.0,<=5.0 to >=4.51.3,<=5.0 while maintaining backward compatibility
  • Missing test coverage for Qwen3 model support despite being a checklist item
  • No documentation updates in docs folder to reflect new model support
  • Poetry.lock updated but changes not included in the review context

1 file reviewed, no comments

@aryasaatvik changed the title from "Chore: Update transformers dependency to support Qwen3 architecture" to "chore: update transformers to support Qwen3 architecture" on Jul 10, 2025
@Abhi011999

Can we please merge this, @michaelfeil? Qwen3 is now the current SOTA.

@zengqingfu1442

Updating transformers alone does not solve the problem; Qwen3-Reranker-8B still cannot be deployed. See #611 (comment).

@aryasaatvik
Author

> Updating transformers alone does not solve the problem; Qwen3-Reranker-8B still cannot be deployed. See #611 (comment).

Try using tomaarsen/Qwen3-Reranker-4B-seq-cls.
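
For anyone trying that checkpoint outside infinity, here is a minimal scoring sketch with plain transformers. It assumes the seq-cls conversion loads as a standard single-logit sequence-classification model, as its model card describes; the query/document strings are placeholders:

```python
# Sketch: scoring a query/document pair with the seq-cls conversion.
# Assumes tomaarsen/Qwen3-Reranker-4B-seq-cls exposes a single-logit
# Qwen3ForSequenceClassification head; the strings below are placeholders.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "tomaarsen/Qwen3-Reranker-4B-seq-cls"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id).eval()

query = "what is infinity?"
doc = "Infinity is a high-throughput embedding and reranking server."
inputs = tokenizer(query, doc, truncation=True, return_tensors="pt")
with torch.no_grad():
    score = model(**inputs).logits.squeeze(-1)  # one relevance logit per pair
print(score.item())
```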

@zengqingfu1442

> > Updating transformers alone does not solve the problem; Qwen3-Reranker-8B still cannot be deployed. See #611 (comment).
>
> Try using tomaarsen/Qwen3-Reranker-4B-seq-cls.

What is the difference between tomaarsen/Qwen3-Reranker-8B-seq-cls and the official Qwen/Qwen3-Reranker-8B? Thanks.

@Abhi011999

The difference is mentioned here: https://huggingface.co/Qwen/Qwen3-Reranker-0.6B/discussions/3
They also implemented a somewhat hacky method to integrate with vllm: vllm-project/vllm#19260

Apparently, for the original Qwen/Qwen3-Reranker-8B, both the embedder and the reranker use the Qwen3ForCausalLM architecture.

I don't think infinity currently has a way to support the reranker through Qwen3ForSequenceClassification; vllm has already implemented it. I am wondering whether there's a way to support it here too.
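
For context, the causal-LM trick the official checkpoints use (per the Qwen3-Reranker model cards) reads the relevance score off the "yes"/"no" token logits at the final position. A rough sketch, with the real chat-template prompt simplified away and the 0.6B variant used for illustration:

```python
# Sketch of causal-LM reranker scoring: relevance = P("yes") vs P("no")
# at the last token. The official prompt template is simplified here.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Qwen/Qwen3-Reranker-0.6B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id).eval()

yes_id = tokenizer.convert_tokens_to_ids("yes")
no_id = tokenizer.convert_tokens_to_ids("no")

prompt = "Query: what is infinity?\nDocument: Infinity is an embedding server.\nRelevant (yes/no):"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    last_logits = model(**inputs).logits[0, -1]  # logits at the final position
score = torch.softmax(last_logits[[no_id, yes_id]], dim=0)[1].item()
print(score)  # probability mass on "yes" = relevance score
```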

@aryasaatvik
Author

> The difference is mentioned here: https://huggingface.co/Qwen/Qwen3-Reranker-0.6B/discussions/3 They also implemented a somewhat hacky method to integrate with vllm: vllm-project/vllm#19260
>
> Apparently, for the original Qwen/Qwen3-Reranker-8B, both the embedder and the reranker use the Qwen3ForCausalLM architecture.
>
> I don't think infinity currently has a way to support the reranker through Qwen3ForSequenceClassification; vllm has already implemented it. I am wondering whether there's a way to support it here too.

I was able to get it working on Modal: https://gist.github.com/aryasaatvik/6f868d5f064b25e41cf1e20018721dca

@zengqingfu1442

> The difference is mentioned here: https://huggingface.co/Qwen/Qwen3-Reranker-0.6B/discussions/3 They also implemented a somewhat hacky method to integrate with vllm: vllm-project/vllm#19260
>
> Apparently, for the original Qwen/Qwen3-Reranker-8B, both the embedder and the reranker use the Qwen3ForCausalLM architecture.
>
> I don't think infinity currently has a way to support the reranker through Qwen3ForSequenceClassification; vllm has already implemented it. I am wondering whether there's a way to support it here too.

vllm 0.9.2 already supports deploying Qwen3-Reranker-8B and Qwen3-Embedding-8B.

@michaelfeil
Owner

An update of the transformers library will break all

  • onnx-optimum
  • bettertransformer dependencies

Migrating these libraries is a large undertaking. Migrating BERT from bettertransformer to e.g. torch.compile will lead to a ~50% performance loss in throughput and latency.
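
For illustration, the torch.compile path mentioned above would look roughly like this for a plain BERT encoder (a sketch, not how infinity currently wires up its torch engine):

```python
# Sketch: compiling a BERT encoder with torch.compile instead of relying
# on optimum's BetterTransformer fast path.
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id).eval()
model = torch.compile(model)  # kernel fusion via TorchInductor

inputs = tokenizer(["hello world"], return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state  # per-token embeddings
```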
