The model to consider.
The Qwen2 series, for example https://huggingface.co/Qwen/Qwen2-7B
The closest model vllm already supports.
Qwen1.5 series
What's your difficulty of supporting the model you want?
The current version (v0.5.0-post1) does not have the kernel needed to run the Qwen2 series.
The intermediate sizes of the Qwen2 models are not yet supported by punica.
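
For anyone hitting the same problem, here is a rough sketch of the kind of check involved, assuming the kernels only accept a fixed set of dimensions. The helper name `unsupported_dims` and all of the sizes below are hypothetical placeholders for illustration; they are not taken from the punica sources or from any Qwen2 config.

```python
def unsupported_dims(config: dict, supported: set) -> dict:
    """Return the size entries in `config` that are missing from `supported`."""
    keys = ("hidden_size", "intermediate_size")
    return {k: config[k] for k in keys if k in config and config[k] not in supported}


# Placeholder values for illustration only (not a real kernel list or model config).
example_config = {"hidden_size": 4096, "intermediate_size": 18944}
example_supported = {4096, 11008}

print(unsupported_dims(example_config, example_supported))
```

An empty result would mean every checked dimension is covered; anything returned is a size the kernels would need to add.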
The problem has now been solved by #5441.
I'm glad to see that the code has been bumped and that a new version will be released in the next few days!