[Easy][Model Registry] Add Llama4ForCausalLM in model registry #19580
Conversation
Signed-off-by: Zijing Liu <[email protected]>
pre-commit runs fine locally; I will wait and see if there are any CI failure signals.
Put it on hold, and just check the CI.
| "LlamaForCausalLM": ("llama", "LlamaForCausalLM"), | ||
| # For decapoda-research/llama-* | ||
| "LLaMAForCausalLM": ("llama", "LlamaForCausalLM"), | ||
| "Llama4ForCausalLM": ("llama4", "Llama4ForCausalLM"), |
Currently our basic-models-test always assumes that the tested architectures have a corresponding huggingface model repository to test with.
Lines 440 to 447 in ace5cda
_EXAMPLE_MODELS = {
    **_TEXT_GENERATION_EXAMPLE_MODELS,
    **_EMBEDDING_EXAMPLE_MODELS,
    **_CROSS_ENCODER_EXAMPLE_MODELS,
    **_MULTIMODAL_EXAMPLE_MODELS,
    **_SPECULATIVE_DECODING_EXAMPLE_MODELS,
    **_TRANSFORMERS_MODELS,
}
Do you think it's possible to add a dummy model repo on HF with the architecture Llama4ForCausalLM? Alternatively, you will need to modify test_registry.py for CI to pass.
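For reference, a dummy entry in the example-model tables shown above might look roughly like the sketch below. This is only a sketch: it assumes the _HfExamplesInfo helper used elsewhere in those tables and its is_available_online flag, and the repo id is a made-up placeholder, not a real checkpoint.

    # Rough sketch, not a definitive patch: register a dummy example for the
    # new architecture so basic-models-test has something to resolve, and mark
    # it as not downloadable so CI does not try to fetch it.
    "Llama4ForCausalLM": _HfExamplesInfo(
        "your-org/llama4-text-only-dummy",  # hypothetical placeholder repo id
        is_available_online=False,
    ),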
| "LlamaForCausalLM": ("llama", "LlamaForCausalLM"), | ||
| # For decapoda-research/llama-* | ||
| "LLaMAForCausalLM": ("llama", "LlamaForCausalLM"), | ||
| "Llama4ForCausalLM": ("llama4", "Llama4ForCausalLM"), |
On a related note, I think the proper way to support the text-only usage of models that are released as "natively multimodal", such as llama4 or mistral-small 3.1, is to add a --language-model-only mode.
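To make the idea concrete, a purely illustrative sketch of what such a mode could do is below. The flag name, the mapping table, and the resolve_architectures helper are all hypothetical; nothing here is an existing vLLM option.

    # Hypothetical sketch of a --language-model-only mode: before the model
    # class is resolved from the registry, remap a natively multimodal
    # architecture to its text-only counterpart.
    _TEXT_ONLY_FALLBACK = {
        # multimodal architecture -> text-only architecture (illustrative)
        "Llama4ForConditionalGeneration": "Llama4ForCausalLM",
    }

    def resolve_architectures(architectures: list[str],
                              language_model_only: bool) -> list[str]:
        if not language_model_only:
            return architectures
        return [_TEXT_ONLY_FALLBACK.get(arch, arch) for arch in architectures]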
Maybe we should just go with the "--language-model-only" solution? @liuzijing2014 thoughts?
I see, I will try out this idea for Llama4.
@liuzijing2014 Happy to collaborate on this! This was one of the items that I'm planning to work on too :)
Purpose
Allow vLLM to run the text-only Llama4 model, aka Llama4ForCausalLM.
Test Plan
Run vLLM with a text-only Llama4 Maverick checkpoint (a vendor-internal one).
Test Result
The model was successfully recognized and loaded.
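For completeness, a minimal way to exercise the new registry entry could look like the sketch below; the checkpoint path is a placeholder for the vendor-internal text-only Llama4 Maverick checkpoint mentioned above.

    # Minimal sketch (the checkpoint path is a placeholder): load a checkpoint
    # whose config.json lists "architectures": ["Llama4ForCausalLM"] and run a
    # single generation to confirm the model is recognized and loads.
    from vllm import LLM, SamplingParams

    llm = LLM(model="/path/to/text-only-llama4-maverick")
    outputs = llm.generate(["Hello, my name is"],
                           SamplingParams(max_tokens=16))
    print(outputs[0].outputs[0].text)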