
Conversation

@shawntan (Contributor)

Adds Granite model class to vLLM.

The model will be available on Hugging Face once huggingface/transformers#31502 is merged.

@github-actions

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which consists of a small, essential subset of CI tests to catch errors quickly. You can run other CI tests on top of the default ones by unblocking the steps in your fast-check build in the Buildkite UI.

Once the PR is approved and ready to go, please make sure to run full CI as it is required to merge (or just use auto-merge).

To run full CI, you can do one of these:

  • Comment /ready on the PR
  • Add ready label to the PR
  • Enable auto-merge.

🚀

@njhill (Member)

njhill commented Aug 12, 2024

Thanks @shawntan! Could you rebase on the latest main branch, and add a test in https://github.com/vllm-project/vllm/tree/main/tests/models similar to the other model architectures?
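
For reference, a minimal sketch of what such a test could look like, following the hf_runner/vllm_runner pattern used by the existing tests under tests/models (the checkpoint name below is a placeholder, not the actual Granite model ID):

```python
import pytest

# Placeholder checkpoint name, for illustration only.
MODELS = ["ibm/granite-base"]


@pytest.mark.parametrize("model", MODELS)
@pytest.mark.parametrize("dtype", ["bfloat16"])
@pytest.mark.parametrize("max_tokens", [64])
def test_models(hf_runner, vllm_runner, example_prompts, model, dtype,
                max_tokens):
    # Generate greedily with the Hugging Face reference implementation...
    with hf_runner(model, dtype=dtype) as hf_model:
        hf_outputs = hf_model.generate_greedy(example_prompts, max_tokens)

    # ...and with vLLM, then compare the decoded strings prompt by prompt.
    with vllm_runner(model, dtype=dtype) as vllm_model:
        vllm_outputs = vllm_model.generate_greedy(example_prompts, max_tokens)

    for i in range(len(example_prompts)):
        _, hf_str = hf_outputs[i]
        _, vllm_str = vllm_outputs[i]
        assert hf_str == vllm_str, f"Mismatch on prompt {i}"
```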

@njhill marked this pull request as draft on August 12, 2024, 22:34
@njhill (Member)

njhill commented Aug 12, 2024

@shawntan I have moved this to draft since it currently depends on a future version of the transformers library.

How about including the GraniteConfig class in its own file here in the meantime? Then we can clean that up later once vLLM moves to the necessary transformers version.
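
A rough sketch of what that vendored config could look like, assuming a Llama-like architecture plus the Granite scaling multipliers (attribute names follow the transformers PR but may differ in detail):

```python
from transformers import PretrainedConfig


class GraniteConfig(PretrainedConfig):
    """Temporary stand-in until vLLM depends on a transformers release
    that ships GraniteConfig."""

    model_type = "granite"

    def __init__(
        self,
        vocab_size=49152,
        hidden_size=4096,
        intermediate_size=11008,
        num_hidden_layers=32,
        num_attention_heads=32,
        num_key_value_heads=None,
        hidden_act="silu",
        max_position_embeddings=4096,
        rms_norm_eps=1e-6,
        # Granite-specific scaling factors; 1.0 recovers plain Llama behaviour.
        embedding_multiplier=1.0,
        attention_multiplier=1.0,
        residual_multiplier=1.0,
        logits_scaling=1.0,
        **kwargs,
    ):
        self.vocab_size = vocab_size
        self.hidden_size = hidden_size
        self.intermediate_size = intermediate_size
        self.num_hidden_layers = num_hidden_layers
        self.num_attention_heads = num_attention_heads
        self.num_key_value_heads = num_key_value_heads or num_attention_heads
        self.hidden_act = hidden_act
        self.max_position_embeddings = max_position_embeddings
        self.rms_norm_eps = rms_norm_eps
        self.embedding_multiplier = embedding_multiplier
        self.attention_multiplier = attention_multiplier
        self.residual_multiplier = residual_multiplier
        self.logits_scaling = logits_scaling
        super().__init__(**kwargs)
```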

@shawntan force-pushed the granite branch 2 times, most recently from b79bb1d to 19d622f on August 13, 2024, 21:17
@njhill (Member)

njhill commented Aug 16, 2024

@shawntan to avoid duplication and maintenance overhead, would it make more sense to just add the optional multipliers to llama.py?
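
The gist of that approach, as a toy sketch (attribute names are illustrative): read each multiplier off the config with a default of 1.0, so plain Llama checkpoints are unaffected, and apply it on the corresponding branch.

```python
import torch
import torch.nn as nn


class ScaledResidualBlock(nn.Module):
    """Toy block illustrating an optional residual multiplier; with the
    default of 1.0 it reduces to the standard Llama residual connection."""

    def __init__(self, hidden_size: int, residual_multiplier: float = 1.0):
        super().__init__()
        self.linear = nn.Linear(hidden_size, hidden_size)
        self.residual_multiplier = residual_multiplier

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Scale only the sub-layer output, then add the residual.
        return x + self.linear(x) * self.residual_multiplier


# In llama.py the multiplier would come from the model config, e.g.:
#     residual_multiplier = getattr(config, "residual_multiplier", 1.0)
# which keeps configs without the attribute working unchanged.
```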

@shawntan (Contributor, Author)

@njhill HF PR merged.

@njhill (Member) left a comment


Thanks @shawntan, looks great.

However, we may need to keep the config class here for the time being, until a new transformers version is released containing it and vLLM moves to that version. Could you reinstate that?

And take it out of draft if it's now ready to be merged?

@njhill marked this pull request as ready for review on August 29, 2024, 17:45
@njhill added the ready label (ONLY add when PR is ready to merge/full CI is needed) on Aug 29, 2024
@njhill merged commit f8d6014 into vllm-project:main on Sep 2, 2024
Alvant pushed a commit to compressa-ai/vllm that referenced this pull request Oct 26, 2024
LeiWang1999 pushed a commit to LeiWang1999/vllm-bitblas that referenced this pull request Mar 26, 2025
