Conversation

@WoosukKwon (Collaborator)

Should be merged after #53

This PR adds support for the bfloat16 data type, which is used by some LLMs, including Dolly V2.
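For context, bfloat16 is a 16-bit format that keeps float32's 8-bit exponent but truncates the mantissa to 7 bits, so it covers the same dynamic range at reduced precision. As an illustrative sketch (not code from this PR), the conversion can be demonstrated in pure Python by truncating the low 16 bits of the IEEE-754 float32 representation:

```python
import struct

def float_to_bfloat16_bits(x: float) -> int:
    # Pack as IEEE-754 float32, then keep only the top 16 bits
    # (sign, 8-bit exponent, 7 mantissa bits): the bfloat16 encoding.
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return bits >> 16

def bfloat16_to_float(bits: int) -> float:
    # Re-expand to float32 by zero-filling the truncated mantissa bits.
    (x,) = struct.unpack("<f", struct.pack("<I", bits << 16))
    return x

# The 8-bit exponent survives, but only ~2-3 decimal digits of
# precision remain: pi round-trips to 3.140625.
roundtrip = bfloat16_to_float(float_to_bfloat16_bits(3.141592653589793))
```

This truncation (rather than round-to-nearest) is the simplest variant; actual hardware and framework implementations typically round.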

@WoosukKwon merged commit e070829 into main May 3, 2023
@WoosukKwon deleted the support-bfloat16 branch May 3, 2023 21:09
hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024
yukavio pushed a commit to yukavio/vllm that referenced this pull request Jul 3, 2024
SUMMARY
* `yapf` format a couple of test files

TEST PLAN:
Ran `yapf` in-place locally to update the files.
dllehr-amd pushed a commit to dllehr-amd/vllm that referenced this pull request Jul 22, 2024
* Adds the wvSpltK optimization for skinny GEMM.


---------

Co-authored-by: Hashem Hashemi <[email protected]>
JHLEE17 pushed a commit to JHLEE17/vllm that referenced this pull request Aug 1, 2024
@alixiaodi mentioned this pull request Aug 2, 2024
heheda12345 pushed a commit to heheda12345/vllm that referenced this pull request Sep 29, 2025
* Pad flashmla_sparse to 128 on blackwell

* adjust get_max_prefill_buffer_size

* change comments
amy-why-3459 pushed a commit to amy-why-3459/vllm that referenced this pull request Oct 10, 2025
… API (vllm-project#54)

* use rpc to bypass openAI API

Signed-off-by: wuhang <[email protected]>

* example run

---------