[2/N] Chunked prefill data update #3538
Merged
simon-mo merged 127 commits into vllm-project:main from rkooo567:chunked-prefill-scheduler-data-update on Mar 28, 2024
Commits
06fe872
[1/n] Support efficient reshape caching.
rkooo567 9a0b6be
[2/n] support flash attention kernel
rkooo567 6947167
oss flash attention works
rkooo567 4769a26
in progress
rkooo567 963db44
flash attn enabled.
rkooo567 2b9c36b
ip
rkooo567 2c1bb6c
support every model
rkooo567 2bb5e62
Fixed broken tests.
rkooo567 4d6a05f
[2/n] scheduler changes
rkooo567 0831f84
[2/n] ip
rkooo567 f31371f
[2/n]ip
rkooo567 78bb887
ip
rkooo567 b9d93c5
Merge branch 'chunked-prefill-3' into chunked-prefill-scheduler
rkooo567 42dd362
[2/n] ip
rkooo567 74ac900
seems to work.
rkooo567 e3afc25
Merge branch 'chunked-prefill-3' into chunked-prefill-scheduler
rkooo567 6141885
[2/n] ip
rkooo567 71bdada
.
rkooo567 d4c3b5d
ip?
rkooo567 baef7c6
block tables updated correctly
rkooo567 d503a22
Merge branch 'chunked-prefill-3' into chunked-prefill-scheduler
rkooo567 a12ec68
hopefully tests pass
rkooo567 85760db
Merge branch 'chunked-prefill-3' into chunked-prefill-scheduler
rkooo567 e40bc45
[2/n] update sequence data
rkooo567 d85670f
[2/n] add prefill range apis
rkooo567 0d8785f
Merge branch 'main' into chunked-prefill-3
rkooo567 08c8541
.
rkooo567 3bac9af
ip
rkooo567 0ca1284
add data.
rkooo567 2487bda
ip
rkooo567 81151e8
ip
rkooo567 31aa920
ip
rkooo567 2049b35
.
rkooo567 ef679d7
.
rkooo567 71bda97
.
rkooo567 4e00e7f
done?
rkooo567 c5f3a0d
Merge branch 'chunked-prefill-3' into chunked-prefill-scheduler
rkooo567 7fd70f2
Merge branch 'main' into chunked-prefill-3
rkooo567 9bbb04e
Merge branch 'chunked-prefill-3' into chunked-prefill-scheduler-data-…
rkooo567 9177d54
Merge branch 'main' into chunked-prefill-3
rkooo567 5e47c1e
Merge branch 'chunked-prefill-3' into chunked-prefill-scheduler-data-…
rkooo567 c0384a4
Refactor 2d query to 1d query
rkooo567 6032edf
.,
rkooo567 c1ab0b0
done
rkooo567 f48dc72
Addressed code review.
rkooo567 769b2b4
working
rkooo567 4a20f4a
Merge branch 'main' into 1dquery
rkooo567 f7347b8
working
rkooo567 d931725
Merge branch 'main' into 1dquery
rkooo567 f91d73e
fix lora
rkooo567 f7d79da
fixed
rkooo567 851c018
Merge branch 'main' into 1dquery
rkooo567 406f1d4
fix
rkooo567 c66ec36
Merge branch '1dquery' into chunked-prefill-scheduler-data-update
rkooo567 c067a4c
working.
rkooo567 e1f244a
clean up.
rkooo567 d09eaf5
.
rkooo567 4a8ab3c
Merge branch 'main' into chunked-prefill-scheduler-data-update
rkooo567 a08e65e
Merge branch 'main' into 1dquery
rkooo567 d9532f8
Merge branch '1dquery' into chunked-prefill-scheduler-data-update
rkooo567 93a7b90
.
rkooo567 b4b94c6
Merge branch '1dquery' into chunked-prefill-scheduler-data-update
rkooo567 647d8cc
.
rkooo567 65ac6ce
Merge branch '1dquery' into chunked-prefill-scheduler-data-update
rkooo567 b2f4b3e
ip
rkooo567 cc8419f
.
rkooo567 76e7ca8
Merge branch '1dquery' into chunked-prefill-scheduler-data-update
rkooo567 d3d0336
Merge branch 'main' into 1dquery
rkooo567 11ec167
Merge branch '1dquery' into chunked-prefill-scheduler-data-update
rkooo567 3cb8093
ip addressing comments.
rkooo567 5391129
Alibi slopes working now.
rkooo567 6b04443
Merge branch 'main' into 1dquery
rkooo567 fe344f6
add new fields
rkooo567 e619c4e
Flash attn works now
rkooo567 9c86aa3
Linting
rkooo567 5b4aa09
temporary
rkooo567 03dd155
Merge branch '1dquery' into chunked-prefill-scheduler-data-update
rkooo567 4cced78
fix tests
rkooo567 cdb7a2c
Fixed
rkooo567 276be06
Merge branch '1dquery' into chunked-prefill-scheduler-data-update
rkooo567 d87b651
Pass unit tests.
rkooo567 2c18896
experiment
rkooo567 b46f902
.
rkooo567 07b22f8
.
rkooo567 9bd7ea1
.
rkooo567 c55402f
trial
rkooo567 a13cf7e
remove --fork
rkooo567 c5c5581
Merge branch 'main' into 1dquery
rkooo567 ec91304
fixed
rkooo567 4977e53
Merge branch '1dquery' into chunked-prefill-scheduler-data-update
rkooo567 4a54688
Merge branch 'main' into 1dquery
rkooo567 2e6e919
Addressed code review.
rkooo567 1f6f6b0
Merge branch 'main' into 1dquery
rkooo567 ac7828c
revert removing forked
rkooo567 3d7f1a1
done
rkooo567 bcdd74a
Merge branch 'main' into 1dquery
rkooo567 fa3ce4e
final code review.
rkooo567 a83b235
Merge branch '1dquery' into chunked-prefill-scheduler-data-update
rkooo567 7205ef9
Merge branch 'main' into chunked-prefill-scheduler-data-update
rkooo567 8bc0af5
.
rkooo567 97bcb6f
ip
rkooo567 df34350
working except tests.
rkooo567 e70e03d
.
rkooo567 f89f428
ip
rkooo567 bf02f8e
done
rkooo567 ad43095
done
rkooo567 16b6196
Addressed code review.
rkooo567 916abc8
merge conflict fixed
rkooo567 5002e61
update
rkooo567 80f51ea
test fix
rkooo567 3cc5e99
Merge branch 'main' into chunked-prefill-scheduler-data-update
rkooo567 fa7ba35
lint
rkooo567 51cf7f2
fix broken tests.
rkooo567 cdee1c6
.
rkooo567 16e3a7d
done
rkooo567 e0d301c
remove num chunked prefill from seq group metadata
rkooo567 5e0f87e
change apis
rkooo567 6e72648
cleaned
rkooo567 4f869be
now working
rkooo567 4f63c57
update with new apis
rkooo567 5c3abf4
working!
rkooo567 66f3fcf
fixed
rkooo567 9c12d8e
Merge branch 'main' into chunked-prefill-scheduler-data-update
rkooo567 9d4b65c
Addressed code review.
rkooo567 54a58b2
Merge branch 'main' into chunked-prefill-scheduler-data-update
rkooo567 9bdb9dc
fix tests.
rkooo567 88126a9
fixed a bug
rkooo567