forked from huggingface/trl
Introduce TP in coloc mode #4
Draft
toslali-ibm wants to merge 44 commits into coloc from tpcoloc
Commits
5293d07 Introduce TP in coloc mode (toslali-ibm)
7e6245e Introduce sleep mode in colocated vllms (toslali-ibm)
040b6e4 Fix process index bug (toslali-ibm)
3aa300a Ignore prefix cache for now (toslali-ibm)
4e90aaa Fix sleep issues (toslali-ibm)
85d9309 Fix tp slices (toslali-ibm)
d2bccd3 Fix typo in wake up (toslali-ibm)
d876fe4 Debugging memory (toslali-ibm)
712e85b Measure memory during model update (toslali-ibm)
5127b97 Debug memory reserved (toslali-ibm)
a8a6f69 Remove prints (toslali-ibm)
5de5f3c Validate via experiments (toslali-ibm)
cbb5ef3 Introduce flexible TP where TP may not be equal to world size (toslali-ibm)
c6b36e7 Remove prints (toslali-ibm)
cc3023c Don't wake up in prefix reset (toslali-ibm)
8ae9f01 TP size should divide global world size evenly (toslali-ibm)
2de090f Add max num seq (toslali-ibm)
8151e11 Bring back sleep (toslali-ibm)
946d49d Fix sleep bug for grad accumulations (toslali-ibm)
fe8f684 Reload model during grad accumulation (toslali-ibm)
c3509de Switch to sleep level 1 (toslali-ibm)
aca6242 Sleep 1 during acc steps and level 2 otherwise (toslali-ibm)
110cbce Fix config definition (toslali-ibm)
9e5128a Debug generations (toslali-ibm)
675a1ed Revert to sleep level 1 - as level 2 generates randomly (toslali-ibm)
91d7e72 Conduct 72b experiment (toslali-ibm)
65666eb Remove prints (toslali-ibm)
710da69 Make sleep optional (toslali-ibm)
a426f59 Incorporate feedback (toslali-ibm)
0db5719 Merge branch 'coloc' into tpcoloc (toslali-ibm)
a1dd8e4 Incorporate Fabian's comments (toslali-ibm)
2f95c00 Revert to sleep 2 and reload model during grad accumulation (toslali-ibm)
9c15044 Include accelerator in vllm client to access deepspeed (toslali-ibm)
3151828 Import deepspeed available (toslali-ibm)
f44b0fe Set seed in llm init (toslali-ibm)
0bb8102 Parametrize sleep level 2 and compare (toslali-ibm)
589ffc9 Debug level 1 and level 2 (toslali-ibm)
2eb1855 Fix grad accumulation for sleep 2 (toslali-ibm)
82d8d94 Fix grad accumulation for sleep 2 (toslali-ibm)
e4fafee Fix grad accumulation for sleep 2 (toslali-ibm)
6191b89 Revert tpcoloc branch to commit a1dd8e4 without rewriting history (toslali-ibm)
717c8b4 Revert to sleep 2 after grad acc optimization fixing the model load e… (toslali-ibm)
4badf5b Comparison of sleep levels (toslali-ibm)
a0c677a Add max_num_batched_tokens needed for v1 profiling (toslali-ibm)
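
Commits cbb5ef3 and 8ae9f01 state that the tensor-parallel (TP) size may be smaller than the world size but must divide it evenly. The diff itself is not shown on this page, so the following is only a minimal sketch of that constraint, not the PR's implementation. It assumes ranks are grouped contiguously, and the helper names `tp_groups` and `tp_group_index` are hypothetical.

```python
def tp_groups(world_size: int, tp_size: int) -> list[list[int]]:
    """Split ranks 0..world_size-1 into contiguous groups of tp_size ranks.

    Assumption (not confirmed by the PR page): each TP group is a
    contiguous slice of global ranks.
    """
    if world_size < 1 or tp_size < 1:
        raise ValueError("world_size and tp_size must be positive")
    if world_size % tp_size != 0:
        # The constraint named in commit 8ae9f01.
        raise ValueError(
            f"tp_size ({tp_size}) must divide world_size ({world_size}) evenly"
        )
    return [
        list(range(start, start + tp_size))
        for start in range(0, world_size, tp_size)
    ]


def tp_group_index(rank: int, tp_size: int) -> int:
    """Index of the TP group a global rank falls into under contiguous grouping."""
    return rank // tp_size
```

With 8 processes and a TP size of 4, this yields two groups, ranks 0 to 3 and ranks 4 to 7. A TP size of 3 on 8 processes is rejected because it would leave ranks without a complete group.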