Feature/pipeline #40
Merged: FrankLeeeee merged 8 commits into hpcaitech:develop/experiments from ver217:feature/pipeline on Dec 4, 2021
Conversation
* Add gradient accumulation, fix lr scheduler
* fix FP16 optimizer and adapted torch amp with tensor parallel (#18)
* fixed bugs in compatibility between torch amp and tensor parallel and performed some minor fixes
* fixed trainer
* Revert "fixed trainer"
  This reverts commit 2e0b0b7.
* improved consistency between trainer, engine and schedule (#23)

Co-authored-by: 1SAA <[email protected]>
Co-authored-by: 1SAA <[email protected]>
Co-authored-by: ver217 <[email protected]>
FrankLeeeee approved these changes on Dec 4, 2021
FrankLeeeee added a commit that referenced this pull request on Dec 9, 2021:
* remove redundancy func in setup (#19) (#20)
* use env to control the language of doc (#24) (#25)
* Support TP-compatible Torch AMP and Update trainer API (#27)
* Add gradient accumulation, fix lr scheduler
* fix FP16 optimizer and adapted torch amp with tensor parallel (#18)
* fixed bugs in compatibility between torch amp and tensor parallel and performed some minor fixes
* fixed trainer
* Revert "fixed trainer"
  This reverts commit 2e0b0b7.
* improved consistency between trainer, engine and schedule (#23)
Co-authored-by: 1SAA <[email protected]>
Co-authored-by: 1SAA <[email protected]>
Co-authored-by: ver217 <[email protected]>
* add an example of ViT-B/16 and remove w_norm clipping in LAMB (#29)
* add explanation for ViT example (#35) (#36)
* optimize communication of pipeline parallel
* fix grad clip for pipeline

Co-authored-by: Frank Lee <[email protected]>
Co-authored-by: 1SAA <[email protected]>
Co-authored-by: binmakeswell <[email protected]>
FrankLeeeee added a commit that referenced this pull request on Dec 9, 2021:
* Add gradient accumulation, fix lr scheduler
* fix FP16 optimizer and adapted torch amp with tensor parallel (#18)
* fixed bugs in compatibility between torch amp and tensor parallel and performed some minor fixes
* fixed trainer
* Revert "fixed trainer"
  This reverts commit 2e0b0b7.
* improved consistency between trainer, engine and schedule (#23)
Co-authored-by: 1SAA <[email protected]>
* Split conv2d, class token, positional embedding in 2d, Fix random number in ddp
  Fix convergence in cifar10, Imagenet1000
* Integrate 1d tensor parallel in Colossal-AI (#39)
* fixed 1D and 2D convergence (#38)
* optimized 2D operations
* fixed 1D ViT convergence problem
* Feature/ddp (#49)
* remove redundancy func in setup (#19) (#20)
* use env to control the language of doc (#24) (#25)
* Support TP-compatible Torch AMP and Update trainer API (#27)
* Add gradient accumulation, fix lr scheduler
* fix FP16 optimizer and adapted torch amp with tensor parallel (#18)
* fixed bugs in compatibility between torch amp and tensor parallel and performed some minor fixes
* fixed trainer
* Revert "fixed trainer"
  This reverts commit 2e0b0b7.
* improved consistency between trainer, engine and schedule (#23)
Co-authored-by: 1SAA <[email protected]>
Co-authored-by: 1SAA <[email protected]>
Co-authored-by: ver217 <[email protected]>
* add an example of ViT-B/16 and remove w_norm clipping in LAMB (#29)
* add explanation for ViT example (#35) (#36)
* support torch ddp
* fix loss accumulation
* add log for ddp
* change seed
* modify timing hook
Co-authored-by: Frank Lee <[email protected]>
Co-authored-by: 1SAA <[email protected]>
Co-authored-by: binmakeswell <[email protected]>
* Feature/pipeline (#40)
* remove redundancy func in setup (#19) (#20)
* use env to control the language of doc (#24) (#25)
* Support TP-compatible Torch AMP and Update trainer API (#27)
* Add gradient accumulation, fix lr scheduler
* fix FP16 optimizer and adapted torch amp with tensor parallel (#18)
* fixed bugs in compatibility between torch amp and tensor parallel and performed some minor fixes
* fixed trainer
* Revert "fixed trainer"
  This reverts commit 2e0b0b7.
* improved consistency between trainer, engine and schedule (#23)
Co-authored-by: 1SAA <[email protected]>
Co-authored-by: 1SAA <[email protected]>
Co-authored-by: ver217 <[email protected]>
* add an example of ViT-B/16 and remove w_norm clipping in LAMB (#29)
* add explanation for ViT example (#35) (#36)
* optimize communication of pipeline parallel
* fix grad clip for pipeline
Co-authored-by: Frank Lee <[email protected]>
Co-authored-by: 1SAA <[email protected]>
Co-authored-by: binmakeswell <[email protected]>
* optimized 3d layer to fix slow computation; tested imagenet performance with 3d; reworked lr_scheduler config definition; fixed launch args; fixed some printing issues; simplified apis of 3d layers (#51)
* Update 2.5d layer code to get a similar accuracy on imagenet-1k dataset
* update api for better usability (#58)

Co-authored-by: 1SAA <[email protected]>
Co-authored-by: ver217 <[email protected]>
Co-authored-by: puck_WCR <[email protected]>
Co-authored-by: binmakeswell <[email protected]>
Co-authored-by: アマデウス <[email protected]>
Co-authored-by: BoxiangW <[email protected]>
Optimize communication of pipeline parallelism. Use P2POp instead of broadcast, and use async operations when synchronizing data across stages.
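For illustration, here is a minimal sketch of the point-to-point pattern described above, written against the public torch.distributed API rather than the actual Colossal-AI internals. dist.P2POp describes individual isend/irecv requests between adjacent pipeline stages, and dist.batch_isend_irecv launches them asynchronously and returns work handles, so the exchange can overlap with stage-local computation until wait() is called; only the two neighboring ranks take part, unlike a broadcast over the whole pipeline group. The function name, tensor shapes, and rank arguments are hypothetical.

```python
# Hypothetical helper (not the actual Colossal-AI implementation).
# Assumes dist.init_process_group has already been called and that each
# pipeline stage knows the ranks of its neighbors (None at the ends).
import torch
import torch.distributed as dist


def exchange_with_neighbors(send_tensor, recv_shape, prev_rank, next_rank,
                            dtype=torch.float32, device="cuda"):
    """Send activations to the next stage and receive from the previous stage."""
    ops = []
    recv_tensor = None

    if prev_rank is not None:
        recv_tensor = torch.empty(recv_shape, dtype=dtype, device=device)
        ops.append(dist.P2POp(dist.irecv, recv_tensor, prev_rank))
    if next_rank is not None:
        ops.append(dist.P2POp(dist.isend, send_tensor, next_rank))

    # batch_isend_irecv starts every request at once and returns async work
    # handles, so communication can overlap with computation until wait().
    reqs = dist.batch_isend_irecv(ops) if ops else []
    return recv_tensor, reqs


# Usage sketch: kick off the exchange, overlap it with local work, then wait.
#   recv, reqs = exchange_with_neighbors(out, recv_shape, prev_rank, next_rank)
#   ... compute on the current micro-batch ...
#   for req in reqs:
#       req.wait()
```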