36 Commits (85e045b063a70cd36ccc0405acc245d86f2a1621)

Author SHA1 Message Date
Kirigaya Kazuto e9460b45c8
[engin/schedule] use p2p_v2 to recontruct pipeline_schedule (#1408)
2 years ago
Jiarui Fang 4165eabb1e
[hotfix] remove potiential circle import (#1307)
2 years ago
YuliangLiu0306 17ed33350b
[hotfix] fix an assertion bug in base schedule. (#1250)
2 years ago
YuliangLiu0306 f1f51990b9
[hotfix]fix some bugs caused by refactored schedule. (#1148)
2 years ago
YuliangLiu0306 18091581c0
[pipeline]support more flexible pipeline (#1138)
2 years ago
YuliangLiu0306 946dbd629d
[hotfix]fix bugs caused by refactored pipeline (#1133)
2 years ago
YuliangLiu0306 3175bcb4d8
[pipeline]support List of Dict data (#1125)
2 years ago
Frank Lee 6f82ac9bcb
[pipeline] supported more flexible dataflow control for pipeline parallel training (#1108)
2 years ago
YuliangLiu0306 1e9f9c227f
[hotfix]change to fit latest p2p (#1100)
2 years ago
YuliangLiu0306 b167258b6a
[pipeline]refactor ppschedule to support tensor list (#1050)
3 years ago
YuliangLiu0306 32a45cd7ef
[pipelinable]use pipelinable to support GPT model. (#903)
3 years ago
Frank Lee 11f54c7b6b
[doc] improved docstring and assertion messages for the engine module (#871)
3 years ago
Jiarui Fang 681addb512
[refactor] moving grad acc logic to engine (#804)
3 years ago
Jiarui Fang 4d9332b4c5
[refactor] moving memtracer to gemini (#801)
3 years ago
YuliangLiu0306 0ed7042f42
[pipeline] refactor pipeline (#679)
3 years ago
yuxuan-lou cfb41297ff
'fix/format' (#573)
3 years ago
YuliangLiu0306 ade05a5d83
[refactor] pipeline, put runtime schedule into engine. (#627)
3 years ago
Liang Bowen ec5086c49c
Refactored docstring to google style
3 years ago
Jiarui Fang 4d322b79da
[refactor] remove old zero code (#517)
3 years ago
ver217 8d3250d74b
[zero] ZeRO supports pipeline parallel (#477)
3 years ago
Jiarui Fang 5a560a060a
Feature/zero (#279)
3 years ago
アマデウス 9ee197d0e9
moved env variables to global variables; (#215)
3 years ago
ver217 708404d5f8
fix pipeline forward return tensors (#176)
3 years ago
HELSON 0f8c7f9804
Fixed docstring in colossalai (#171)
3 years ago
Frank Lee e2089c5c15
adapted for sequence parallel (#163)
3 years ago
ver217 7bf1e98b97
pipeline last stage supports multi output (#151)
3 years ago
HELSON dceae85195
Added MoE parallel (#127)
3 years ago
ver217 293fb40c42
add scatter/gather optim for pipeline (#123)
3 years ago
ver217 7904baf6e1
fix layers/schedule for hybrid parallelization (#111) (#112)
3 years ago
ver217 96780e6ee4
Optimize pipeline schedule (#94)
3 years ago
アマデウス 01a80cd86d
Hotfix/Colossalai layers (#92)
3 years ago
ver217 8f02a88db2
add interleaved pipeline, fix naive amp and update pipeline model initializer (#80)
3 years ago
Frank Lee 35813ed3c4
update examples and sphnix docs for the new api (#63)
3 years ago
Frank Lee da01c234e1
Develop/experiments (#59)
3 years ago
Frank Lee 3defa32aee
Support TP-compatible Torch AMP and Update trainer API (#27)
3 years ago
zbian 404ecbdcc6
Migrated project
3 years ago