Commit Graph

41 Commits (7be397ca9c3cb67bff614e4bd44e836d22a693d0)

Author  SHA1  Message  Date
Jiarui Fang  a445e118cf  [polish] polish singleton and global context (#500)  3 years ago
Jiarui Fang  b334822163  [zero] polish sharded param name (#484)  3 years ago
Jiarui Fang  65c0f380c2  [format] polish name format for MOE (#481)  3 years ago
ver217  8d3250d74b  [zero] ZeRO supports pipeline parallel (#477)  3 years ago
HELSON  aff9d354f7  [MOE] polish moe_env (#467)  3 years ago
HELSON  84fd7c1d4d  add moe context, moe utilities and refactor gradient handler (#455)  3 years ago
ver217  a241f61b34  [zero] Update initialize for ZeRO (#458)  3 years ago
ver217  9506a8beb2  use double buffer to handle grad  3 years ago
Jiarui Fang  56bb412e72  [polish] use GLOBAL_MODEL_DATA_TRACER (#417)  3 years ago
Jiarui Fang  21dc54e019  [zero] memtracer to record cuda memory usage of model data and overall system (#395)  3 years ago
ver217  88804aee49  add bucket tensor shard strategy  3 years ago
Xu Kai  54ee8d1254  Fix/format colossalai/engine/paramhooks/ (#350)  3 years ago
yuxuan-lou  3b88eb2259  Flake8 code restyle  3 years ago
Jiarui Fang  44e4891f57  [zero] able to place params on cpu after zero init context (#365)  3 years ago
Jiarui Fang  10e2826426  move async memory to an individual directory (#345)  3 years ago
Frank Lee  6a3188167c  set criterion as optional in colossalai initialize (#336)  3 years ago
Jie Zhu  3213554cc2  [profiler] add adaptive sampling to memory profiler (#330)  3 years ago
ver217  1388671699  [zero] Update sharded model v2 using sharded param v2 (#323)  3 years ago
Jiarui Fang  11bddb6e55  [zero] update zero context init with the updated test utils (#327)  3 years ago
ver217  36f9a74ab2  fix sharded param hook and unit test  3 years ago
ver217  001ca624dd  impl shard optim v2 and add unit test  3 years ago
Jie Zhu  d344689274  [profiler] primary memory tracer  3 years ago
ver217  7aef75ca42  [zero] add sharded grad and refactor grad hooks for ShardedModel (#287)  3 years ago
Jiarui Fang  8d653af408  add a common util for hooks registered on parameter. (#292)  3 years ago
Jiarui Fang  5a560a060a  Feature/zero (#279)  3 years ago
アマデウス  9ee197d0e9  moved env variables to global variables; (#215)  3 years ago
Jiarui Fang  569357fea0  add pytorch hooks (#179)  3 years ago
ver217  708404d5f8  fix pipeline forward return tensors (#176)  3 years ago
HELSON  0f8c7f9804  Fixed docstring in colossalai (#171)  3 years ago
Frank Lee  e2089c5c15  adapted for sequence parallel (#163)  3 years ago
ver217  7bf1e98b97  pipeline last stage supports multi output (#151)  3 years ago
HELSON  dceae85195  Added MoE parallel (#127)  3 years ago
ver217  293fb40c42  add scatter/gather optim for pipeline (#123)  3 years ago
ver217  7904baf6e1  fix layers/schedule for hybrid parallelization (#111) (#112)  3 years ago
ver217  96780e6ee4  Optimize pipeline schedule (#94)  3 years ago
アマデウス  01a80cd86d  Hotfix/Colossalai layers (#92)  3 years ago
ver217  8f02a88db2  add interleaved pipeline, fix naive amp and update pipeline model initializer (#80)  3 years ago
Frank Lee  35813ed3c4  update examples and sphnix docs for the new api (#63)  3 years ago
Frank Lee  da01c234e1  Develop/experiments (#59)  3 years ago
Frank Lee  3defa32aee  Support TP-compatible Torch AMP and Update trainer API (#27)  3 years ago
zbian  404ecbdcc6  Migrated project  3 years ago
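
A listing in this Author / SHA1 / Message / Date shape can be reproduced locally with `git log` pretty-format placeholders (`%an`, `%h`, `%s`, `%cr`). The sketch below uses a throwaway repository with one hypothetical commit so it runs anywhere; in an actual checkout you would run only the final `git log` line.

```shell
#!/bin/sh
set -e

# Throwaway demo repo (assumption: not the real project history).
repo=$(mktemp -d)
cd "$repo"
git init -q
git -c user.name="Jiarui Fang" -c user.email="demo@example.com" \
    commit -q --allow-empty -m "[polish] polish singleton and global context (#500)"

# %an = author name, %h = abbreviated SHA, %s = subject, %cr = relative date
git log --pretty=format:'%an  %h  %s  %cr'
```

Running it prints one line per commit in the same column order as the table above.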