Jiarui Fang
193dc8dacb
[refactor] refactor the memory utils ( #715 )
3 years ago
HELSON
d7ecaf362b
[zero] fix init bugs in zero context ( #686 )
...
* adapt model weight initialization for methods in Pytorch nn.init
3 years ago
Jiarui Fang
53b1b6e340
[zero] non model data tracing ( #545 )
3 years ago
ver217
1f90a3b129
[zero] polish ZeroInitContext ( #540 )
3 years ago
Jiarui Fang
c11ff81b15
[zero] get memory usage of sharded optim v2. ( #542 )
3 years ago
HELSON
a30e2b4c24
[zero] adapt for no-leaf module in zero ( #535 )
...
only process module's own parameters in Zero context
add zero hooks for all modules that contrain parameters
gather parameters only belonging to module itself
3 years ago
Frank Lee
3601b2bad0
[test] fixed rerun_on_exception and adapted test cases ( #487 )
3 years ago
ver217
9ec1ce6ab1
[zero] sharded model support the reuse of fp16 shard ( #495 )
...
* sharded model supports reuse fp16 shard
* rename variable
* polish code
* polish code
* polish code
3 years ago
ver217
62b0a8d644
[zero] sharded optim support hybrid cpu adam ( #486 )
...
* sharded optim support hybrid cpu adam
* update unit test
* polish docstring
3 years ago
Frank Lee
af185b5519
[test] fixed amp convergence comparison test ( #454 )
3 years ago
ver217
642846d6f9
update sharded optim and fix zero init ctx ( #457 )
3 years ago
Jiarui Fang
e2e9f82588
Revert "[zero] update sharded optim and fix zero init ctx" ( #456 )
...
* Revert "polish code"
This reverts commit 8cf7ff08cf
.
* Revert "rename variables"
This reverts commit e99af94ab8
.
* Revert "remove surplus imports"
This reverts commit 46add4a5c5
.
* Revert "update sharded optim and fix zero init ctx"
This reverts commit 57567ee768
.
3 years ago
ver217
8cf7ff08cf
polish code
3 years ago
ver217
57567ee768
update sharded optim and fix zero init ctx
3 years ago
Frank Lee
f27d801a13
[test] optimized zero data parallel test ( #452 )
3 years ago
Jiarui Fang
0fcfb1e00d
[test] make zero engine test really work ( #447 )
3 years ago
Jiarui Fang
f9c762df85
[test] merge zero optim tests ( #428 )
3 years ago
Jiarui Fang
adebb3e041
[zero] cuda margin space for OS ( #418 )
3 years ago
Jiarui Fang
23ba3fc450
[zero] refactory ShardedOptimV2 init method ( #416 )
3 years ago
Jiarui Fang
21dc54e019
[zero] memtracer to record cuda memory usage of model data and overall system ( #395 )
3 years ago
Jiarui Fang
370f567e7d
[zero] new interface for ShardedOptimv2 ( #406 )
3 years ago
ver217
54fd37f0e0
polish unit test
3 years ago
Jiarui Fang
3af13a2c3e
[zero] polish ShardedOptimV2 unittest ( #385 )
...
* place params on cpu after zero init context
* polish code
* bucketzed cpu gpu tensor transter
* find a bug in sharded optim unittest
* add offload unittest for ShardedOptimV2.
* polish code and make it more robust
3 years ago
ver217
d0ae0f2215
[zero] update sharded optim v2 ( #334 )
3 years ago
ver217
1388671699
[zero] Update sharded model v2 using sharded param v2 ( #323 )
3 years ago
ver217
36f9a74ab2
fix sharded param hook and unit test
3 years ago
ver217
001ca624dd
impl shard optim v2 and add unit test
3 years ago