Jiarui Fang
23ba3fc450
[zero] refactory ShardedOptimV2 init method ( #416 )
2022-03-15 10:45:55 +08:00
Jiarui Fang
21dc54e019
[zero] memtracer to record cuda memory usage of model data and overall system ( #395 )
2022-03-14 22:05:30 +08:00
Jiarui Fang
370f567e7d
[zero] new interface for ShardedOptimv2 ( #406 )
2022-03-14 20:48:41 +08:00
ver217
54fd37f0e0
polish unit test
2022-03-14 15:06:02 +08:00
Jiarui Fang
3af13a2c3e
[zero] polish ShardedOptimV2 unittest ( #385 )
...
* place params on cpu after zero init context
* polish code
* bucketzed cpu gpu tensor transter
* find a bug in sharded optim unittest
* add offload unittest for ShardedOptimV2.
* polish code and make it more robust
2022-03-11 15:50:28 +08:00
ver217
d0ae0f2215
[zero] update sharded optim v2 ( #334 )
2022-03-11 15:50:28 +08:00
ver217
1388671699
[zero] Update sharded model v2 using sharded param v2 ( #323 )
2022-03-11 15:50:28 +08:00
ver217
36f9a74ab2
fix sharded param hook and unit test
2022-03-11 15:50:28 +08:00
ver217
001ca624dd
impl shard optim v2 and add unit test
2022-03-11 15:50:28 +08:00