Commit Graph

12 Commits (e79ea44247bbb6457cd2a2c30454208017fd31ef)

Author SHA1 Message Date
Frank Lee e79ea44247 [fp16] refactored fp16 optimizer (#392) 2022-03-15 10:05:38 +08:00
Jiarui Fang 21dc54e019 [zero] memtracer to record cuda memory usage of model data and overall system (#395) 2022-03-14 22:05:30 +08:00
Jiarui Fang 370f567e7d [zero] new interface for ShardedOptimv2 (#406) 2022-03-14 20:48:41 +08:00
Jiarui Fang 3af13a2c3e [zero] polish ShardedOptimV2 unittest (#385)
* place params on cpu after zero init context

* polish code

* bucketzed cpu gpu tensor transter

* find a bug in sharded optim unittest

* add offload unittest for ShardedOptimV2.

* polish code and make it more robust
2022-03-11 15:50:28 +08:00
Jiarui Fang b5f43acee3 [zero] find miss code (#378) 2022-03-11 15:50:28 +08:00
jiaruifang d9217e1960 Revert "[zero] bucketized tensor cpu gpu copy (#368)"
This reverts commit bef05489b6.
2022-03-11 15:50:28 +08:00
Jiarui Fang 00670c870e [zero] bucketized tensor cpu gpu copy (#368) 2022-03-11 15:50:28 +08:00
ver217 d0ae0f2215 [zero] update sharded optim v2 (#334) 2022-03-11 15:50:28 +08:00
ver217 3092317b80 polish code 2022-03-11 15:50:28 +08:00
ver217 36f9a74ab2 fix sharded param hook and unit test 2022-03-11 15:50:28 +08:00
ver217 001ca624dd impl shard optim v2 and add unit test 2022-03-11 15:50:28 +08:00
ver217 b105371ace rename shared adam to sharded optim v2 2022-03-11 15:50:28 +08:00