HELSON
a9b8300d54
[zero] improve adaptability for not-shard parameters ( #708 )
* adapt post grad hooks for not-shard parameters
* adapt optimizer for not-shard parameters
* offload gradients for not-replicated parameters
2022-04-11 13:38:51 +08:00
HELSON
b31daed4cf
fix bugs in CPU adam ( #633 )
* add cpu adam counter for all cpu adam
* fixed updating error in adam kernel
2022-04-02 17:04:05 +08:00
ver217
e619a651fb
polish optimizer docstring ( #619 )
2022-04-01 16:27:03 +08:00
LuGY
c44d797072
[docs] updated docs of hybrid adam and cpu adam ( #552 )
2022-03-30 18:14:59 +08:00
LuGY
105c5301c3
[zero] added hybrid adam, removed loss scale in adam ( #527 )
* [zero] added hybrid adam, removed loss scale of adam
* remove useless code
2022-03-25 18:03:54 +08:00
ver217
9ec1ce6ab1
[zero] sharded model support the reuse of fp16 shard ( #495 )
* sharded model supports reuse of fp16 shard
* rename variable
* polish code
* polish code
* polish code
2022-03-23 14:59:59 +08:00
ver217
62b0a8d644
[zero] sharded optim support hybrid cpu adam ( #486 )
* sharded optim support hybrid cpu adam
* update unit test
* polish docstring
2022-03-22 14:56:59 +08:00
Jiarui Fang
0fcfb1e00d
[test] make zero engine test really work ( #447 )
2022-03-17 17:24:25 +08:00
Jiarui Fang
237d08e7ee
[zero] hybrid cpu adam ( #445 )
2022-03-17 15:05:41 +08:00
Kai Wang (Victor Kai)
53bb3bcc0a
fix format ( #362 )
2022-03-11 15:50:28 +08:00
LuGY
a3269de5c9
[zero] cpu adam kernel ( #288 )
* Added CPU Adam
* finished the cpu adam
* updated the license
* delete useless parameters, removed resnet
* modified the method of cpu adam unittest
* deleted some useless codes
* removed useless codes
Co-authored-by: ver217 <lhx0217@gmail.com>
Co-authored-by: Frank Lee <somerlee.9@gmail.com>
Co-authored-by: jiaruifang <fangjiarui123@gmail.com>
2022-03-11 15:50:28 +08:00