ColossalAI/docs/source
Baizhou Zhang 21ba89cab6
[gemini] support gradient accumulation (#4869)
* add test

* fix no_sync bug in low level zero plugin

* fix test

* add argument for grad accum

* add grad accum in backward hook for gemini

* finish implementation, rewrite tests

* fix test

* skip stuck model in low level zero test

* update doc

* optimize communication & fix gradient checkpoint

* modify doc

* clean up code

* update cpu adam fp16 case
2023-10-17 14:07:21 +08:00
en [gemini] support gradient accumulation (#4869) 2023-10-17 14:07:21 +08:00
zh-Hans [gemini] support gradient accumulation (#4869) 2023-10-17 14:07:21 +08:00