ColossalAI/colossalai/nn/parallel
HELSON 63fbba3c19
[zero] add L2 gradient clipping for ZeRO (#2112)
* [zero] add L2 gradient clipping

* [testing] add MlpModel

* [zero] add unit test for grad clipping

* fix atol
2022-12-09 18:09:17 +08:00
layers              [embedding] rename FreqAwareEmbedding -> CachedEmbedding (#1699)  2022-10-13 22:22:27 +08:00
__init__.py         [Gemini] make gemini usage simple (#1821)                         2022-11-08 15:53:13 +08:00
data_parallel.py    [zero] add L2 gradient clipping for ZeRO (#2112)                  2022-12-09 18:09:17 +08:00
gemini_parallel.py  [Gemini] remove static tracer (#2083)                             2022-12-06 12:53:58 +08:00
reducer.py          [ddp] ColoDDP uses bucket all-reduce (#1177)                      2022-06-29 10:34:13 +08:00
utils.py            [feature] A new ZeRO implementation (#1644)                       2022-10-09 09:18:51 +08:00