ColossalAI/colossalai/nn/layer/moe
HELSON e6d50ec107
[zero] adapt zero for unsharded parameters (#561)
* support existing sharded and unsharded parameters in zero

* add unit test for moe-zero model init

* polish moe gradient handler
2022-03-31 18:34:11 +08:00
__init__.py [MOE] support PR-MOE (#488) 2022-03-22 16:48:22 +08:00
_operation.py [MOE] polish moe_env (#467) 2022-03-19 15:36:25 +08:00
experts.py [zero] adapt zero for unsharded parameters (#561) 2022-03-31 18:34:11 +08:00
layers.py [zero] adapt zero for unsharded parameters (#561) 2022-03-31 18:34:11 +08:00
utils.py [zero] adapt zero for unsharded parameters (#561) 2022-03-31 18:34:11 +08:00