InternLM/internlm/train
Wenwen Qu 21624f6f81
fix(moe): remove norm&gate force sync (#448)
* add zero broadcast_sync

* delete old sync logic

* fix merged error

* refactor code

* remove some unused function (is norm/gate group)
2023-11-01 11:29:55 +08:00
__init__.py feat(train): add fsdp training option (#293) 2023-10-09 18:59:31 +08:00
training_internlm.py feat(optimizer): zero gradient count (#449) 2023-10-27 16:26:55 +08:00
utils.py fix(moe): remove norm&gate force sync (#448) 2023-11-01 11:29:55 +08:00