mirror of https://github.com/InternLM/InternLM
* add zero broadcast_sync
* delete old sync logic
* fix merged error
* refactor code
* remove some unused function (is norm/gate group)

| File |
|---|
| experts.py |
| sharded_moe.py |