mirror of https://github.com/InternLM/InternLM
* add local data parallel support for experts
* fix model checkpoint for local dp mode of expert
* do not set ep size from config

| Name |
|---|
| .. |
| communication |
| context |
| scheduler |
| __init__.py |
| engine.py |
| gradient_handler.py |
| naive_amp.py |
| trainer.py |