Commit Graph

3 Commits (91f84f6a5f8cbcf929f4656baf29190216393e37)

Author | SHA1 | Message | Date
hxwang | 05a78d2f41 | [chore] solve moe ckpt test failure and some other arg pass failure | 2024-07-22 03:53:02 +00:00
hxwang | 8d3d7f3cbd | [moe] test deepseek | 2024-07-19 07:32:00 +00:00
botbw | 1b15cc97f5 | [moe] add mixtral dp grad scaling when not all experts are activated | 2024-07-19 07:30:14 +00:00