Commit Graph

14 Commits (2d642eea0f92c7f7c1fb7bef3abdfdb0cb61d1bf)

Author SHA1 Message Date
botbw 62cdac6b7b [chore] remove redundant test case, print string & reduce test tokens
4 months ago
hxwang cb01c0d5ce [moe] refactor mesh assignment
4 months ago
hxwang 70c9924d0d [chore] solve moe ckpt test failure and some other arg pass failure
4 months ago
hxwang 803878b2fd [moe] full test for deepseek and mixtral (pp + sp to fix)
4 months ago
hxwang 877d94bb8c [moe] init moe plugin comm setting with sp
4 months ago
Haze188 404b16faf3 [Feature] MoE Ulysses Support (#5918)
4 months ago
hxwang 3e2b6132b7 [moe] clean legacy code
4 months ago
hxwang 74eccac0db [moe] test deepseek
4 months ago
botbw dc583aa576 [moe] implement tp
4 months ago
hxwang 102b784a10 [chore] arg pass & remove drop token
4 months ago
botbw 9b9b76bdcd [moe] add mixtral dp grad scaling when not all experts are activated
4 months ago
botbw b5bfeb2efd [moe] implement transit between non moe tp and ep
4 months ago
hxwang 0b76b57cd6 [test] add mixtral transformer test
4 months ago
Haze188 416580b314 [MoE/ZeRO] Moe refactor with zero refactor (#5821)
5 months ago