Commit Graph

13 Commits (52d346f2a53c08c18a738ef68aad194f95f37af2)

Author SHA1 Message Date
hxwang 803878b2fd [moe] full test for deepseek and mixtral (pp + sp to fix) 2024-08-01 10:06:59 +08:00
haze188 2cddeac717 moe sp + ep bug fix 2024-08-01 10:06:59 +08:00
hxwang 877d94bb8c [moe] init moe plugin comm setting with sp 2024-08-01 10:06:59 +08:00
hxwang 09d6280d3e [chore] minor fix 2024-08-01 10:06:59 +08:00
Haze188 404b16faf3 [Feature] MoE Ulysses Support (#5918) 2024-08-01 10:06:59 +08:00
    * moe sp support
    * moe sp bug solve
    * [pre-commit.ci] auto fixes from pre-commit.com hooks
      for more information, see https://pre-commit.ci
    Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
botbw e28e05345b [moe] implement submesh initialization 2024-08-01 10:06:59 +08:00
haze188 5ed5e8cfba solve hang when parallel mode = pp + dp 2024-08-01 10:06:59 +08:00
botbw 13b48ac0aa [zero] solve hang 2024-08-01 10:06:59 +08:00
botbw b5bfeb2efd [moe] implement transit between non moe tp and ep 2024-08-01 10:06:59 +08:00
botbw 37443cc7e4 [test] pass mixtral shardformer test 2024-08-01 10:06:59 +08:00
hxwang 46c069b0db [zero] solve hang 2024-08-01 10:06:59 +08:00
hxwang a249e71946 [test] mixtra pp shard test 2024-08-01 10:06:59 +08:00
hxwang 0b76b57cd6 [test] add mixtral transformer test 2024-08-01 10:06:59 +08:00