Commit Graph

21 Commits (19d1510ea26d10484a804eb62f6d03dbcc7c80a8)

Author        SHA1        Message                                                               Date
hxwang        74b03de3f9  [moe] remove ops                                                      4 months ago
hxwang        803878b2fd  [moe] full test for deepseek and mixtral (pp + sp to fix)             4 months ago
hxwang        3e2b6132b7  [moe] clean legacy code                                               4 months ago
botbw         dc583aa576  [moe] implement tp                                                    4 months ago
botbw         9b9b76bdcd  [moe] add mixtral dp grad scaling when not all experts are activated  4 months ago
botbw         b5bfeb2efd  [moe] implement transit between non moe tp and ep                     4 months ago
hxwang        46c069b0db  [zero] solve hang                                                     4 months ago
Haze188       416580b314  [MoE/ZeRO] Moe refactor with zero refactor (#5821)                    5 months ago
digger yu     5e1c93d732  [hotfix] fix typo change MoECheckpintIO to MoECheckpointIO (#5335)    9 months ago
ver217        06db94fbc9  [moe] fix tests                                                       10 months ago
Hongxin Liu   da39d21b71  [moe] support mixtral (#5309)                                         10 months ago
Hongxin Liu   c904d2ae99  [moe] update capacity computing (#5253)                               10 months ago
Xuanlei Zhao  7d8e0338a4  [moe] init mixtral impl                                               10 months ago
Frank Lee     8823cc4831  Merge pull request #5310 from hpcaitech/feature/npu                   10 months ago
Frank Lee     7cfed5f076  [feat] refactored extension module (#5298)                            10 months ago
digger yu     bce9499ed3  fix some typo (#5307)                                                 10 months ago
Hongxin Liu   d202cc28c0  [npu] change device to accelerator api (#5239)                        11 months ago
Wenhao Chen   3c08f17348  [hotfix]: modify create_ep_hierarchical_group and add test (#5032)    1 year ago
Wenhao Chen   724441279b  [moe]: fix ep/tp tests, add hierarchical all2all (#4982)              1 year ago
Xuanlei Zhao  f71e63b0f3  [moe] support optimizer checkpoint (#5015)                            1 year ago
Xuanlei Zhao  dc003c304c  [moe] merge moe into main (#4978)                                     1 year ago