18 Commits (ckpt)

Author SHA1 Message Date
hxwang 5b4c12381b Revert "[moe] implement submesh initialization"
4 months ago
Haze188 404b16faf3 [Feature] MoE Ulysses Support (#5918)
4 months ago
botbw 8dbb86899d [chore] trivial fix
4 months ago
botbw e28e05345b [moe] implement submesh initialization
4 months ago
hxwang 46c069b0db [zero] solve hang
4 months ago
hxwang 0fad23c691 [chore] handle non member group
4 months ago
Haze188 3420921101 [shardformer] DeepseekMoE support (#5871)
5 months ago
Haze188 416580b314 [MoE/ZeRO] Moe refactor with zero refactor (#5821)
5 months ago
Edenzzzz 2a25a2aff7 [Feature] optimize PP overlap (#5735)
5 months ago
Edenzzzz 43995ee436 [Feature] Distributed optimizers: Lamb, Galore, CAME and Adafactor (#5694)
7 months ago
Hongxin Liu 641b1ee71a [devops] remove post commit ci (#5566)
8 months ago
Zhongkai Zhao 8e412a548e [shardformer] Sequence Parallelism Optimization (#5533)
8 months ago
flybird11111 365671be10 fix-test (#5210)
11 months ago
flybird11111 576a2f7b10 [gemini] gemini support tensor parallelism. (#4942)
1 year ago
littsk be82b5d4ca [hotfix] Fix the bug where process groups were not being properly released. (#4940)
1 year ago
Hongxin Liu 079bf3cb26 [misc] update pre-commit and run all files (#4752)
1 year ago
LuGY a78daf6180 [shardformer] support interleaved pipeline (#4448)
1 year ago
Hongxin Liu 5e1a9d48dd [cluster] add process group mesh (#4039)
1 year ago