6 Commits (ckpt)

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| botbw | 696fced0d7 | [fp8] fix missing fp8_comm flag in mixtral (#6057) | 2 months ago |
| botbw | c54c4fcd15 | [hotfix] moe hybrid parallelism benchmark & follow-up fix (#6048) | 2 months ago |
| Wang Binluo | eea37da6fa | [fp8] Merge feature/fp8_comm to main branch of Colossalai (#6016) | 3 months ago |
| wangbluo | eb5ba40def | fix the merge | 3 months ago |
| pre-commit-ci[bot] | 81272e9d00 | [pre-commit.ci] auto fixes from pre-commit.com hooks | 3 months ago |
| flybird11111 | f1a3a326c4 | [fp8] Moe support fp8 communication (#5977) | 4 months ago |
| flybird11111 | 0c10afd372 | [FP8] rebase main (#5963) | 4 months ago |
| botbw | 62cdac6b7b | [chore] remove redundant test case, print string & reduce test tokens | 4 months ago |
| hxwang | cb01c0d5ce | [moe] refactor mesh assignment | 4 months ago |
| hxwang | 70c9924d0d | [chore] solve moe ckpt test failure and some other arg pass failure | 4 months ago |
| hxwang | 803878b2fd | [moe] full test for deepseek and mixtral (pp + sp to fix) | 4 months ago |
| hxwang | 877d94bb8c | [moe] init moe plugin comm setting with sp | 4 months ago |
| Haze188 | 404b16faf3 | [Feature] MoE Ulysses Support (#5918) | 4 months ago |
| hxwang | 3e2b6132b7 | [moe] clean legacy code | 4 months ago |
| hxwang | 74eccac0db | [moe] test deepseek | 4 months ago |
| botbw | dc583aa576 | [moe] implement tp | 4 months ago |
| hxwang | 102b784a10 | [chore] arg pass & remove drop token | 4 months ago |
| botbw | 9b9b76bdcd | [moe] add mixtral dp grad scaling when not all experts are activated | 4 months ago |
| botbw | b5bfeb2efd | [moe] implement transit between non moe tp and ep | 4 months ago |
| hxwang | 0b76b57cd6 | [test] add mixtral transformer test | 4 months ago |
| Haze188 | 416580b314 | [MoE/ZeRO] Moe refactor with zero refactor (#5821) | 5 months ago |
| Wenhao Chen | e614aa34f3 | [shardformer, pipeline] add `gradient_checkpointing_ratio` and heterogenous shard policy for llama (#5508) | 8 months ago |
| Insu Jang | 00525f7772 | [shardformer] fix pipeline forward error if custom layer distribution is used (#5189) | 8 months ago |
| Hongxin Liu | 956b561b54 | [moe] fix mixtral forward default value (#5329) | 10 months ago |
| Hongxin Liu | da39d21b71 | [moe] support mixtral (#5309) | 10 months ago |
| Xuanlei Zhao | 7d8e0338a4 | [moe] init mixtral impl | 10 months ago |
| digger yu | 71321a07cf | fix typo change dosen't to doesn't (#5308) | 10 months ago |
| Xuanlei Zhao | dc003c304c | [moe] merge moe into main (#4978) | 1 year ago |