Commit Graph

1200 Commits (e0c68ab6d3d64f401208d6ec66815995cee233c3)

Author SHA1 Message Date
duanjunwen e0c68ab6d3
[Zerobubble] merge main. (#6142)
5 days ago
flybird11111 eb69e640e5 [async io]supoort async io (#6137)
5 days ago
Wang Binluo 8e08c27e19 [ckpt] Add async ckpt api (#6136)
5 days ago
Hongxin Liu a2596519fd
[zero] support extra dp (#6123)
2 weeks ago
Hongxin Liu a15ab139ad
[plugin] support get_grad_norm (#6115)
3 weeks ago
Wang Binluo dcd41d0973
Merge pull request #6071 from wangbluo/ring_attention
1 month ago
wangbluo fd92789af2 fix
1 month ago
wangbluo 6be9862aaf fix
1 month ago
wangbluo 3dc08c8a5a fix
1 month ago
wangbluo 8ff7d0c780 fix
1 month ago
wangbluo 3201377e94 fix
1 month ago
wangbluo 23199e34cc fix
1 month ago
wangbluo 703bb5c18d fix the test
1 month ago
wangbluo 4e0e99bb6a fix the test
1 month ago
Hongxin Liu dc2cdaf3e8
[shardformer] optimize seq parallelism (#6086)
1 month ago
Hongxin Liu 646b3c5a90
[shardformer] fix linear 1d row and support uneven splits for fused qkv linear (#6084)
2 months ago
botbw 4fa6b9509c
[moe] add parallel strategy for shared_expert && fix test for deepseek (#6063)
2 months ago
Guangyao Zhang f20b066c59
[fp8] Disable all_gather intranode. Disable Redundant all_gather fp8 (#6059)
2 months ago
botbw c54c4fcd15
[hotfix] moe hybrid parallelism benchmark & follow-up fix (#6048)
2 months ago
Wenxuan Tan 8fd25d6e09
[Feature] Split cross-entropy computation in SP (#5959)
3 months ago
Hongxin Liu b3db1058ec
[release] update version (#6041)
3 months ago
Wang Binluo eea37da6fa
[fp8] Merge feature/fp8_comm to main branch of Colossalai (#6016)
3 months ago
flybird11111 20722a8c93
[fp8]update reduce-scatter test (#6002)
3 months ago
flybird11111 597b206001
[fp8] support asynchronous FP8 communication (#5997)
3 months ago
Hongxin Liu 8241c0c054
[fp8] support gemini plugin (#5978)
4 months ago
Hanks b480eec738
[Feature]: support FP8 communication in DDP, FSDP, Gemini (#5928)
4 months ago
Hongxin Liu ccabcf6485
[fp8] support fp8 amp for hybrid parallel plugin (#5975)
4 months ago
Hongxin Liu 76ea16466f
[fp8] add fp8 linear (#5967)
4 months ago
flybird11111 afb26de873
[fp8]support all2all fp8 (#5953)
4 months ago
flybird11111 0c10afd372
[FP8] rebase main (#5963)
4 months ago
Guangyao Zhang 53cb9606bd
[Feature] llama shardformer fp8 support (#5938)
4 months ago
ver217 91e596d017 [test] add zero fp8 test case
4 months ago
Hongxin Liu 5fd0592767
[fp8] support all-gather flat tensor (#5932)
4 months ago
pre-commit-ci[bot] 7c2f79fa98
[pre-commit.ci] pre-commit autoupdate (#5572)
5 months ago
Haze188 416580b314
[MoE/ZeRO] Moe refactor with zero refactor (#5821)
5 months ago
Guangyao Zhang d9d5e7ea1f
[shardformer] Support the T5ForTokenClassification model (#5816)
5 months ago
Edenzzzz 2a25a2aff7
[Feature] optimize PP overlap (#5735)
5 months ago
Guangyao Zhang fd1dc417d8
[shardformer] Change atol in test command-r weight-check to pass pytest (#5835)
5 months ago
GuangyaoZhang fe2e74c03a fix precommit
5 months ago
GuangyaoZhang 98da648a4a Fix Code Factor check
5 months ago
GuangyaoZhang f656d61778 change command
5 months ago
Edenzzzz 8795bb2e80
Support 4d parallel + flash attention (#5789)
5 months ago
flybird11111 2ddf624a86
[shardformer] upgrade transformers to 4.39.3 (#5815)
5 months ago
Li Xingjian 8554585a5f
[Inference] Fix flash-attn import and add model test (#5794)
6 months ago
Guangyao Zhang aac941ef78
[test] fix qwen2 pytest distLarge (#5797)
6 months ago
Hongxin Liu 587bbf4c6d
[test] fix chatglm test kit (#5793)
6 months ago
char-1ee b303976a27 Fix test import
6 months ago
char-1ee 5f398fc000 Pass inference model shard configs for module init
6 months ago
duanjunwen 10a19e22c6
[hotfix] fix testcase in test_fx/test_tracer (#5779)
6 months ago
botbw 80c3c8789b
[Test/CI] remove test cases to reduce CI duration (#5753)
6 months ago