Commit Graph

29 Commits (30a94431323d71c5ef06bd4b7f047aced3312fdf)

Author SHA1 Message Date
Hongxin Liu cd61353bae
[pipeline] hotfix backward for multiple outputs (#6090)
1 month ago
Hongxin Liu 13946c4448
[fp8] hotfix backward hook (#6053)
3 months ago
Wang Binluo eea37da6fa
[fp8] Merge feature/fp8_comm to main branch of Colossalai (#6016)
3 months ago
pre-commit-ci[bot] 51f916b11d [pre-commit.ci] auto fixes from pre-commit.com hooks
5 months ago
BurkeHulk e88190184a support fp8 communication in pipeline parallelism
5 months ago
Edenzzzz 2a25a2aff7
[Feature] optimize PP overlap (#5735)
5 months ago
Hongxin Liu bbb2c21f16
[shardformer] fix chatglm implementation (#5644)
7 months ago
Wenhao Chen bb0a668fee
[hotfix] set return_outputs=False in examples and polish code (#5404)
8 months ago
Frank Lee 7cfed5f076
[feat] refactored extension module (#5298)
10 months ago
ver217 148469348a Merge branch 'main' into sync/npu
10 months ago
Wenhao Chen ef4f0ee854
[hotfix]: add pp sanity check and fix mbs arg (#5268)
11 months ago
Hongxin Liu d202cc28c0
[npu] change device to accelerator api (#5239)
11 months ago
Elsa Granger d565df3821
[pipeline] A more general _communicate in p2p (#5062)
11 months ago
Wenhao Chen d799a3088f
[pipeline]: add p2p fallback order and fix interleaved pp deadlock (#5214)
11 months ago
Wenhao Chen 3c0d82b19b
[pipeline]: support arbitrary batch size in forward_only mode (#5201)
11 months ago
Wenhao Chen 4fa689fca1
[pipeline]: fix p2p comm, add metadata cache and support llama interleaved pp (#5134)
11 months ago
Hongxin Liu e5ce4c8ea6
[npu] add npu support for gemini and zero (#5067)
1 year ago
Elsa Granger b2ad0d9e8f
[pipeline,shardformer] Fix p2p efficiency in pipeline, allow skipping loading weight not in weight_map when `strict=False`, fix llama flash attention forward, add flop estimation by megatron in llama benchmark (#5017)
1 year ago
Hongxin Liu 079bf3cb26
[misc] update pre-commit and run all files (#4752)
1 year ago
Baizhou Zhang 660eed9124
[pipeline] set optimizer to optional in execute_pipeline (#4630)
1 year ago
flybird11111 0a94fcd351
[shardformer] update bert finetune example with HybridParallelPlugin (#4584)
1 year ago
Jianghai 24c0768795
[shardformer] Pytree fix (#4533)
1 year ago
Hongxin Liu 508ca36fe3
[pipeline] 1f1b schedule receive microbatch size (#4589)
1 year ago
Jianghai 376533a564
[shardformer] zero1+pp and the corresponding tests (#4517)
1 year ago
LuGY a78daf6180
[shardformer] support interleaved pipeline (#4448)
1 year ago
Baizhou Zhang ed4c448488 [pipeline] rewrite t5 tests & support multi-tensor transmitting in pipeline (#4388)
1 year ago
Jianghai f13954cd58 [pipeline] refactor test pipeline and remove useless utils in pipeline (#4324)
1 year ago
Jianghai e7cc62d735 [pipeline] All bert models (#4233)
1 year ago
Hongxin Liu f51ce1bc8e [pipeline] refactor 1f1b schedule (#4115)
1 year ago