48 Commits (colossalchat)

Author SHA1 Message Date
botbw 8e718a1421
[gemini] fixes for benchmarking (#5847) 5 months ago
botbw 3f7e3131d9
[gemini] optimize reduce scatter d2h copy (#5760) 6 months ago
botbw 2fc85abf43
[gemini] async grad chunk reduce (all-reduce&reduce-scatter) (#5713) 6 months ago
pre-commit-ci[bot] 5bedea6e10
[pre-commit.ci] auto fixes from pre-commit.com hooks 6 months ago
hxwang b2e9745888
[chore] sync 6 months ago
Baizhou Zhang 14b0d4c7e5
[lora] add lora APIs for booster, support lora for TorchDDP (#4981) 7 months ago
Hongxin Liu 4de4e31818
[exampe] update llama example (#5626) 7 months ago
flybird11111 a0ad587c24
[shardformer] refactor embedding resize (#5603) 7 months ago
digger yu 5e1c93d732
[hotfix] fix typo change MoECheckpintIO to MoECheckpointIO (#5335) 9 months ago
Hongxin Liu 6c0fa7b9a8
[llama] fix dataloader for hybrid parallel (#5358) 10 months ago
Frank Lee 9102d655ab
[hotfix] removed unused flag (#5242) 11 months ago
Hongxin Liu d202cc28c0
[npu] change device to accelerator api (#5239) 11 months ago
flybird11111 365671be10
fix-test (#5210) 11 months ago
flybird11111 21aa5de00b
[gemini] hotfix NaN loss while using Gemini + tensor_parallel (#5150) 12 months ago
github-actions[bot] 8921a73c90
[format] applied code formatting on changed files in pull request 5067 (#5072) 1 year ago
Hongxin Liu e5ce4c8ea6
[npu] add npu support for gemini and zero (#5067) 1 year ago
flybird11111 3e02154710
[gemini] gemini support extra-dp (#5043) 1 year ago
flybird11111 576a2f7b10
[gemini] gemini support tensor parallelism. (#4942) 1 year ago
Baizhou Zhang c040d70aa0
[hotfix] fix the bug of repeatedly storing param group (#4951) 1 year ago
Baizhou Zhang 21ba89cab6
[gemini] support gradient accumulation (#4869) 1 year ago
Hongxin Liu df63564184
[gemini] support amp o3 for gemini (#4872) 1 year ago
Baizhou Zhang a2db75546d
[doc] polish shardformer doc (#4779) 1 year ago
Baizhou Zhang c0a033700c
[shardformer] fix master param sync for hybrid plugin/rewrite unwrapping logic (#4758) 1 year ago
Hongxin Liu 079bf3cb26
[misc] update pre-commit and run all files (#4752) 1 year ago
Baizhou Zhang e79b1e80e2
[checkpointio] support huggingface from_pretrained for all plugins (#4606) 1 year ago
Hongxin Liu 27061426f7
[gemini] improve compatibility and add static placement policy (#4479) 1 year ago
LuGY 79cf1b5f33
[zero]support no_sync method for zero1 plugin (#4138) 1 year ago
Baizhou Zhang c6f6005990
[checkpointio] Sharded Optimizer Checkpoint for Gemini Plugin (#4302) 1 year ago
Baizhou Zhang 58913441a1
Next commit [checkpointio] Unsharded Optimizer Checkpoint for Gemini Plugin (#4141) 1 year ago
Baizhou Zhang 0bb0b481b4
[gemini] fix argument naming during chunk configuration searching 1 year ago
Wenhao Chen 725af3eeeb
[booster] make optimizer argument optional for boost (#3993) 1 year ago
Baizhou Zhang c9cff7e7fa
[checkpointio] General Checkpointing of Sharded Optimizers (#3984) 1 year ago
Frank Lee 71fe52769c
[gemini] fixed the gemini checkpoint io (#3934) 1 year ago
Frank Lee bd1ab98158
[gemini] fixed the gemini checkpoint io (#3934) 1 year ago
Hongxin Liu ae02d4e4f7
[bf16] add bf16 support (#3882) 1 year ago
digger yu 7f8203af69
fix typo colossalai/auto_parallel autochunk fx/passes etc. (#3808) 2 years ago
Hongxin Liu 3c07a2846e
[plugin] a workaround for zero plugins' optimizer checkpoint (#3780) 2 years ago
Hongxin Liu 6552cbf8e1
[booster] fix no_sync method (#3709) 2 years ago
Hongxin Liu 3bf09efe74
[booster] update prepare dataloader method for plugin (#3706) 2 years ago
Hongxin Liu d0915f54f4
[booster] refactor all dp fashion plugins (#3684) 2 years ago
jiangmingyan 307894f74d
[booster] gemini plugin support shard checkpoint (#3610) 2 years ago
Hongxin Liu 173dad0562
[misc] add verbose arg for zero and op builder (#3552) 2 years ago
Hongxin Liu 152239bbfa
[gemini] gemini supports lazy init (#3379) 2 years ago
Frank Lee 7d8d825681
[booster] fixed the torch ddp plugin with the new checkpoint api (#3442) 2 years ago
ver217 26b7aac0be
[zero] reorganize zero/gemini folder structure (#3424) 2 years ago
ver217 5f2e34e6c9
[booster] implement Gemini plugin (#3352) 2 years ago