Commit Graph

29 Commits (0b2d55c4ab518bd2e6e66195aaead28d7311ab8f)

| Author | SHA1 | Message | Date |
|---|---|---|---|
| hxwang | 70c9924d0d | [chore] solve moe ckpt test failure and some other arg pass failure | 4 months ago |
| Hongxin Liu | e86127925a | [plugin] support all-gather overlap for hybrid parallel (#5919) | 4 months ago |
| Hongxin Liu | c068ef0fa0 | [zero] support all-gather overlap (#5898) | 5 months ago |
| Edenzzzz | 5f8c0a0ac3 | [Feature] auto-cast optimizers to distributed version (#5746) | 6 months ago |
| Edenzzzz | 43995ee436 | [Feature] Distributed optimizers: Lamb, Galore, CAME and Adafactor (#5694) | 7 months ago |
| linsj20 | 91fa553775 | [Feature] qlora support (#5586) | 7 months ago |
| flybird11111 | 8954a0c2e2 | [LowLevelZero] low level zero support lora (#5153) | 7 months ago |
| Baizhou Zhang | 14b0d4c7e5 | [lora] add lora APIs for booster, support lora for TorchDDP (#4981) | 7 months ago |
| Hongxin Liu | d202cc28c0 | [npu] change device to accelerator api (#5239) | 11 months ago |
| Hongxin Liu | e5ce4c8ea6 | [npu] add npu support for gemini and zero (#5067) | 1 year ago |
| Baizhou Zhang | c040d70aa0 | [hotfix] fix the bug of repeatedly storing param group (#4951) | 1 year ago |
| Baizhou Zhang | 21ba89cab6 | [gemini] support gradient accumulation (#4869) | 1 year ago |
| Zhongkai Zhao | a0684e7bd6 | [feature] support no master weights option for low level zero plugin (#4816) | 1 year ago |
| shaoyuw | c97a3523db | fix: typo in comment of low_level_zero plugin | 1 year ago |
| Baizhou Zhang | a2db75546d | [doc] polish shardformer doc (#4779) | 1 year ago |
| Baizhou Zhang | c0a033700c | [shardformer] fix master param sync for hybrid plugin/rewrite unwrapping logic (#4758) | 1 year ago |
| Hongxin Liu | 079bf3cb26 | [misc] update pre-commit and run all files (#4752) | 1 year ago |
| Hongxin Liu | 807e01a4ba | [zero] hotfix master param sync (#4618) | 1 year ago |
| Hongxin Liu | 63ecafb1fb | [checkpointio] optimize zero optim checkpoint io (#4591) | 1 year ago |
| LuGY | 1a49a5ea00 | [zero] support shard optimizer state dict of zero (#4194) | 1 year ago |
| LuGY | 79cf1b5f33 | [zero]support no_sync method for zero1 plugin (#4138) | 1 year ago |
| 梁爽 | abe4f971e0 | [NFC] polish colossalai/booster/plugin/low_level_zero_plugin.py code style (#4256) | 1 year ago |
| Wenhao Chen | 725af3eeeb | [booster] make optimizer argument optional for boost (#3993) | 1 year ago |
| Hongxin Liu | ae02d4e4f7 | [bf16] add bf16 support (#3882) | 2 years ago |
| Hongxin Liu | 3c07a2846e | [plugin] a workaround for zero plugins' optimizer checkpoint (#3780) | 2 years ago |
| Hongxin Liu | 6552cbf8e1 | [booster] fix no_sync method (#3709) | 2 years ago |
| Hongxin Liu | 3bf09efe74 | [booster] update prepare dataloader method for plugin (#3706) | 2 years ago |
| Hongxin Liu | d0915f54f4 | [booster] refactor all dp fashion plugins (#3684) | 2 years ago |
| Hongxin Liu | 4b3240cb59 | [booster] add low level zero plugin (#3594) | 2 years ago |