Commit Graph

43 Commits (8cc8f645cd1d971a3bef52f625b7881f17c6d22b)

Author         SHA1        Date           Message
Hongxin Liu    c068ef0fa0  5 months ago   [zero] support all-gather overlap (#5898)
Haze188        ea94c07b95  5 months ago   [hotfix] fix the bug that large tensor exceed the maximum capacity of TensorBucket (#5879)
Haze188        416580b314  5 months ago   [MoE/ZeRO] Moe refactor with zero refactor (#5821)
Hongxin Liu    5dfbcd7746  5 months ago   [zero] use bucket during allgather (#5860)
Edenzzzz       43995ee436  7 months ago   [Feature] Distributed optimizers: Lamb, Galore, CAME and Adafactor (#5694)
flybird11111   77ec773388  7 months ago   [zero]remove registered gradients hooks (#5687)
linsj20        91fa553775  7 months ago   [Feature] qlora support (#5586)
Hongxin Liu    3788fefc7a  8 months ago   [zero] support multiple (partial) backward passes (#5596)
Zhongkai Zhao  8e412a548e  8 months ago   [shardformer] Sequence Parallelism Optimization (#5533)
ver217         06db94fbc9  10 months ago  [moe] fix tests
Hongxin Liu    da39d21b71  10 months ago  [moe] support mixtral (#5309)
Xuanlei Zhao   7d8e0338a4  10 months ago  [moe] init mixtral impl
Frank Lee      8823cc4831  10 months ago  Merge pull request #5310 from hpcaitech/feature/npu
digger yu      bce9499ed3  10 months ago  fix some typo (#5307)
Hongxin Liu    d202cc28c0  11 months ago  [npu] change device to accelerator api (#5239)
Hongxin Liu    e5ce4c8ea6  1 year ago     [npu] add npu support for gemini and zero (#5067)
littsk         1a3315e336  1 year ago     [hotfix] Add layer norm gradients all-reduce for sequence parallel (#4926)
Xuanlei Zhao   dc003c304c  1 year ago     [moe] merge moe into main (#4978)
Zhongkai Zhao  a0684e7bd6  1 year ago     [feature] support no master weights option for low level zero plugin (#4816)
littsk         83b52c56cd  1 year ago     [feature] Add clip_grad_norm for hybrid_parallel_plugin (#4837)
Baizhou Zhang  c0a033700c  1 year ago     [shardformer] fix master param sync for hybrid plugin/rewrite unwrapping logic (#4758)
Hongxin Liu    079bf3cb26  1 year ago     [misc] update pre-commit and run all files (#4752)
Hongxin Liu    fae6c92ead  1 year ago     Merge branch 'main' into feature/shardformer
Hongxin Liu    807e01a4ba  1 year ago     [zero] hotfix master param sync (#4618)
Hongxin Liu    a39a5c66fe  1 year ago     Merge branch 'main' into feature/shardformer
Hongxin Liu    63ecafb1fb  1 year ago     [checkpointio] optimize zero optim checkpoint io (#4591)
LuGY           cbac782254  1 year ago     [zero]fix zero ckptIO with offload (#4529)
flybird11111   ec18fc7340  1 year ago     [shardformer] support pp+tp+zero1 tests (#4531)
Jianghai       376533a564  1 year ago     [shardformer] zero1+pp and the corresponding tests (#4517)
LuGY           839847b7d7  1 year ago     [zero]support zero2 with gradient accumulation (#4511)
LuGY           d86ddd9b29  1 year ago     [hotfix] fix unsafe async comm in zero (#4404)
LuGY           45b08f08cb  1 year ago     [zero] optimize the optimizer step time (#4221)
LuGY           1a49a5ea00  1 year ago     [zero] support shard optimizer state dict of zero (#4194)
LuGY           dd7cc58299  1 year ago     [zero] add state dict for low level zero (#4179)
LuGY           c668801d36  1 year ago     [zero] allow passing process group to zero12 (#4153)
LuGY           79cf1b5f33  1 year ago     [zero]support no_sync method for zero1 plugin (#4138)
LuGY           c6ab96983a  1 year ago     [zero] refactor low level zero for shard evenly (#4030)
digger yu      de0d7df33f  2 years ago    [nfc] fix typo colossalai/zero (#3923)
Hongxin Liu    ae02d4e4f7  2 years ago    [bf16] add bf16 support (#3882)
YH             a22407cc02  2 years ago    [zero] Suggests a minor change to confusing variable names in the ZeRO optimizer. (#3173)
Hongxin Liu    4b3240cb59  2 years ago    [booster] add low level zero plugin (#3594)
Hongxin Liu    173dad0562  2 years ago    [misc] add verbose arg for zero and op builder (#3552)
ver217         26b7aac0be  2 years ago    [zero] reorganize zero/gemini folder structure (#3424)