Commit Graph

24 Commits (d4a436051d4827795f6f36e2a62dbbcc5fc4890f)

Author SHA1 Message Date
Hongxin Liu d4a436051d [checkpointio] support async model save (#6131)
1 week ago
BurkeHulk 6d6cafabe2 pre-commit fix
1 month ago
BurkeHulk b10339df7c fix lora ckpt save format (ColoTensor to Tensor)
1 month ago
Guangyao Zhang bdb125f83f
[doc] FP8 training and communication document (#6050)
3 months ago
Wang Binluo eea37da6fa
[fp8] Merge feature/fp8_comm to main branch of Colossalai (#6016)
3 months ago
Hanks b480eec738
[Feature]: support FP8 communication in DDP, FSDP, Gemini (#5928)
4 months ago
Hongxin Liu 7f8b16635b
[misc] refactor launch API and tensor constructor (#5666)
7 months ago
linsj20 91fa553775 [Feature] qlora support (#5586)
7 months ago
Baizhou Zhang 14b0d4c7e5 [lora] add lora APIs for booster, support lora for TorchDDP (#4981)
7 months ago
Baizhou Zhang a2db75546d
[doc] polish shardformer doc (#4779)
1 year ago
Baizhou Zhang c0a033700c
[shardformer] fix master param sync for hybrid plugin/rewrite unwrapping logic (#4758)
1 year ago
Hongxin Liu 079bf3cb26
[misc] update pre-commit and run all files (#4752)
1 year ago
LuGY 79cf1b5f33 [zero]support no_sync method for zero1 plugin (#4138)
1 year ago
Baizhou Zhang 822c3d4d66
[checkpointio] sharded optimizer checkpoint for DDP plugin (#4002)
1 year ago
Wenhao Chen 725af3eeeb
[booster] make optimizer argument optional for boost (#3993)
1 year ago
Baizhou Zhang c9cff7e7fa
[checkpointio] General Checkpointing of Sharded Optimizers (#3984)
1 year ago
Hongxin Liu 5452df63c5
[plugin] torch ddp plugin supports sharded model checkpoint (#3775)
2 years ago
Hongxin Liu 6552cbf8e1
[booster] fix no_sync method (#3709)
2 years ago
Hongxin Liu 3bf09efe74
[booster] update prepare dataloader method for plugin (#3706)
2 years ago
Hongxin Liu d0915f54f4
[booster] refactor all dp fashion plugins (#3684)
2 years ago
Frank Lee 7d8d825681
[booster] fixed the torch ddp plugin with the new checkpoint api (#3442)
2 years ago
Frank Lee 1beb85cc25
[checkpoint] refactored the API and added safetensors support (#3427)
2 years ago
Frank Lee 73d3e4d309
[booster] implemented the torch ddd + resnet example (#3232)
2 years ago
Frank Lee e7f3bed2d3
[booster] added the plugin base and torch ddp plugin (#3180)
2 years ago