14 Commits (a52f62082de0f4b4544ba2d04e909f74123425ce)

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Frank Lee | a5883aa790 | [test] fixed codefactor format report (#4026) | 1 year ago |
| Baizhou Zhang | 822c3d4d66 | [checkpointio] sharded optimizer checkpoint for DDP plugin (#4002) | 1 year ago |
| Baizhou Zhang | c9cff7e7fa | [checkpointio] General Checkpointing of Sharded Optimizers (#3984) | 1 year ago |
| wukong1992 | 6b305a99d6 | [booster] torch fsdp fix ckpt (#3788) | 2 years ago |
| Hongxin Liu | 3c07a2846e | [plugin] a workaround for zero plugins' optimizer checkpoint (#3780) | 2 years ago |
| Hongxin Liu | 5452df63c5 | [plugin] torch ddp plugin supports sharded model checkpoint (#3775) | 2 years ago |
| Hongxin Liu | afb239bbf8 | [devops] update torch version of CI (#3725) | 2 years ago |
| jiangmingyan | 20068ba188 | [booster] add tests for ddp and low level zero's checkpointio (#3715) | 2 years ago |
| jiangmingyan | 307894f74d | [booster] gemini plugin support shard checkpoint (#3610) | 2 years ago |
| jiangmingyan | 52a933e175 | [checkpoint] support huggingface style sharded checkpoint (#3461) | 2 years ago |
| Frank Lee | 80eba05b0a | [test] refactor tests with spawn (#3452) | 2 years ago |
| Frank Lee | 1beb85cc25 | [checkpoint] refactored the API and added safetensors support (#3427) | 2 years ago |
| Frank Lee | 73d3e4d309 | [booster] implemented the torch ddd + resnet example (#3232) | 2 years ago |
| Frank Lee | cd142fbefa | [api] implemented the checkpoint io module (#3205) | 2 years ago |