Baizhou Zhang
a2db75546d
[doc] polish shardformer doc ( #4779 )
...
* fix example format in docstring
* polish shardformer doc
2023-09-26 10:57:47 +08:00
Baizhou Zhang
c0a033700c
[shardformer] fix master param sync for hybrid plugin/rewrite unwrapping logic ( #4758 )
...
* fix master param sync for hybrid plugin
* rewrite unwrap for ddp/fsdp
* rewrite unwrap for zero/gemini
* rewrite unwrap for hybrid plugin
* fix geemini unwrap
* fix bugs
2023-09-20 18:29:37 +08:00
Hongxin Liu
079bf3cb26
[misc] update pre-commit and run all files ( #4752 )
...
* [misc] update pre-commit
* [misc] run pre-commit
* [misc] remove useless configuration files
* [misc] ignore cuda for clang-format
2023-09-19 14:20:26 +08:00
Hongxin Liu
807e01a4ba
[zero] hotfix master param sync ( #4618 )
...
* [zero] add method to update master params
* [zero] update zero plugin
* [plugin] update low level zero plugin
2023-09-05 15:04:02 +08:00
Hongxin Liu
63ecafb1fb
[checkpointio] optimize zero optim checkpoint io ( #4591 )
...
* [zero] update checkpoint io to save memory
* [checkpointio] add device map to save memory
2023-09-04 11:26:45 +08:00
LuGY
1a49a5ea00
[zero] support shard optimizer state dict of zero ( #4194 )
...
* support shard optimizer of zero
* polish code
* support sync grad manually
2023-07-31 22:13:29 +08:00
LuGY
79cf1b5f33
[zero]support no_sync method for zero1 plugin ( #4138 )
...
* support no sync for zero1 plugin
* polish
* polish
2023-07-31 22:13:29 +08:00
梁爽
abe4f971e0
[NFC] polish colossalai/booster/plugin/low_level_zero_plugin.py code style ( #4256 )
...
Co-authored-by: supercooledith <893754954@qq.com>
2023-07-26 14:12:57 +08:00
Wenhao Chen
725af3eeeb
[booster] make optimizer argument optional for boost ( #3993 )
...
* feat: make optimizer optional in Booster.boost
* test: skip unet test if diffusers version > 0.10.2
2023-06-15 17:38:42 +08:00
Hongxin Liu
ae02d4e4f7
[bf16] add bf16 support ( #3882 )
...
* [bf16] add bf16 support for fused adam (#3844 )
* [bf16] fused adam kernel support bf16
* [test] update fused adam kernel test
* [test] update fused adam test
* [bf16] cpu adam and hybrid adam optimizers support bf16 (#3860 )
* [bf16] implement mixed precision mixin and add bf16 support for low level zero (#3869 )
* [bf16] add mixed precision mixin
* [bf16] low level zero optim support bf16
* [text] update low level zero test
* [text] fix low level zero grad acc test
* [bf16] add bf16 support for gemini (#3872 )
* [bf16] gemini support bf16
* [test] update gemini bf16 test
* [doc] update gemini docstring
* [bf16] add bf16 support for plugins (#3877 )
* [bf16] add bf16 support for legacy zero (#3879 )
* [zero] init context support bf16
* [zero] legacy zero support bf16
* [test] add zero bf16 test
* [doc] add bf16 related docstring for legacy zero
2023-06-05 15:58:31 +08:00
Hongxin Liu
3c07a2846e
[plugin] a workaround for zero plugins' optimizer checkpoint ( #3780 )
...
* [test] refactor torch ddp checkpoint test
* [plugin] update low level zero optim checkpoint
* [plugin] update gemini optim checkpoint
2023-05-19 19:42:31 +08:00
Hongxin Liu
6552cbf8e1
[booster] fix no_sync method ( #3709 )
...
* [booster] fix no_sync method
* [booster] add test for ddp no_sync
* [booster] fix merge
* [booster] update unit test
* [booster] update unit test
* [booster] update unit test
2023-05-09 11:10:02 +08:00
Hongxin Liu
3bf09efe74
[booster] update prepare dataloader method for plugin ( #3706 )
...
* [booster] add prepare dataloader method for plug
* [booster] update examples and docstr
2023-05-08 15:44:03 +08:00
Hongxin Liu
d0915f54f4
[booster] refactor all dp fashion plugins ( #3684 )
...
* [booster] add dp plugin base
* [booster] inherit dp plugin base
* [booster] refactor unit tests
2023-05-05 19:36:10 +08:00
Hongxin Liu
4b3240cb59
[booster] add low level zero plugin ( #3594 )
...
* [booster] add low level zero plugin
* [booster] fix gemini plugin test
* [booster] fix precision
* [booster] add low level zero plugin test
* [test] fix booster plugin test oom
* [test] fix booster plugin test oom
* [test] fix googlenet and inception output trans
* [test] fix diffuser clip vision model
* [test] fix torchaudio_wav2vec2_base
* [test] fix low level zero plugin test
2023-04-26 14:37:25 +08:00