1407 Commits (997544c1f90b9a1549e91a6d97ee3902c2ac0ed4)

Author SHA1 Message Date
FoolPlayer 997544c1f9 [shardformer] update readme with modules implement doc (#3834) 1 year ago
Frank Lee 537a52b7a2 [shardformer] refactored the user api (#3828) 1 year ago
Frank Lee bc19024bf9 [shardformer] updated readme (#3827) 1 year ago
FoolPlayer 58f6432416 [shardformer]: Feature/shardformer, add some docstring and readme (#3816) 1 year ago
FoolPlayer 6a69b44dfc [shardformer] init shardformer code structure (#3731) 1 year ago
Frank Lee eb39154d40 [dtensor] updated api and doc (#3845) 1 year ago
Hongxin Liu 9c88b6cbd1 [lazy] fix compatibility problem on torch 1.13 (#3911) 1 year ago
digger yu 0e484e6201 [nfc]fix typo colossalai/pipeline tensor nn (#3899) 1 year ago
Baizhou Zhang c1535ccbba [doc] fix docs about booster api usage (#3898) 1 year ago
digger yu 1878749753 [nfc] fix typo colossalai/nn (#3887) 1 year ago
Hongxin Liu ae02d4e4f7 [bf16] add bf16 support (#3882) 1 year ago
Liu Ziming 8065cc5fba Modify torch version requirement to adapt torch 2.0 (#3896) 1 year ago
Hongxin Liu dbb32692d2 [lazy] refactor lazy init (#3891) 1 year ago
digger yu 70c8cdecf4 [nfc] fix typo colossalai/cli fx kernel (#3847) 1 year ago
digger yu e2d81eba0d [nfc] fix typo colossalai/ applications/ (#3831) 2 years ago
wukong1992 3229f93e30 [booster] add warning for torch fsdp plugin doc (#3833) 2 years ago
Hongxin Liu 7c9f2ed6dd [dtensor] polish sharding spec docstring (#3838) 2 years ago
digger yu 7f8203af69 fix typo colossalai/auto_parallel autochunk fx/passes etc. (#3808) 2 years ago
wukong1992 6b305a99d6 [booster] torch fsdp fix ckpt (#3788) 2 years ago
digger yu 9265f2d4d7 [NFC]fix typo colossalai/auto_parallel nn utils etc. (#3779) 2 years ago
jiangmingyan e871e342b3 [API] add docstrings and initialization to apex amp, naive amp (#3783) 2 years ago
Frank Lee f5c425c898 fixed the example docstring for booster (#3795) 2 years ago
Hongxin Liu 72688adb2f [doc] add booster docstring and fix autodoc (#3789) 2 years ago
Hongxin Liu 3c07a2846e [plugin] a workaround for zero plugins' optimizer checkpoint (#3780) 2 years ago
Hongxin Liu 60e6a154bc [doc] add tutorial for booster checkpoint (#3785) 2 years ago
digger yu 32f81f14d4 [NFC] fix typo colossalai/amp auto_parallel autochunk (#3756) 2 years ago
Hongxin Liu 5452df63c5 [plugin] torch ddp plugin supports sharded model checkpoint (#3775) 2 years ago
jiangmingyan 2703a37ac9 [amp] Add naive amp demo (#3774) 2 years ago
digger yu 1baeb39c72 [NFC] fix typo with colossalai/auto_parallel/tensor_shard (#3742) 2 years ago
wukong1992 b37797ed3d [booster] support torch fsdp plugin in booster (#3697) 2 years ago
digger-yu ad6460cf2c [NFC] fix typo applications/ and colossalai/ (#3735) 2 years ago
digger-yu b7141c36dd [CI] fix some spelling errors (#3707) 2 years ago
jiangmingyan 20068ba188 [booster] add tests for ddp and low level zero's checkpointio (#3715) 2 years ago
Hongxin Liu 6552cbf8e1 [booster] fix no_sync method (#3709) 2 years ago
Hongxin Liu 3bf09efe74 [booster] update prepare dataloader method for plugin (#3706) 2 years ago
Hongxin Liu f83ea813f5 [example] add train resnet/vit with booster example (#3694) 2 years ago
YH 2629f9717d [tensor] Refactor handle_trans_spec in DistSpecManager 2 years ago
Hongxin Liu d0915f54f4 [booster] refactor all dp fashion plugins (#3684) 2 years ago
jiangmingyan 307894f74d [booster] gemini plugin support shard checkpoint (#3610) 2 years ago
YH a22407cc02 [zero] Suggests a minor change to confusing variable names in the ZeRO optimizer. (#3173) 2 years ago
Hongxin Liu 50793b35f4 [gemini] accelerate inference (#3641) 2 years ago
Hongxin Liu 4b3240cb59 [booster] add low level zero plugin (#3594) 2 years ago
digger-yu b9a8dff7e5 [doc] Fix typo under colossalai and doc(#3618) 2 years ago
Hongxin Liu 12eff9eb4c [gemini] state dict supports fp16 (#3590) 2 years ago
Hongxin Liu dac127d0ee [fx] fix meta tensor registration (#3589) 2 years ago
Hongxin Liu f313babd11 [gemini] support save state dict in shards (#3581) 2 years ago
YH d329c294ec Add docstr for zero3 chunk search utils (#3572) 2 years ago
Hongxin Liu 173dad0562 [misc] add verbose arg for zero and op builder (#3552) 2 years ago
Hongxin Liu 4341f5e8e6 [lazyinit] fix clone and deepcopy (#3553) 2 years ago
Hongxin Liu 152239bbfa [gemini] gemini supports lazy init (#3379) 2 years ago