ColossalAI

Baizhou Zhang 21ba89cab6 [gemini] support gradient accumulation (#4869)	2023-10-17 14:07:21 +08:00
* add test
* fix no_sync bug in low level zero plugin
* fix test
* add argument for grad accum
* add grad accum in backward hook for gemini
* finish implementation, rewrite tests
* fix test
* skip stuck model in low level zero test
* update doc
* optimize communication & fix gradient checkpoint
* modify doc
* cleaning codes
* update cpu adam fp16 case
__init__.py	[misc] update pre-commit and run all files (#4752)	2023-09-19 14:20:26 +08:00
dp_plugin_base.py	[misc] update pre-commit and run all files (#4752)	2023-09-19 14:20:26 +08:00
gemini_plugin.py	[gemini] support gradient accumulation (#4869)	2023-10-17 14:07:21 +08:00
hybrid_parallel_plugin.py	[feature] support no master weights option for low level zero plugin (#4816)	2023-10-13 07:57:45 +00:00
low_level_zero_plugin.py	[gemini] support gradient accumulation (#4869)	2023-10-17 14:07:21 +08:00
plugin_base.py	[misc] update pre-commit and run all files (#4752)	2023-09-19 14:20:26 +08:00
pp_plugin_base.py	[misc] update pre-commit and run all files (#4752)	2023-09-19 14:20:26 +08:00
torch_ddp_plugin.py	[doc] polish shardformer doc (#4779)	2023-09-26 10:57:47 +08:00
torch_fsdp_plugin.py	[doc] polish shardformer doc (#4779)	2023-09-26 10:57:47 +08:00