ColossalAI/colossalai/booster/plugin
Hongxin Liu da39d21b71 [moe] support mixtral (#5309)
* [moe] add mixtral block for single expert

* [moe] mixtral block fwd support uneven ep

* [moe] mixtral block bwd support uneven ep

* [moe] add mixtral moe layer

* [moe] simplify replace

* [moe] support save sharded mixtral

* [moe] support load sharded mixtral

* [moe] support save sharded optim

* [moe] integrate moe manager into plugin

* [moe] fix optimizer load

* [moe] fix mixtral layer
2024-02-07 19:21:02 +08:00
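The commit steps above repeatedly mention "uneven ep", i.e. expert parallelism where the number of experts need not divide evenly by the expert-parallel group size. As a minimal sketch of that partitioning idea (an illustration only, not ColossalAI's actual implementation; `split_experts` is a hypothetical helper):

```python
# Illustrative sketch of uneven expert parallelism (EP): assign expert
# indices to each EP rank, letting earlier ranks absorb the remainder
# when num_experts is not divisible by ep_size.

def split_experts(num_experts: int, ep_size: int) -> list[list[int]]:
    """Return the list of expert indices owned by each EP rank."""
    base, rem = divmod(num_experts, ep_size)
    assignment, start = [], 0
    for rank in range(ep_size):
        # The first `rem` ranks each hold one extra expert.
        count = base + (1 if rank < rem else 0)
        assignment.append(list(range(start, start + count)))
        start += count
    return assignment

# e.g. 8 Mixtral experts over 3 EP ranks -> [[0, 1, 2], [3, 4, 5], [6, 7]]
```

Forward and backward passes then only need each rank to compute with its local expert slice, with collectives (e.g. all-to-all) routing tokens between ranks.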
__init__.py                   | [misc] update pre-commit and run all files (#4752) | 2023-09-19 14:20:26 +08:00
dp_plugin_base.py             | [misc] update pre-commit and run all files (#4752) | 2023-09-19 14:20:26 +08:00
gemini_plugin.py              | Merge branch 'main' into sync/npu                  | 2024-01-18 12:05:21 +08:00
hybrid_parallel_plugin.py     | [fix] remove unnecessary dp_size assert (#5351)    | 2024-02-02 14:40:20 +08:00
low_level_zero_plugin.py      | [npu] change device to accelerator api (#5239)     | 2024-01-09 10:20:05 +08:00
moe_hybrid_parallel_plugin.py | [moe] support mixtral (#5309)                      | 2024-02-07 19:21:02 +08:00
plugin_base.py                | [misc] update pre-commit and run all files (#4752) | 2023-09-19 14:20:26 +08:00
pp_plugin_base.py             | [misc] update pre-commit and run all files (#4752) | 2023-09-19 14:20:26 +08:00
torch_ddp_plugin.py           | [doc] polish shardformer doc (#4779)               | 2023-09-26 10:57:47 +08:00
torch_fsdp_plugin.py          | [doc] polish shardformer doc (#4779)               | 2023-09-26 10:57:47 +08:00