ColossalAI/tests/test_moe
Latest commit 9b9b76bdcd by botbw: [moe] add mixtral dp grad scaling when not all experts are activated (2024-08-01 10:06:59 +08:00)
File                        Last commit message                                                    Last commit date
moe_utils.py                [zero] solve hang                                                      2024-08-01 10:06:59 +08:00
test_deepseek_layer.py      [shardformer] DeepseekMoE support (#5871)                              2024-07-05 16:13:58 +08:00
test_grad_handler.py        [MoE/ZeRO] Moe refactor with zero refactor (#5821)                     2024-06-28 14:00:08 +08:00
test_kernel.py              [MoE/ZeRO] Moe refactor with zero refactor (#5821)                     2024-06-28 14:00:08 +08:00
test_mixtral_layer.py       [MoE/ZeRO] Moe refactor with zero refactor (#5821)                     2024-06-28 14:00:08 +08:00
test_moe_checkpoint.py      [zero] solve hang                                                      2024-08-01 10:06:59 +08:00
test_moe_ep_tp.py           [misc] solve booster hang by rename the variable                       2024-08-01 10:06:59 +08:00
test_moe_ep_zero.py         [moe] add mixtral dp grad scaling when not all experts are activated   2024-08-01 10:06:59 +08:00
test_moe_group.py           [MoE/ZeRO] Moe refactor with zero refactor (#5821)                     2024-06-28 14:00:08 +08:00
test_moe_hybrid_zero.py     [MoE/ZeRO] Moe refactor with zero refactor (#5821)                     2024-06-28 14:00:08 +08:00
test_moe_load_balance.py    [MoE/ZeRO] Moe refactor with zero refactor (#5821)                     2024-06-28 14:00:08 +08:00
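As a usage note, the files above are standard pytest test modules. Below is a minimal sketch of invoking one of the listed files programmatically; it assumes pytest is installed and the script is run from the ColossalAI repository root, and it is illustrative only, not part of the repository.

```python
# Minimal sketch: run one of the MoE test files listed above via pytest.
# Assumption: pytest is installed and the working directory is the
# ColossalAI repository root. The file path matches the listing above.
import sys

import pytest

if __name__ == "__main__":
    # -s disables output capturing so test logs are printed.
    exit_code = pytest.main(["-s", "tests/test_moe/test_mixtral_layer.py"])
    sys.exit(exit_code)
```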