ColossalAI/tests/test_moe
Wenhao Chen 724441279b
[moe]: fix ep/tp tests, add hierarchical all2all (#4982)
* fix: add warning for EP different behavior

* fix: use shard_data in ep & tp model

* to: add used_capacity

* fix: fix router test

* feat: add create_ep_node_group

* feat: add create_ep_hierarchical_group fn

* feat: add HierarchicalAllToAll

* test: add hierarchical all2all test

* fix: fix test errors

* fix: simplify create_ep_hierarchical_group

* fix: add hierarchical_alltoall arg

* fix: fix environ typo

* revert: revert process mesh order

* to: add todo mark

* fix: skip hierarchical_comm if torch < 1.13.1
2023-11-09 06:31:00 +00:00
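The commit above adds a hierarchical all-to-all for expert parallelism, built on top of a helper (`create_ep_hierarchical_group`) that splits the expert-parallel ranks into intra-node groups plus one inter-node group of node leaders, and that is skipped when torch is older than 1.13.1. The following is a minimal illustrative sketch of that grouping idea using plain `torch.distributed.new_group`; the helper name, signature, and version check shown here are assumptions for illustration and may not match the actual ColossalAI implementation.

```python
import torch
import torch.distributed as dist
from packaging import version


def create_hierarchical_groups(ep_ranks, nproc_per_node):
    """Hypothetical sketch: split expert-parallel ranks into per-node groups
    plus one group of node "leaders" used for the inter-node all-to-all."""
    # The commit skips hierarchical communication on torch < 1.13.1.
    assert version.parse(torch.__version__) >= version.parse("1.13.1"), \
        "hierarchical all-to-all is only used with torch >= 1.13.1"

    rank = dist.get_rank()
    intra_group = None
    inter_group = None

    # One process group per node: consecutive chunks of nproc_per_node ranks.
    nodes = [ep_ranks[i:i + nproc_per_node]
             for i in range(0, len(ep_ranks), nproc_per_node)]
    for node_ranks in nodes:
        # new_group must be called collectively by every rank, in the same order.
        group = dist.new_group(node_ranks)
        if rank in node_ranks:
            intra_group = group

    # One group containing the first rank of each node for cross-node traffic.
    leaders = [node_ranks[0] for node_ranks in nodes]
    group = dist.new_group(leaders)
    if rank in leaders:
        inter_group = group

    return intra_group, inter_group
```

With such groups, a hierarchical all-to-all can be expressed as: gather tokens onto the node leader over `intra_group`, run a single all-to-all among leaders over `inter_group`, then scatter the results back inside each node, which trades extra intra-node copies for fewer cross-node messages.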
moe_utils.py [moe]: fix ep/tp tests, add hierarchical all2all (#4982) 2023-11-09 06:31:00 +00:00
test_grad_handler.py [moe] support optimizer checkpoint (#5015) 2023-11-08 15:07:03 +00:00
test_kernel.py [moe] support optimizer checkpoint (#5015) 2023-11-08 15:07:03 +00:00
test_moe_checkpoint.py [moe] support optimizer checkpoint (#5015) 2023-11-08 15:07:03 +00:00
test_moe_ep_tp.py [moe]: fix ep/tp tests, add hierarchical all2all (#4982) 2023-11-09 06:31:00 +00:00
test_moe_group.py [moe] support optimizer checkpoint (#5015) 2023-11-08 15:07:03 +00:00
test_moe_hybrid_zero.py [moe] support optimizer checkpoint (#5015) 2023-11-08 15:07:03 +00:00
test_moe_load_balance.py [moe] support optimizer checkpoint (#5015) 2023-11-08 15:07:03 +00:00
test_moe_router.py [moe]: fix ep/tp tests, add hierarchical all2all (#4982) 2023-11-09 06:31:00 +00:00
test_moe_zero_fwd_bwd.py [moe] support optimizer checkpoint (#5015) 2023-11-08 15:07:03 +00:00
test_moe_zero_optim.py [moe] support optimizer checkpoint (#5015) 2023-11-08 15:07:03 +00:00