ColossalAI/tests/test_pipeline
Kirigaya Kazuto f1e1836218
[pipeline/pipeline_process_group] finish PipelineProcessGroup to manage local and global rank in TP, DP and PP (#1508)
* support p2p communication with any type of object | pass test

* reconstruct pipeline schedule with p2p_v2.py (support communication with List[Any]) | pass test

* [engine/schedule] use p2p_v2 to reconstruct pipeline_schedule

* [pipeline/rpc] implement a demo for PP with cuda rpc framework

* [pipeline/rpc] support interleaving | fix checkpoint bug | change logic when dispatching data in work_list to ensure steady 1F1B

* [pipeline/rpc] implement distributed optimizer | test with assert_close

* [pipeline/rpc] update outstanding mechanism | optimize dispatching strategy

* [pipeline/pipeline_process_group] finish PipelineProcessGroup to manage local and global rank in TP, DP and PP

* [pipeline/pipeline_process_group] remove comment

* [pipeline/pipeline_process_group] skip process group test

* [pipeline/pipeline_process_group] remove test named function
2022-09-01 17:45:47 +08:00
rpc_test_utils.py [pipeline/pipeline_process_group] finish PipelineProcessGroup to manage local and global rank in TP, DP and PP (#1508) 2022-09-01 17:45:47 +08:00
test_cuda_rpc_optimizer.py [pipeline/rpc] update outstanding mechanism | optimize dispatching strategy (#1497) 2022-08-26 14:04:23 +08:00
test_cuda_rpc_pipeline.py [pipeline/rpc] update outstanding mechanism | optimize dispatching strategy (#1497) 2022-08-26 14:04:23 +08:00
test_cuda_rpc_value_correctness.py [pipeline/rpc] update outstanding mechanism | optimize dispatching strategy (#1497) 2022-08-26 14:04:23 +08:00
test_pipelinable.py
test_pipeline_process_group.py [pipeline/pipeline_process_group] finish PipelineProcessGroup to manage local and global rank in TP, DP and PP (#1508) 2022-09-01 17:45:47 +08:00