43 Commits (39f2582e987871c198f2f2526cd4435cbd569741)

Author SHA1 Message Date
Hongxin Liu 079bf3cb26 [misc] update pre-commit and run all files (#4752) 1 year ago
Hongxin Liu b5f9e37c70 [legacy] clean up legacy code (#4743) 1 year ago
Hongxin Liu 554aa9592e [legacy] move communication and nn to legacy and refactor logger (#4671) 1 year ago
Baizhou Zhang 660eed9124 [pipeline] set optimizer to optional in execute_pipeline (#4630) 1 year ago
Hongxin Liu 89fe027787 [legacy] move trainer to legacy (#4545) 1 year ago
Hongxin Liu 508ca36fe3 [pipeline] 1f1b schedule receive microbatch size (#4589) 1 year ago
Hongxin Liu 27061426f7 [gemini] improve compatibility and add static placement policy (#4479) 1 year ago
Jianghai 8739aa7fa0 [shardformer] Pipeline/whisper (#4456) 1 year ago
LuGY a78daf6180 [shardformer] support interleaved pipeline (#4448) 1 year ago
github-actions[bot] d20dceb9a3 [format] applied code formatting on changed files in pull request 4441 (#4445) 1 year ago
Jianghai a88e92251d [pipeline] add chatglm (#4363) 1 year ago
Jianghai f13954cd58 [pipeline] refactor test pipeline and remove useless utils in pipeline (#4324) 1 year ago
LuGY d3c6cd66f3 [pipeline] add unit test for 1f1b (#4303) 1 year ago
Baizhou Zhang 36e546b2cc [pipeline] add pipeline support for T5Stack/T5EncoderModel (#4300) 1 year ago
Jianghai d8408d185c [pipeline] OPT model pipeline (#4258) 1 year ago
Jianghai e7cc62d735 [pipeline] All bert models (#4233) 1 year ago
Jianghai f3bcc292c8 [pipeline] move bert related pipeline components to shardformer (#4187) 1 year ago
Jianghai c5ea728016 [pipeline] add bert_for_pretraining bert_lmhead forward and policy (#4172) 1 year ago
Jianghai 90a65ea682 [pipeline] build bloom model and policy , revise the base class of policy (#4161) 1 year ago
Jianghai c552cefa93 [pipeline]add pipeline policy and bert forward (#4130) 1 year ago
Hongxin Liu 5c897ddb94 [pipeline] add stage manager (#4093) 1 year ago
Jianghai e8e7e49243 [pipeline]add pipeline policy and bert forward (#4130) 1 year ago
Hongxin Liu f51ce1bc8e [pipeline] refactor 1f1b schedule (#4115) 1 year ago
Hongxin Liu 45fdc9b42c [pipeline] implement p2p communication (#4100) 1 year ago
Hongxin Liu 422544222f [pipeline] add stage manager (#4093) 1 year ago
Frank Lee 80eba05b0a [test] refactor tests with spawn (#3452) 2 years ago
Ziyue Jiang 09d69e1c25 [PP Middleware] Add bwd and step for PP middleware (#2111) 2 years ago
Ziyue Jiang e4705ba4e2 [Pipeline Middleware] fix data race in Pipeline Scheduler for DAG (#2087) 2 years ago
Ziyue Jiang 597cdd3006 [Pipeline Middleware] Adapt scheduler for Topo (#2066) 2 years ago
Ziyue Jiang b0936e4a44 [rpc] split with dag (#2028) 2 years ago
Super Daniel 393f594051 [fx/meta/rpc] move _meta_registration.py to fx folder / register fx functions with compatibility checks / remove color debug (#1710) 2 years ago
Kirigaya Kazuto 9708638ded [pipeline/pytree] add pytree to process args and kwargs | provide `data_process_func` to process args and kwargs after forward (#1642) 2 years ago
Kirigaya Kazuto 170fa81095 [pipeline/chimera] test chimera | fix bug of initializing (#1615) 2 years ago
Kirigaya Kazuto edc9e419ad [pipeline/chimera] reconstruct PipelineBase and Worker to support more feasible custom schedule | finish Chimera (#1595) 2 years ago
Kirigaya Kazuto 6159d45417 [pipeline/tuning] improve dispatch performance both time and space cost (#1544) 2 years ago
Kirigaya Kazuto f1e1836218 [pipeline/pipleline_process_group] finish PipelineProcessGroup to manage local abd global rank in TP,DP and PP (#1508) 2 years ago
Kirigaya Kazuto 5a6fd71f90 [pipeline/rpc] update outstanding mechanism | optimize dispatching strategy (#1497) 2 years ago
Kirigaya Kazuto 9145aef2b4 [pipeline/rpc] implement distributed optimizer | test with assert_close (#1486) 2 years ago
Kirigaya Kazuto a6c8749198 [pipeline/rpc] support interleaving | fix checkpoint bug | change logic when dispatch data in work_list to ensure steady 1F1B (#1483) 2 years ago
Kirigaya Kazuto bb5f5289e0 [pipeline/rpc] implement a demo for PP with cuda rpc framework (#1470) 2 years ago
Frank Lee 2b2dc1c86b [pipeline] refactor the pipeline module (#1087) 2 years ago