Commit Graph

56 Commits (fix-setup)

Author SHA1 Message Date
Hongxin Liu 7f8b16635b
[misc] refactor launch API and tensor constructor (#5666)
7 months ago
Hongxin Liu 1b387ca9fe
[shardformer] refactor pipeline grad ckpt config (#5646)
7 months ago
Hongxin Liu 641b1ee71a
[devops] remove post commit ci (#5566)
8 months ago
Wenhao Chen e614aa34f3
[shardformer, pipeline] add `gradient_checkpointing_ratio` and heterogenous shard policy for llama (#5508)
8 months ago
Insu Jang 00525f7772
[shardformer] fix pipeline forward error if custom layer distribution is used (#5189)
8 months ago
Wenhao Chen bb0a668fee
[hotfix] set return_outputs=False in examples and polish code (#5404)
8 months ago
ver217 148469348a Merge branch 'main' into sync/npu
10 months ago
Wenhao Chen ef4f0ee854
[hotfix]: add pp sanity check and fix mbs arg (#5268)
10 months ago
Hongxin Liu d202cc28c0
[npu] change device to accelerator api (#5239)
11 months ago
Elsa Granger d565df3821
[pipeline] A more general _communicate in p2p (#5062)
11 months ago
Wenhao Chen d799a3088f
[pipeline]: add p2p fallback order and fix interleaved pp deadlock (#5214)
11 months ago
Wenhao Chen 4fa689fca1
[pipeline]: fix p2p comm, add metadata cache and support llama interleaved pp (#5134)
11 months ago
Wenhao Chen 7172459e74
[shardformer]: support gpt-j, falcon, Mistral and add interleaved pipeline for bert (#5088)
1 year ago
Hongxin Liu 079bf3cb26
[misc] update pre-commit and run all files (#4752)
1 year ago
Hongxin Liu b5f9e37c70
[legacy] clean up legacy code (#4743)
1 year ago
Hongxin Liu 554aa9592e
[legacy] move communication and nn to legacy and refactor logger (#4671)
1 year ago
Baizhou Zhang 660eed9124
[pipeline] set optimizer to optional in execute_pipeline (#4630)
1 year ago
Hongxin Liu fae6c92ead
Merge branch 'main' into feature/shardformer
1 year ago
Hongxin Liu 89fe027787 [legacy] move trainer to legacy (#4545)
1 year ago
Hongxin Liu a39a5c66fe
Merge branch 'main' into feature/shardformer
1 year ago
Hongxin Liu 508ca36fe3
[pipeline] 1f1b schedule receive microbatch size (#4589)
1 year ago
Hongxin Liu 27061426f7
[gemini] improve compatibility and add static placement policy (#4479)
1 year ago
Jianghai 8739aa7fa0
[shardformer] Pipeline/whisper (#4456)
1 year ago
LuGY a78daf6180
[shardformer] support interleaved pipeline (#4448)
1 year ago
github-actions[bot] d20dceb9a3
[format] applied code formatting on changed files in pull request 4441 (#4445)
1 year ago
Jianghai a88e92251d [pipeline] add chatglm (#4363)
1 year ago
Jianghai f13954cd58 [pipeline] refactor test pipeline and remove useless utils in pipeline (#4324)
1 year ago
LuGY d3c6cd66f3 [pipeline] add unit test for 1f1b (#4303)
1 year ago
Baizhou Zhang 36e546b2cc [pipeline] add pipeline support for T5Stack/T5EncoderModel (#4300)
1 year ago
Jianghai d8408d185c [pipeline] OPT model pipeline (#4258)
1 year ago
Jianghai e7cc62d735 [pipeline] All bert models (#4233)
1 year ago
Jianghai f3bcc292c8 [pipeline] move bert related pipeline components to shardformer (#4187)
1 year ago
Jianghai c5ea728016 [pipeline] add bert_for_pretraining bert_lmhead forward and policy (#4172)
1 year ago
Jianghai 90a65ea682 [pipeline] build bloom model and policy , revise the base class of policy (#4161)
1 year ago
Jianghai c552cefa93 [pipeline]add pipeline policy and bert forward (#4130)
1 year ago
Hongxin Liu 5c897ddb94 [pipeline] add stage manager (#4093)
1 year ago
Jianghai e8e7e49243 [pipeline]add pipeline policy and bert forward (#4130)
1 year ago
Hongxin Liu f51ce1bc8e [pipeline] refactor 1f1b schedule (#4115)
1 year ago
Hongxin Liu 45fdc9b42c [pipeline] implement p2p communication (#4100)
1 year ago
Hongxin Liu 422544222f [pipeline] add stage manager (#4093)
1 year ago
Frank Lee 80eba05b0a
[test] refactor tests with spawn (#3452)
2 years ago
Ziyue Jiang 09d69e1c25
[PP Middleware] Add bwd and step for PP middleware (#2111)
2 years ago
Ziyue Jiang e4705ba4e2
[Pipeline Middleware] fix data race in Pipeline Scheduler for DAG (#2087)
2 years ago
Ziyue Jiang 597cdd3006
[Pipeline Middleware] Adapt scheduler for Topo (#2066)
2 years ago
Ziyue Jiang b0936e4a44
[rpc] split with dag (#2028)
2 years ago
Super Daniel 393f594051
[fx/meta/rpc] move _meta_registration.py to fx folder / register fx functions with compatibility checks / remove color debug (#1710)
2 years ago
Kirigaya Kazuto 9708638ded
[pipeline/pytree] add pytree to process args and kwargs | provide `data_process_func` to process args and kwargs after forward (#1642)
2 years ago
Kirigaya Kazuto 170fa81095
[pipeline/chimera] test chimera | fix bug of initializing (#1615)
2 years ago
Kirigaya Kazuto edc9e419ad
[pipeline/chimera] reconstruct PipelineBase and Worker to support more feasible custom schedule | finish Chimera (#1595)
2 years ago
Kirigaya Kazuto 6159d45417
[pipeline/tuning] improve dispatch performance both time and space cost (#1544)
2 years ago