Commit Graph

2822 Commits (3043d5d6769121639e0827cb8c0a587f130c9af3)
 

Author SHA1 Message Date
Jianghai d8408d185c [pipeline] OPT model pipeline (#4258)
1 year ago
Baizhou Zhang b774d5ea0f [pipeline] refactor gpt2 pipeline forwards (#4287)
1 year ago
Hongxin Liu d921ce8391 [shardformer] support inplace sharding (#4251)
1 year ago
Baizhou Zhang 2a2eacfaf1 [pipeline] support shardformer for GPT2ForQuestionAnswering & complete pipeline support for GPT2 (#4245)
1 year ago
Jianghai d9be0472ef [bugs] hot fix some testing bugs for new models (#4268)
1 year ago
Jianghai 34f0e34a4c [pipeline] finish bloom models pipeline and tests (#4223)
1 year ago
Jianghai e7cc62d735 [pipeline] All bert models (#4233)
1 year ago
Baizhou Zhang a14d352088 [pipeline] add pipeline forward for variants of gpt2 (#4238)
1 year ago
Hongxin Liu 7e4de520e1 [shardformer] fix base policy (#4229)
1 year ago
Baizhou Zhang 208ac8f2ba [pipeline] Add Pipeline Forward for GPT2Model Shardformer (#4224)
1 year ago
Jianghai 37d22f6878 [pipeline] add bloom model pipeline (#4210)
1 year ago
Jianghai 31bcf867ae [pipeline] Llama causal lm and llama for sequence classification pipeline (#4208)
1 year ago
Jianghai 1622031058 [pipeline] Llama pipeline (#4205)
1 year ago
Jianghai 1094e0f0d3 [pipeline] Bert pipeline for shardformer and its tests (#4197)
1 year ago
Hongxin Liu 890774b2fb [shardformer] support lazy init (#4202)
1 year ago
Jianghai f3bcc292c8 [pipeline] move bert related pipeline components to shardformer (#4187)
1 year ago
Jianghai c5ea728016 [pipeline] add bert_for_pretraining bert_lmhead forward and policy (#4172)
1 year ago
ver217 d35bd7d0e6 [shardformer] fix type hint
1 year ago
ver217 1ed3f8a24f [shardformer] rename policy file name
1 year ago
ver217 5fc60a3a04 [test] add shard util tests
1 year ago
ver217 2d6cc07feb [test] update shardformer tests
1 year ago
ver217 b0b8ad2823 [pipeline] update shardformer docstring
1 year ago
ver217 59f6f573f1 [pipeline] update shardformer policy
1 year ago
Jianghai 90a65ea682 [pipeline] build bloom model and policy , revise the base class of policy (#4161)
1 year ago
Jianghai c552cefa93 [pipeline]add pipeline policy and bert forward (#4130)
1 year ago
Hongxin Liu 5c897ddb94 [pipeline] add stage manager (#4093)
1 year ago
Jianghai e8e7e49243 [pipeline]add pipeline policy and bert forward (#4130)
1 year ago
Hongxin Liu f51ce1bc8e [pipeline] refactor 1f1b schedule (#4115)
1 year ago
Hongxin Liu 45fdc9b42c [pipeline] implement p2p communication (#4100)
1 year ago
Hongxin Liu 422544222f [pipeline] add stage manager (#4093)
1 year ago
Hongxin Liu 5e1a9d48dd [cluster] add process group mesh (#4039)
1 year ago
Tian Siyuan ff836790ae [doc] fix a typo in examples/tutorial/auto_parallel/README.md (#4430)
1 year ago
Wenhao Chen 6d41c3f2aa [doc] update Coati README (#4405)
1 year ago
LuGY d86ddd9b29 [hotfix] fix unsafe async comm in zero (#4404)
1 year ago
Baizhou Zhang 6ccecc0c69 [gemini] fix tensor storage cleaning in state dict collection (#4396)
1 year ago
flybird1111 458ae331ad [kernel] updated unittests for coloattention (#4389)
1 year ago
binmakeswell 089c365fa0 [doc] add Series A Funding and NeurIPS news (#4377)
1 year ago
flybird1111 f40b718959 [doc] Fix gradient accumulation doc. (#4349)
1 year ago
flybird1111 38b792aab2 [coloattention] fix import error (#4380)
1 year ago
flybird1111 25c57b9fb4 [fix] coloattention support flash attention 2 (#4347)
1 year ago
Wenhao Chen da4f7b855f [chat] fix bugs and add unit tests (#4213)
1 year ago
Hongxin Liu 16bf4c0221 [test] remove useless tests (#4359)
1 year ago
caption 16c0acc01b [hotfix] update gradio 3.11 to 3.34.0 (#4329)
1 year ago
Hongxin Liu 806477121d [release] update version (#4332)
1 year ago
Wenhao Chen 75c5389037 [chat] fix compute_approx_kl (#4338)
1 year ago
LuGY 03654c0ce2 fix localhost measurement (#4320)
1 year ago
LuGY 45b08f08cb [zero] optimize the optimizer step time (#4221)
1 year ago
LuGY 1a49a5ea00 [zero] support shard optimizer state dict of zero (#4194)
1 year ago
LuGY dd7cc58299 [zero] add state dict for low level zero (#4179)
1 year ago
LuGY c668801d36 [zero] allow passing process group to zero12 (#4153)
1 year ago