4 Commits (ckpt)

Author SHA1 Message Date
Hongxin Liu aa125bcc91 [shardformer] fix modeling of bloom and falcon (#5796) 5 months ago
Haze188 22ce873c3f [Shardformer] Add parallel output for shardformer models(bloom, falcon) (#5702) 6 months ago
Wang Binluo 0d0a582033 [shardformer] update transformers (#5583) 7 months ago
Wenhao Chen 7172459e74 [shardformer]: support gpt-j, falcon, Mistral and add interleaved pipeline for bert (#5088) 1 year ago
flybird11111 576a2f7b10 [gemini] gemini support tensor parallelism. (#4942) 1 year ago
Hongxin Liu 079bf3cb26 [misc] update pre-commit and run all files (#4752) 1 year ago
flybird11111 0ecd71e041 [shardformer] bloom support sequence parallel (#4465) 1 year ago
Hongxin Liu 172f7fa3cf [misc] resolve code factor issues (#4433) 1 year ago
flybird1111 906426cb44 [Shardformer] Merge flash attention branch to pipeline branch (#4362) 1 year ago
Baizhou Zhang da3cef27ad [pipeline] fix return_dict/fix pure_pipeline_test (#4331) 1 year ago
Jianghai 18ebcf406a [pipeline] reformat for unified design (#4283) 1 year ago
Frank Lee 89f45eda5a [shardformer] added development protocol for standardization (#4149) 1 year ago