Commit Graph

111 Commits (2d642eea0f92c7f7c1fb7bef3abdfdb0cb61d1bf)

Author · SHA1 · Message · Date
Wang Binluo · 0d0a582033 · [shardformer] update transformers (#5583) · 7 months ago
flybird11111 · a0ad587c24 · [shardformer] refactor embedding resize (#5603) · 7 months ago
Zhongkai Zhao · 8e412a548e · [shardformer] Sequence Parallelism Optimization (#5533) · 8 months ago
Wenhao Chen · e614aa34f3 · [shardformer, pipeline] add `gradient_checkpointing_ratio` and heterogenous shard policy for llama (#5508) · 8 months ago
github-actions[bot] · e6707a6e8d · [format] applied code formatting on changed files in pull request 5510 (#5517) · 8 months ago
Hongxin Liu · 19e1a5cf16 · [shardformer] update colo attention to support custom mask (#5510) · 8 months ago
flybird11111 · 0688d92e2d · [shardformer]Fix lm parallel. (#5480) · 8 months ago
flybird11111 · 5e16bf7980 · [shardformer] fix gathering output when using tensor parallelism (#5431) · 8 months ago
digger yu · 049121d19d · [hotfix] fix typo change enabel to enable under colossalai/shardformer/ (#5317) · 9 months ago
flybird11111 · 29695cf70c · [example]add gpt2 benchmark example script. (#5295) · 9 months ago
flybird11111 · 0a25e16e46 · [shardformer]gather llama logits (#5398) · 9 months ago
flybird11111 · 388179f966 · [tests] fix t5 test. (#5322) · 10 months ago
Frank Lee · 7cfed5f076 · [feat] refactored extension module (#5298) · 10 months ago
ver217 · 148469348a · Merge branch 'main' into sync/npu · 10 months ago
Xuanlei Zhao · dd2c28a323 · [npu] use extension for op builder (#5172) · 11 months ago
flybird11111 · 451e9142b8 · fix flash attn (#5209) · 11 months ago
flybird11111 · 79718fae04 · [shardformer] llama support DistCrossEntropy (#5176) · 12 months ago
Xuanlei Zhao · d6df19bae7 · [npu] support triangle attention for llama (#5130) · 1 year ago
Wenhao Chen · 7172459e74 · [shardformer]: support gpt-j, falcon, Mistral and add interleaved pipeline for bert (#5088) · 1 year ago
Xuanlei Zhao · 68fcaa2225 · remove duplicate import (#5100) · 1 year ago
flybird11111 · aae496631c · [shardformer]fix flash attention, when mask is casual, just don't unpad it (#5084) · 1 year ago
Bin Jia · 4e3959d316 · [hotfix/hybridengine] Fix init model with random parameters in benchmark (#5074) · 1 year ago
flybird11111 · 97cd0cd559 · [shardformer] fix llama error when transformers upgraded. (#5055) · 1 year ago
Elsa Granger · b2ad0d9e8f · [pipeline,shardformer] Fix p2p efficiency in pipeline, allow skipping loading weight not in weight_map when `strict=False`, fix llama flash attention forward, add flop estimation by megatron in llama benchmark (#5017) · 1 year ago
flybird11111 · 576a2f7b10 · [gemini] gemini support tensor parallelism. (#4942) · 1 year ago
Jianghai · ef4c14a5e2 · [Inference] Fix bug in ChatGLM2 Tensor Parallelism (#5014) · 1 year ago
Hongxin Liu · 1f5d2e8062 · [hotfix] fix torch 2.0 compatibility (#4936) · 1 year ago
Xu Kai · d1fcc0fa4d · [infer] fix test bug (#4838) · 1 year ago
Jianghai · ce7ade3882 · [inference] chatglm2 infer demo (#4724) · 1 year ago
Hongxin Liu · 079bf3cb26 · [misc] update pre-commit and run all files (#4752) · 1 year ago
flybird11111 · c7d6975d29 · [shardformer] fix GPT2DoubleHeadsModel (#4703) · 1 year ago
Cuiqing Li · bce0f16702 · [Feature] The first PR to Add TP inference engine, kv-cache manager and related kernels for our inference system (#4577) · 1 year ago
flybird11111 · eedaa3e1ef · [shardformer]fix gpt2 double head (#4663) · 1 year ago
flybird11111 · 7486ed7d3a · [shardformer] update llama2/opt finetune example and fix llama2 policy (#4645) · 1 year ago
Baizhou Zhang · 295b38fecf · [example] update vit example for hybrid parallel plugin (#4641) · 1 year ago
flybird11111 · 3353e55c80 · [shardformer] vit/llama/t5 ignore the sequence parallelism flag and some fix. (#4498) · 1 year ago
flybird11111 · 59e252ecdb · [shardformer] chatglm support sequence parallel (#4482) · 1 year ago
Bin Jia · 351351a36e · [shardformer/sequence parallel] not support opt of seq-parallel, add warning and fix a bug in gpt2 pp (#4488) · 1 year ago
Jianghai · 5545114fd8 · rename chatglm to chatglm2 (#4484) · 1 year ago
Jianghai · 8739aa7fa0 · [shardformer] Pipeline/whisper (#4456) · 1 year ago
flybird11111 · a27e0bb494 · [shardformer] bert support sequence parallel. (#4455) · 1 year ago
flybird11111 · 0ecd71e041 · [shardformer] bloom support sequence parallel (#4465) · 1 year ago
Bin Jia · 7c8be77081 · [shardformer/sequence parallel] support gpt2 seq parallel with pp/dp/tp (#4460) · 1 year ago
Bin Jia · 424629fea0 · [shardformer/sequence parallel] Cherry pick commit to new branch (#4450) · 1 year ago
Hongxin Liu · 172f7fa3cf · [misc] resolve code factor issues (#4433) · 1 year ago
flybird11111 · 1edc9b5fb3 · [shardformer] update tests for all optimization (#4413) · 1 year ago
Baizhou Zhang · 7711bd524a · [shardformer] rewrite tests for opt/bloom/llama/vit/chatglm (#4395) · 1 year ago
flybird1111 · 7a3dfd0c64 · [shardformer] update shardformer to use flash attention 2 (#4392) · 1 year ago
Baizhou Zhang · ed4c448488 · [pipeline] rewrite t5 tests & support multi-tensor transmitting in pipeline (#4388) · 1 year ago
flybird1111 · 906426cb44 · [Shardformer] Merge flash attention branch to pipeline branch (#4362) · 1 year ago
Jianghai · a88e92251d · [pipeline] add chatglm (#4363) · 1 year ago
FoolPlayer · 879301d0da · [shardformer] support Blip2 (#4243) · 1 year ago
FoolPlayer · dd2bf02679 · [shardformer] support SAM (#4231) · 1 year ago
Baizhou Zhang · da3cef27ad · [pipeline] fix return_dict/fix pure_pipeline_test (#4331) · 1 year ago
Hongxin Liu · 261eab02fb · [plugin] add 3d parallel plugin (#4295) · 1 year ago
FoolPlayer · b3f5d7a3ba · [shardformer] support pipeline base vit model (#4284) · 1 year ago
Baizhou Zhang · 083d7da33d · [pipeline] add pipeline support for all T5 models (#4310) · 1 year ago
Baizhou Zhang · 36e546b2cc · [pipeline] add pipeline support for T5Stack/T5EncoderModel (#4300) · 1 year ago
Jianghai · 18ebcf406a · [pipeline] reformat for unified design (#4283) · 1 year ago
Baizhou Zhang · b774d5ea0f · [pipeline] refactor gpt2 pipeline forwards (#4287) · 1 year ago
Frank Lee · 89f45eda5a · [shardformer] added development protocol for standardization (#4149) · 1 year ago