10 Commits (d4c5ef441e4935e1b67d70936adbab1f6febe4ef)

Author SHA1 Message Date
Wenhao Chen 7172459e74 [shardformer]: support gpt-j, falcon, Mistral and add interleaved pipeline for bert (#5088)
* [shardformer] implement policy for all GPT-J models and test

* [shardformer] support interleaved pipeline parallel for bert finetune

* [shardformer] shardformer support falcon (#4883)

* [shardformer]: fix interleaved pipeline for bert model (#5048)

* [hotfix]: disable seq parallel for gptj and falcon, and polish code (#5093)

* Add Mistral support for Shardformer (#5103)

* [shardformer] add tests to mistral (#5105)

---------

Co-authored-by: Pengtai Xu <henryxu880@gmail.com>
Co-authored-by: ppt0011 <143150326+ppt0011@users.noreply.github.com>
Co-authored-by: flybird11111 <1829166702@qq.com>
Co-authored-by: eric8607242 <e0928021388@gmail.com>
2023-11-28 16:54:42 +08:00
Jianghai 5545114fd8 rename chatglm to chatglm2 (#4484) 2023-08-22 14:13:31 +08:00
FoolPlayer 879301d0da [shardformer] support Blip2 (#4243)
* support base blip2

* add support for downstream blip2 model

* update readme

* add forward injection

* skip not compatible models test

* fix test for gemini and low_level_zero_plugin
2023-08-15 23:25:14 +08:00
Kun Lin ed34bb1310 Feature/chatglm (#4240)
* [shardformer] added tests

* [shardformer] vit test finish and support

* [shardformer] chatglm ready

* import chatglm

* [shardformer] add test kit in model zoo for chatglm

* [shardformer] add first version of policy of chatglm

* [shardformer] polish chatglm code

* [shardformer] polish code

* [shardformer] support chatglm without layernorm

* [shardformer] chatglm shard without mlp sharding

* [shardformer] delete some file

* [shardformer] ChatGLM support layernorm sharding

* [shardformer] register without auto policy

* [shardformer] pre-commit check files

* [shardformer] fix chatglm configuration with pre-commit
2023-08-15 23:25:14 +08:00
FoolPlayer 9ee4ebea83 [shardformer] support whisper (#4212)
* support whisper

* fix bug in vocabembedding

* support downstream model of whisper

* update readme
2023-08-15 23:25:14 +08:00
FoolPlayer dd2bf02679 [shardformer] support SAM (#4231)
* 1. support sam 2. add fused qkv for nn.Linear

* update utils support set element in list

* overwrite SamVisionAttention forward to use DropoutForParallelInput

* remove unused code
2023-08-15 23:25:14 +08:00
FoolPlayer b3f5d7a3ba [shardformer] support pipeline base vit model (#4284)
* Feature/vit support (#4182)

* [shardformer] added tests

* [shardformer] vit test finish and support

* fix attention dropout

* support base vit pipeline

* support vit downstream model

* fix vit shard test

* modify hidden states return type

---------

Co-authored-by: Kun Lin <81014421+klhhhhh@users.noreply.github.com>
2023-08-15 23:25:14 +08:00
Frank Lee b1c2901530 [shardformer] supported bloom model (#4098) 2023-07-04 16:05:01 +08:00
Frank Lee 58df720570 [shardformer] adapted T5 and LLaMa test to use kit (#4049)
* [shardformer] adapted T5 and LLaMa test to use kit

* polish code
2023-07-04 16:05:01 +08:00
Frank Lee 6d48eb0560 [test] added transformers models to test model zoo (#3135) 2023-03-15 11:26:10 +08:00