Commit Graph

8 Commits (1edc9b5fb3f8a9c9ec5d71a62bb33914a0d5f0c4)

Author SHA1 Message Date
FoolPlayer 879301d0da [shardformer] support Blip2 (#4243)
* support base blip2

* add support for downstream blip2 model

* update readme

* add forward injection

* skip not compatible models test

* fix test for gemini and low_level_zero_pugin
2023-08-15 23:25:14 +08:00
Kun Lin ed34bb1310 Feature/chatglm (#4240)
* [shardformer] added tests

* [shardformer] vit test finish and support

* [shardformer] chatglm ready

* import chatglm

* [shardformer] add test kit in model zoo for chatglm

* [sharformer] add first version of policy of chatglm

* [shardformer] polish chatglm code

* [shardformer] polish code

* [shardformer] support chatglm without layernorm

* [shardformer] chatglm shard without mlp sharding

* [shardformer] delete some file

* [shardformer] ChatGLM support layernorm sharding

* [shardformer] register without auto policy

* [shardformer] pre-commit check files

* [shardformer] fix chatglm configuration with pre-commit
2023-08-15 23:25:14 +08:00
FoolPlayer 9ee4ebea83 [shardformer] support whisper (#4212)
* support whisper

* fix bug in vocabembedding

* support downstream model of whisper

* update readme
2023-08-15 23:25:14 +08:00
FoolPlayer dd2bf02679 [shardformer] support SAM (#4231)
* 1.support sam 2.add fused qkv for nn.Linear

* update utils support set element in list

* overtwrite SamVisionAttention foward to use DropoutForParallelInput

* remove unused code
2023-08-15 23:25:14 +08:00
FoolPlayer b3f5d7a3ba [shardformer] support pipeline base vit model (#4284)
* Feature/vit support (#4182)

* [shardformer] added tests

* [shardformer] vit test finish and support

* fix attention dropout

* support base vit pipeline

* support vit downstream model

* fix vit shard test

* modify hidden states return type

---------

Co-authored-by: Kun Lin <81014421+klhhhhh@users.noreply.github.com>
2023-08-15 23:25:14 +08:00
Frank Lee b1c2901530 [shardformer] supported bloom model (#4098) 2023-07-04 16:05:01 +08:00
Frank Lee 58df720570 [shardformer] adapted T5 and LLaMa test to use kit (#4049)
* [shardformer] adapted T5 and LLaMa test to use kit

* polish code
2023-07-04 16:05:01 +08:00
Frank Lee 6d48eb0560
[test] added transformers models to test model zoo (#3135) 2023-03-15 11:26:10 +08:00