Hongxin Liu
890774b2fb
[shardformer] support lazy init ( #4202 )
...
* [shardformer] support lazy init
* [shardformer] linear support lazy init
* [shardformer] embedding support lazy init
* [shardformer] norm support lazy init
* [shardformer] fused linear support lazy init
* [test] update shardformer test layer
* [test] shardformer with lazy init fit ddp
* [lazy] hotfix deepcopy of param
* [shardformer] fix bert policy and update test
* [shardformer] fix bloom policy and update test
* [shardformer] fix opt policy and update test
* [shardformer] fix t5 policy and update test
* [shardformer] fix gpt2 policy and update test
* [shardformer] fix llama policy and update test
2023-08-15 23:25:14 +08:00
ver217
1ed3f8a24f
[shardformer] rename policy file name
2023-08-15 23:25:14 +08:00
Frank Lee
1fb0d95df0
[shardformer] made tensor parallelism configurable ( #4144 )
...
* [shardformer] made tensor parallelism configurable
* polish code
2023-07-04 16:05:01 +08:00
Frank Lee
74257cb446
[shardformer] refactored some doc and api ( #4137 )
...
* [shardformer] refactored some doc and api
* polish code
2023-07-04 16:05:01 +08:00
Frank Lee
44a190e6ac
[shardformer] import huggingface implicitly ( #4101 )
2023-07-04 16:05:01 +08:00
Frank Lee
f3b6aaa6b7
[shardformer] supported fused normalization ( #4112 )
2023-07-04 16:05:01 +08:00
Frank Lee
b1c2901530
[shardformer] supported bloom model ( #4098 )
2023-07-04 16:05:01 +08:00
FoolPlayer
0803a61412
[shardformer] add linearconv1d test ( #4067 )
...
* add linearconv1d test
* add linearconv1d test
2023-07-04 16:05:01 +08:00
FoolPlayer
7740c55c55
support kit use for bert/gpt test ( #4055 )
...
* support kit use for bert test
* support kit test for gpt2
2023-07-04 16:05:01 +08:00
Frank Lee
f22ddacef0
[shardformer] refactored the shardformer layer structure ( #4053 )
2023-07-04 16:05:01 +08:00
FoolPlayer
4021b9a8a2
[shardformer] add gpt2 test and layer class refactor ( #4041 )
...
* add gpt2 test and layer class refactor
* add dropout in gpt2 policy
2023-07-04 16:05:01 +08:00
FoolPlayer
45927d5527
[shardformer] Add dropout layer in shard model and refactor policy api ( #3949 )
...
* add dist dropout in model
* update docstring and bert policy with dropout
* refactor basepolicy and sharded, update bert
* update format
* update gpt2 policy
* update bert policy
* remove unused code
* update readme for new policy usage
2023-07-04 16:05:01 +08:00
FoolPlayer
79f8d5d54b
[shardformer] add gpt2 policy and modify shard and slicer to support ( #3883 )
...
* add gpt2 policy and modify shard and slicer to support
* remove unused code
* polish code
2023-07-04 16:05:01 +08:00
Frank Lee
ddcf58cacf
Revert "[sync] sync feature/shardformer with develop"
2023-06-09 09:41:27 +08:00
FoolPlayer
ef1537759c
[shardformer] add gpt2 policy and modify shard and slicer to support ( #3883 )
...
* add gpt2 policy and modify shard and slicer to support
* remove unused code
* polish code
2023-06-08 15:01:34 +08:00