FoolPlayer
4021b9a8a2
[shardformer] add gpt2 test and layer class refactor ( #4041 )
...
* add gpt2 test and layer class refactor
* add dropout in gpt2 policy
2023-07-04 16:05:01 +08:00
Frank Lee
d857f3dbba
[shardformer] supported T5 and its variants ( #4045 )
2023-07-04 16:05:01 +08:00
FoolPlayer
507c0ad368
add vocabembedding layer
2023-07-04 16:05:01 +08:00
Frank Lee
45d9384346
[shardformer] removed inplace tensor sharding ( #4018 )
2023-07-04 16:05:01 +08:00
Frank Lee
3893fa1a8d
[shardformer] refactored embedding and dropout to parallel module ( #4013 )
...
* [shardformer] refactored embedding and dropout to parallel module
* polish code
2023-07-04 16:05:01 +08:00
Frank Lee
015af592f8
[shardformer] integrated linear 1D with dtensor ( #3996 )
...
* [shardformer] integrated linear 1D with dtensor
* polish code
2023-07-04 16:05:01 +08:00
wukong1992
c1c672d0f0
[shardformer] shardformer support t5 model ( #3994 )
...
test t5
2023-07-04 16:05:01 +08:00
wukong1992
6b30dfb7ce
[shardformer] support llama model using shardformer ( #3969 )
...
adjust layer attr
2023-07-04 16:05:01 +08:00
FoolPlayer
a73130482d
[shardformer] Unit test ( #3928 )
...
* fix bug in slicer, add slicer unit test
* add dropout test
* use pid as dropout seed
* updata dropout test with local pattern
* ad todo
2023-07-04 16:05:01 +08:00
FoolPlayer
f1cb5ac6bf
[shardformer] Align bert value ( #3907 )
...
* add bert align test, fix dist loss bug
* forward and backward align
* add ignore index
* add shardformer CI
* add gather_output optional for user in shardconfig
* update readme with optional gather_ouput
* add dist crossentropy loss test, remove unused files
* remove unused file
* remove unused file
* rename the file
* polish code
2023-07-04 16:05:01 +08:00
FoolPlayer
ab8a47f830
[shardformer] add Dropout layer support different dropout pattern ( #3856 )
...
* add dropout layer, add dropout test
* modify seed manager as context manager
* add a copy of col_nn.layer
* add dist_crossentropy loss; separate module test
* polish the code
* fix dist crossentropy loss
2023-07-04 16:05:01 +08:00
Frank Lee
ddcf58cacf
Revert "[sync] sync feature/shardformer with develop"
2023-06-09 09:41:27 +08:00
FoolPlayer
21a3915c98
[shardformer] add Dropout layer support different dropout pattern ( #3856 )
...
* add dropout layer, add dropout test
* modify seed manager as context manager
* add a copy of col_nn.layer
* add dist_crossentropy loss; separate module test
* polish the code
* fix dist crossentropy loss
2023-06-08 15:01:34 +08:00