Commit Graph

6 Commits (6b30dfb7ce002be3acc0668d3fa44c4d4ebb4108)

Author SHA1 Message Date
wukong1992 6b30dfb7ce [shardformer] support llama model using shardformer (#3969)
adjust layer attr
2023-07-04 16:05:01 +08:00
FoolPlayer a73130482d [shardformer] Unit test (#3928)
* fix bug in slicer, add slicer unit test

* add dropout test

* use pid as dropout seed

* updata dropout test with local pattern

* ad todo
2023-07-04 16:05:01 +08:00
FoolPlayer f1cb5ac6bf [shardformer] Align bert value (#3907)
* add bert align test, fix dist loss bug

* forward and backward align

* add ignore index

* add shardformer CI

* add gather_output optional for user in shardconfig

* update readme with optional gather_ouput

* add dist crossentropy loss test, remove unused files

* remove unused file

* remove unused file

* rename the file

* polish code
2023-07-04 16:05:01 +08:00
FoolPlayer ab8a47f830 [shardformer] add Dropout layer support different dropout pattern (#3856)
* add dropout layer, add dropout test

* modify seed manager as context manager

* add a copy of col_nn.layer

* add dist_crossentropy loss; separate module test

* polish the code

* fix dist crossentropy loss
2023-07-04 16:05:01 +08:00
Frank Lee ddcf58cacf
Revert "[sync] sync feature/shardformer with develop" 2023-06-09 09:41:27 +08:00
FoolPlayer 21a3915c98 [shardformer] add Dropout layer support different dropout pattern (#3856)
* add dropout layer, add dropout test

* modify seed manager as context manager

* add a copy of col_nn.layer

* add dist_crossentropy loss; separate module test

* polish the code

* fix dist crossentropy loss
2023-06-08 15:01:34 +08:00