FoolPlayer
|
d3bc530849
|
[shardformer] Refactor shardformer api (#4001)
* fix an error in readme
* simplify code
* refactor shardformer
* add todo
* remove slicer
* resolve code review
|
2023-07-04 16:05:01 +08:00 |
FoolPlayer
|
f7774ec0f3
|
[Shardformer] Downstream bert (#3979)
* add dist dropout in model
* update docstring and bert policy with dropout
* refactor basepolicy and sharded, update bert
* update format
* update gpt2 policy
* update bert policy
* remove unused code
* update readme for new policy usage
* add downstream model of bert
* remove unused code
|
2023-07-04 16:05:01 +08:00 |
FoolPlayer
|
f1cb5ac6bf
|
[shardformer] Align bert value (#3907)
* add bert align test, fix dist loss bug
* forward and backward align
* add ignore index
* add shardformer CI
* add optional gather_output for user in shardconfig
* update readme with optional gather_output
* add dist crossentropy loss test, remove unused files
* remove unused file
* remove unused file
* rename the file
* polish code
|
2023-07-04 16:05:01 +08:00 |
Frank Lee
|
4972e1f40e
|
[shardformer] refactored the user api (#3828)
* [shardformer] refactored the user api
* polish code
|
2023-07-04 16:05:01 +08:00 |
Frank Lee
|
ddcf58cacf
|
Revert "[sync] sync feature/shardformer with develop"
|
2023-06-09 09:41:27 +08:00 |
Frank Lee
|
537a52b7a2
|
[shardformer] refactored the user api (#3828)
* [shardformer] refactored the user api
* polish code
|
2023-06-08 15:01:34 +08:00 |