Commit Graph

6 Commits (6b30dfb7ce002be3acc0668d3fa44c4d4ebb4108)

Author SHA1 Message Date
FoolPlayer 45927d5527 [shardformer] Add dropout layer in shard model and refactor policy api (#3949)
* add dist dropout in model

* update docstring and bert policy with dropout

* refactor basepolicy and sharded, update bert

* update format

* update gpt2 policy

* update bert policy

* remove unused code

* update readme for new policy usage
2023-07-04 16:05:01 +08:00
FoolPlayer 8cc11235c0 [shardformer]: Feature/shardformer, add some docstring and readme (#3816)
* init shardformer code structure

* add implement of sharder (inject and replace)

* add implement of replace layer to colossal layer

* separate different layer policy, add some notion

* implement 1d and 2d slicer, can tell col or row

* fix bug when slicing and inject model

* fix some bug; add inference test example

* add share weight and train example

* add train

* add docstring and readme

* add docstring for other files

* pre-commit
2023-07-04 16:05:01 +08:00
FoolPlayer 8d68de767d [shardformer] init shardformer code structure (#3731)
* init shardformer code structure

* add implement of sharder (inject and replace)

* add implement of replace layer to colossal layer

* separate different layer policy, add some notion

* implement 1d and 2d slicer, can tell col or row

* fix bug when slicing and inject model

* fix some bug; add inference test example
2023-07-04 16:05:01 +08:00
Frank Lee ddcf58cacf
Revert "[sync] sync feature/shardformer with develop" 2023-06-09 09:41:27 +08:00
FoolPlayer 58f6432416 [shardformer]: Feature/shardformer, add some docstring and readme (#3816)
* init shardformer code structure

* add implement of sharder (inject and replace)

* add implement of replace layer to colossal layer

* separate different layer policy, add some notion

* implement 1d and 2d slicer, can tell col or row

* fix bug when slicing and inject model

* fix some bug; add inference test example

* add share weight and train example

* add train

* add docstring and readme

* add docstring for other files

* pre-commit
2023-06-08 15:01:34 +08:00
FoolPlayer 6a69b44dfc [shardformer] init shardformer code structure (#3731)
* init shardformer code structure

* add implement of sharder (inject and replace)

* add implement of replace layer to colossal layer

* separate different layer policy, add some notion

* implement 1d and 2d slicer, can tell col or row

* fix bug when slicing and inject model

* fix some bug; add inference test example
2023-06-08 15:01:34 +08:00