Commit Graph

63 Commits (890774b2fba6b5cb737b00466c89334b73a2be69)

Author SHA1 Message Date
Frank Lee 4972e1f40e [shardformer] refactored the user api (#3828)
* [shardformer] refactored the user api

* polish code
2023-07-04 16:05:01 +08:00
Frank Lee 235792f170 [shardformer] updated readme (#3827) 2023-07-04 16:05:01 +08:00
FoolPlayer 8cc11235c0 [shardformer]: Feature/shardformer, add some docstring and readme (#3816)
* init shardformer code structure

* add implement of sharder (inject and replace)

* add implement of replace layer to colossal layer

* separate different layer policy, add some notion

* implement 1d and 2d slicer, can tell col or row

* fix bug when slicing and inject model

* fix some bug; add inference test example

* add share weight and train example

* add train

* add docstring and readme

* add docstring for other files

* pre-commit
2023-07-04 16:05:01 +08:00
FoolPlayer 8d68de767d [shardformer] init shardformer code structure (#3731)
* init shardformer code structure

* add implement of sharder (inject and replace)

* add implement of replace layer to colossal layer

* separate different layer policy, add some notion

* implement 1d and 2d slicer, can tell col or row

* fix bug when slicing and inject model

* fix some bug; add inference test example
2023-07-04 16:05:01 +08:00
Frank Lee ddcf58cacf
Revert "[sync] sync feature/shardformer with develop" 2023-06-09 09:41:27 +08:00
FoolPlayer ef1537759c [shardformer] add gpt2 policy and modify shard and slicer to support (#3883)
* add gpt2 policy and modify shard and slicer to support

* remove unused code

* polish code
2023-06-08 15:01:34 +08:00
FoolPlayer 6370a935f6 update README (#3909) 2023-06-08 15:01:34 +08:00
FoolPlayer 21a3915c98 [shardformer] add Dropout layer support different dropout pattern (#3856)
* add dropout layer, add dropout test

* modify seed manager as context manager

* add a copy of col_nn.layer

* add dist_crossentropy loss; separate module test

* polish the code

* fix dist crossentropy loss
2023-06-08 15:01:34 +08:00
FoolPlayer 997544c1f9 [shardformer] update readme with modules implement doc (#3834)
* update readme with modules content

* remove img
2023-06-08 15:01:34 +08:00
Frank Lee 537a52b7a2 [shardformer] refactored the user api (#3828)
* [shardformer] refactored the user api

* polish code
2023-06-08 15:01:34 +08:00
Frank Lee bc19024bf9 [shardformer] updated readme (#3827) 2023-06-08 15:01:34 +08:00
FoolPlayer 58f6432416 [shardformer]: Feature/shardformer, add some docstring and readme (#3816)
* init shardformer code structure

* add implement of sharder (inject and replace)

* add implement of replace layer to colossal layer

* separate different layer policy, add some notion

* implement 1d and 2d slicer, can tell col or row

* fix bug when slicing and inject model

* fix some bug; add inference test example

* add share weight and train example

* add train

* add docstring and readme

* add docstring for other files

* pre-commit
2023-06-08 15:01:34 +08:00
FoolPlayer 6a69b44dfc [shardformer] init shardformer code structure (#3731)
* init shardformer code structure

* add implement of sharder (inject and replace)

* add implement of replace layer to colossal layer

* separate different layer policy, add some notion

* implement 1d and 2d slicer, can tell col or row

* fix bug when slicing and inject model

* fix some bug; add inference test example
2023-06-08 15:01:34 +08:00