Frank Lee | 70c58cfd4f | [shardformer] supported fused qkv checkpoint (#4073) | 2023-07-04 16:05:01 +08:00 |
Frank Lee | 8eb09a4c69 | [shardformer] support module saving and loading (#4062); polish code | 2023-07-04 16:05:01 +08:00 |
Frank Lee | f22ddacef0 | [shardformer] refactored the shardformer layer structure (#4053) | 2023-07-04 16:05:01 +08:00 |
Frank Lee | 3893fa1a8d | [shardformer] refactored embedding and dropout to parallel module (#4013); polish code | 2023-07-04 16:05:01 +08:00 |
Frank Lee | 015af592f8 | [shardformer] integrated linear 1D with dtensor (#3996); polish code | 2023-07-04 16:05:01 +08:00 |