Hongxin Liu
079bf3cb26
[misc] update pre-commit and run all files ( #4752 )
* [misc] update pre-commit
* [misc] run pre-commit
* [misc] remove useless configuration files
* [misc] ignore cuda for clang-format
2023-09-19 14:20:26 +08:00
Baizhou Zhang
0ceec8f9a9
[pipeline] support fp32 for HybridPlugin/merge shardformer test and pipeline test into one file ( #4354 )
* add naive optimizer for 3DPlugin/refactor gpt2 shardformer test
* merge tests of PP/DP/TP combinations into one test file
* fix bug when sync grad for dp in HybridPlugin
* update supported precisions for 3DPlugin/fix bug when shifting tp_degree
* improve the passing of lazy_init
* modify lazy_init/use sync_shared_params
2023-08-15 23:25:14 +08:00
Hongxin Liu
d921ce8391
[shardformer] support inplace sharding ( #4251 )
* [shardformer] embedding support inplace sharding
* [shardformer] linear support inplace sharding
* [shardformer] layernorm support inplace sharding
* [shardformer] qkv support inplace sharding
* [test] update shardformer layer test
* [shardformer] fix shared param sharding
* [shardformer] fix bert policy
* [shardformer] fix bloom policy
* [shardformer] fix llama policy
* [shardformer] fix opt policy
* [shardformer] fix t5 policy
* [shardformer] fix fused qkv linear
* [shardformer] fix bugs
* force sync
* [test] fix bugs
* [test] fix transformer version
2023-08-15 23:25:14 +08:00
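The commit above describes inplace sharding: each layer keeps its module object and swaps its weight storage for the local tensor-parallel shard instead of rebuilding the layer. A minimal pure-Python sketch of that idea, with a hypothetical `shard_rows` helper and toy `Linear` class not taken from the ColossalAI source (the real implementation operates on torch tensors and sharding specs):

```python
def shard_rows(weight, rank, world_size):
    """Hypothetical helper: return this rank's row-shard of a
    2D weight represented as a list of rows."""
    n = len(weight)
    assert n % world_size == 0, "rows must divide evenly across ranks"
    chunk = n // world_size
    return weight[rank * chunk:(rank + 1) * chunk]


class Linear:
    """Toy linear layer holding its weight as a list of rows."""
    def __init__(self, weight):
        self.weight = weight


def shard_inplace(module, rank, world_size):
    # Inplace sharding: the module instance is preserved and only its
    # weight data is replaced by the local shard, so existing references
    # to the module (e.g. shared parameters) remain valid.
    module.weight = shard_rows(module.weight, rank, world_size)
    return module


layer = Linear([[1, 2], [3, 4], [5, 6], [7, 8]])
shard_inplace(layer, rank=1, world_size=2)
print(layer.weight)  # rank 1 holds the second half of the rows
```

The point of doing this in place, rather than constructing a new sharded module, is that anything already holding a reference to the layer (optimizers, shared-parameter bookkeeping) continues to see the same object after sharding.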
Frank Lee
70c58cfd4f
[shardformer] supported fused qkv checkpoint ( #4073 )
2023-07-04 16:05:01 +08:00
Frank Lee
8eb09a4c69
[shardformer] support module saving and loading ( #4062 )
* [shardformer] support module saving and loading
* polish code
2023-07-04 16:05:01 +08:00
Frank Lee
45d9384346
[shardformer] removed inplace tensor sharding ( #4018 )
2023-07-04 16:05:01 +08:00
Frank Lee
015af592f8
[shardformer] integrated linear 1D with dtensor ( #3996 )
* [shardformer] integrated linear 1D with dtensor
* polish code
2023-07-04 16:05:01 +08:00