ver217
|
9506a8beb2
|
use double buffer to handle grad
|
2022-03-16 14:24:09 +08:00 |
ver217
|
1388671699
|
[zero] Update sharded model v2 using sharded param v2 (#323)
|
2022-03-11 15:50:28 +08:00 |
Jiarui Fang
|
11bddb6e55
|
[zero] update zero context init with the updated test utils (#327)
|
2022-03-11 15:50:28 +08:00 |
Jiarui Fang
|
de0468c7a8
|
[zero] zero init context (#321)
* add zero init context
* add more flags for zero init context
fix bug of repeatedly converting param to ShardedParamV2
* polish code
|
2022-03-11 15:50:28 +08:00 |
Jiarui Fang
|
90d3aef62c
|
[zero] yet an improved sharded param (#311)
|
2022-03-11 15:50:28 +08:00 |
ver217
|
36f9a74ab2
|
fix sharded param hook and unit test
|
2022-03-11 15:50:28 +08:00 |
Jiarui Fang
|
80364c7686
|
[zero] sharded tensor (#305)
* init shard param from shape tuple
* add more unit tests for shard param
* add set_payload method for ShardedParam
* [zero] add sharded tensor class
* polish code
|
2022-03-11 15:50:28 +08:00 |
Jiarui Fang
|
e17e92c54d
|
Polish sharded parameter (#297)
* init shard param from shape tuple
* add more unit tests for shard param
* add more unit tests to sharded param
|
2022-03-11 15:50:28 +08:00 |