ver217
|
642846d6f9
|
update sharded optim and fix zero init ctx (#457)
|
2022-03-18 15:44:47 +08:00 |
Jiarui Fang
|
e2e9f82588
|
Revert "[zero] update sharded optim and fix zero init ctx" (#456)
* Revert "polish code"
This reverts commit 8cf7ff08cf.
* Revert "rename variables"
This reverts commit e99af94ab8.
* Revert "remove surplus imports"
This reverts commit 46add4a5c5.
* Revert "update sharded optim and fix zero init ctx"
This reverts commit 57567ee768.
|
2022-03-18 15:22:43 +08:00 |
ver217
|
57567ee768
|
update sharded optim and fix zero init ctx
|
2022-03-18 14:25:25 +08:00 |
ver217
|
9506a8beb2
|
use double buffer to handle grad
|
2022-03-16 14:24:09 +08:00 |
Jiarui Fang
|
56bb412e72
|
[polish] use GLOBAL_MODEL_DATA_TRACER (#417)
|
2022-03-15 11:29:46 +08:00 |
Jiarui Fang
|
21dc54e019
|
[zero] memtracer to record cuda memory usage of model data and overall system (#395)
|
2022-03-14 22:05:30 +08:00 |
Jiarui Fang
|
272ebfb57d
|
[bug] shard param during initializing the ShardedModelV2 (#381)
|
2022-03-11 15:50:28 +08:00 |
Jiarui Fang
|
6b6002962a
|
[zero] zero init context collect numel of model (#375)
|
2022-03-11 15:50:28 +08:00 |
Jiarui Fang
|
44e4891f57
|
[zero] able to place params on cpu after zero init context (#365)
* place params on cpu after zero init context
* polish code
|
2022-03-11 15:50:28 +08:00 |
Jiarui Fang
|
ea2872073f
|
[zero] global model data memory tracer (#360)
|
2022-03-11 15:50:28 +08:00 |
ver217
|
1388671699
|
[zero] Update sharded model v2 using sharded param v2 (#323)
|
2022-03-11 15:50:28 +08:00 |
Jiarui Fang
|
11bddb6e55
|
[zero] update zero context init with the updated test utils (#327)
|
2022-03-11 15:50:28 +08:00 |
Jiarui Fang
|
de0468c7a8
|
[zero] zero init context (#321)
* add zero init context
* add more flags for zero init context
fix bug of repeatedly converting param to ShardedParamV2
* polish code
|
2022-03-11 15:50:28 +08:00 |