Jiarui Fang
|
8d8c5407c0
|
[zero] refactor model data tracing (#522)
|
2022-03-25 18:03:32 +08:00 |
Frank Lee
|
3601b2bad0
|
[test] fixed rerun_on_exception and adapted test cases (#487)
|
2022-03-25 17:25:12 +08:00 |
Jiarui Fang
|
0bebda6ea5
|
[zero] fix init device bug in zero init context unittest (#516)
|
2022-03-25 12:24:18 +08:00 |
Jiarui Fang
|
b334822163
|
[zero] polish sharded param name (#484)
* [zero] polish sharded param name
* polish code
* polish
* polish code
* polish
* polish
* polish
|
2022-03-22 14:36:16 +08:00 |
ver217
|
a241f61b34
|
[zero] Update initialize for ZeRO (#458)
* polish code
* shard strategy receive pg in shard() / gather()
* update zero engine
* polish code
|
2022-03-18 16:18:31 +08:00 |
Frank Lee
|
f27d801a13
|
[test] optimized zero data parallel test (#452)
|
2022-03-18 11:35:54 +08:00 |
Jiarui Fang
|
56bb412e72
|
[polish] use GLOBAL_MODEL_DATA_TRACER (#417)
|
2022-03-15 11:29:46 +08:00 |
Jiarui Fang
|
21dc54e019
|
[zero] memtracer to record cuda memory usage of model data and overall system (#395)
|
2022-03-14 22:05:30 +08:00 |
ver217
|
54fd37f0e0
|
polish unit test
|
2022-03-14 15:06:02 +08:00 |
Jiarui Fang
|
6b6002962a
|
[zero] zero init context collect numel of model (#375)
|
2022-03-11 15:50:28 +08:00 |
Jiarui Fang
|
44e4891f57
|
[zero] able to place params on cpu after zero init context (#365)
* place params on cpu after zero init context
* polish code
|
2022-03-11 15:50:28 +08:00 |
Jiarui Fang
|
ea2872073f
|
[zero] global model data memory tracer (#360)
|
2022-03-11 15:50:28 +08:00 |
ver217
|
1388671699
|
[zero] Update sharded model v2 using sharded param v2 (#323)
|
2022-03-11 15:50:28 +08:00 |
jiaruifang
|
dec24561cf
|
show pytest parameterize
|
2022-03-11 15:50:28 +08:00 |
Jiarui Fang
|
11bddb6e55
|
[zero] update zero context init with the updated test utils (#327)
|
2022-03-11 15:50:28 +08:00 |
Jiarui Fang
|
de0468c7a8
|
[zero] zero init context (#321)
* add zero init context
* add more flags for zero init context
fix bug of repeated converting param to ShardedParamV2
* polish code
|
2022-03-11 15:50:28 +08:00 |