27 Commits (6f7d1362c901748ca9f005dc96388605aa195af9)

Author SHA1 Message Date
Jiarui Fang 193dc8dacb
[refactor] refactor the memory utils (#715)
3 years ago
HELSON d7ecaf362b
[zero] fix init bugs in zero context (#686)
3 years ago
Jiarui Fang 53b1b6e340
[zero] non model data tracing (#545)
3 years ago
ver217 1f90a3b129
[zero] polish ZeroInitContext (#540)
3 years ago
Jiarui Fang c11ff81b15
[zero] get memory usage of sharded optim v2. (#542)
3 years ago
HELSON a30e2b4c24
[zero] adapt for no-leaf module in zero (#535)
3 years ago
Frank Lee 3601b2bad0
[test] fixed rerun_on_exception and adapted test cases (#487)
3 years ago
ver217 9ec1ce6ab1
[zero] sharded model support the reuse of fp16 shard (#495)
3 years ago
ver217 62b0a8d644
[zero] sharded optim support hybrid cpu adam (#486)
3 years ago
Frank Lee af185b5519
[test] fixed amp convergence comparison test (#454)
3 years ago
ver217 642846d6f9
update sharded optim and fix zero init ctx (#457)
3 years ago
Jiarui Fang e2e9f82588
Revert "[zero] update sharded optim and fix zero init ctx" (#456)
3 years ago
ver217 8cf7ff08cf
polish code
3 years ago
ver217 57567ee768
update sharded optim and fix zero init ctx
3 years ago
Frank Lee f27d801a13
[test] optimized zero data parallel test (#452)
3 years ago
Jiarui Fang 0fcfb1e00d
[test] make zero engine test really work (#447)
3 years ago
Jiarui Fang f9c762df85
[test] merge zero optim tests (#428)
3 years ago
Jiarui Fang adebb3e041
[zero] cuda margin space for OS (#418)
3 years ago
Jiarui Fang 23ba3fc450
[zero] refactory ShardedOptimV2 init method (#416)
3 years ago
Jiarui Fang 21dc54e019
[zero] memtracer to record cuda memory usage of model data and overall system (#395)
3 years ago
Jiarui Fang 370f567e7d
[zero] new interface for ShardedOptimv2 (#406)
3 years ago
ver217 54fd37f0e0
polish unit test
3 years ago
Jiarui Fang 3af13a2c3e
[zero] polish ShardedOptimV2 unittest (#385)
3 years ago
ver217 d0ae0f2215
[zero] update sharded optim v2 (#334)
3 years ago
ver217 1388671699
[zero] Update sharded model v2 using sharded param v2 (#323)
3 years ago
ver217 36f9a74ab2
fix sharded param hook and unit test
3 years ago
ver217 001ca624dd
impl shard optim v2 and add unit test
3 years ago