LuGY
|
02b187c14f
|
[zero] add sampling time for memstats collector (#610)
|
2022-04-01 14:03:00 +08:00 |
Jiarui Fang
|
e956d93ac2
|
[refactor] memory utils (#577)
|
2022-04-01 09:22:33 +08:00 |
HELSON
|
e6d50ec107
|
[zero] adapt zero for unsharded parameters (#561)
* support existing sharded and unsharded parameters in zero
* add unit test for moe-zero model init
* polish moe gradient handler
|
2022-03-31 18:34:11 +08:00 |
Jiarui Fang
|
7675366fce
|
[polish] rename col_attr -> colo_attr (#558)
|
2022-03-31 12:25:45 +08:00 |
Liang Bowen
|
2c45efc398
|
html refactor (#555)
|
2022-03-31 11:36:56 +08:00 |
Jiarui Fang
|
107b99ddb1
|
[zero] dump memory stats for sharded model (#548)
|
2022-03-30 09:38:44 +08:00 |
Jiarui Fang
|
53b1b6e340
|
[zero] non model data tracing (#545)
|
2022-03-29 15:45:48 +08:00 |
Jie Zhu
|
73d36618a6
|
[profiler] add MemProfiler (#356)
* add memory trainer hook
* fix bug
* add memory trainer hook
* fix import bug
* fix import bug
* add trainer hook
* fix #370 git log bug
* modify `to_tensorboard` function to support better output
* remove useless output
* change the name of `MemProfiler`
* complete memory profiler
* replace error with warning
* finish trainer hook
* modify interface of MemProfiler
* modify `__init__.py` in profiler
* remove unnecessary pass statement
* add usage to doc string
* add usage to trainer hook
* new location to store temp data file
|
2022-03-29 12:48:34 +08:00 |
Jiarui Fang
|
c11ff81b15
|
[zero] get memory usage of sharded optim v2. (#542)
|
2022-03-29 09:08:18 +08:00 |
Jiarui Fang
|
705f56107c
|
[zero] refactor model data tracing (#537)
|
2022-03-28 16:38:18 +08:00 |
Jiarui Fang
|
8d8c5407c0
|
[zero] refactor model data tracing (#522)
|
2022-03-25 18:03:32 +08:00 |
Jiarui Fang
|
0bebda6ea5
|
[zero] fix init device bug in zero init context unittest (#516)
|
2022-03-25 12:24:18 +08:00 |
Jiarui Fang
|
7ef3507ace
|
[zero] show model data cuda memory usage after zero context init. (#515)
|
2022-03-25 11:23:35 +08:00 |
Jiarui Fang
|
9330be0f3c
|
[memory] set cuda mem frac (#506)
|
2022-03-24 16:57:13 +08:00 |
Jiarui Fang
|
0035b7be07
|
[memory] add model data tensor moving api (#503)
|
2022-03-24 14:29:41 +08:00 |
Jiarui Fang
|
a445e118cf
|
[polish] polish singleton and global context (#500)
|
2022-03-23 18:03:39 +08:00 |
Frank Lee
|
b03b3ae99c
|
fixed mem monitor device (#433)
|
2022-03-16 15:25:02 +08:00 |
Jiarui Fang
|
56bb412e72
|
[polish] use GLOBAL_MODEL_DATA_TRACER (#417)
|
2022-03-15 11:29:46 +08:00 |
Jiarui Fang
|
21dc54e019
|
[zero] memtracer to record cuda memory usage of model data and overall system (#395)
|
2022-03-14 22:05:30 +08:00 |
Jiarui Fang
|
ea2872073f
|
[zero] global model data memory tracer (#360)
|
2022-03-11 15:50:28 +08:00 |
Jiarui Fang
|
10e2826426
|
move async memory to an individual directory (#345)
|
2022-03-11 15:50:28 +08:00 |