Commit Graph

3092 Commits (633e95b301336c4c237537f584882b3d8e5f4145)

Author SHA1 Message Date
アマデウス 6302069c0e
[model checkpoint] updated communication ops for cpu tensors (#590)
3 years ago
アマデウス c50bfb807b
[model checkpoint] updated saving/loading for 1d layers (#594)
3 years ago
アマデウス 7636d518e1
[model checkpoint] updated saving/loading for 2d layers (#595)
3 years ago
アマデウス cd13b63832
[model checkpoint] reworked unified layers for ease of save/load states (#593)
3 years ago
アマデウス acae68eb04
[model checkpoint] updated checkpoint save/load utils (#592)
3 years ago
Ziyue Jiang 1c40ee8749
[TP] add assert for tp1d (#621)
3 years ago
ver217 369a288bf3
polish utils docstring (#620)
3 years ago
ver217 e619a651fb
polish optimizer docstring (#619)
3 years ago
ver217 8432dc7080
polish moe docsrting (#618)
3 years ago
ver217 c5b488edf8
polish amp docstring (#616)
3 years ago
ver217 f69507dd22
update rst (#615)
3 years ago
FredHuang99 93f14d2a33
[zero] test zero tensor utils (#609)
3 years ago
ver217 0ef8819c67
polish docstring of zero (#612)
3 years ago
LuGY 02b187c14f
[zero] add sampling time for memstats collector (#610)
3 years ago
ver217 9bee119104
[hotfix] fix sharded optim zero grad (#604)
3 years ago
アマデウス 297b8baae2
[model checkpoint] add gloo groups for cpu tensor communication (#589)
3 years ago
アマデウス 54e688b623
moved ensure_path_exists to utils.common (#591)
3 years ago
Jiarui Fang e956d93ac2
[refactor] memory utils (#577)
3 years ago
ver217 104cbbb313
[hotfix] add hybrid adam to __init__ (#584)
3 years ago
HELSON e6d50ec107
[zero] adapt zero for unsharded parameters (#561)
3 years ago
LuGY 13ed4b6441
[model zoo] add activation offload for gpt model (#582)
3 years ago
Wesley 46c9ba33da
update code format
3 years ago
Wesley 666cfd094a
fix parallel_input flag for Linear1D_Col gather_output
3 years ago
BoxiangW a9f778f1b1
[tool] create .clang-format for pre-commit (#578)
3 years ago
ver217 7c6c427db1
[zero] trace states of fp16/32 grad and fp32 param (#571)
3 years ago
Jiarui Fang 7675366fce
[polish] rename col_attr -> colo_attr (#558)
3 years ago
Liang Bowen 2c45efc398
html refactor (#555)
3 years ago
Jiarui Fang d1211148a7
[utils] update colo tensor moving APIs (#553)
3 years ago
LuGY c44d797072
[docs] updatad docs of hybrid adam and cpu adam (#552)
3 years ago
ver217 014bac0c49
[zero] hijack p.grad in sharded model (#554)
3 years ago
Jiarui Fang f552b11294
[zero] label state for param fp16 and grad (#551)
3 years ago
github-actions[bot] 92f4224867
Automated submodule synchronization (#501)
3 years ago
Jiarui Fang 214da761d4
[zero] add stateful tensor (#549)
3 years ago
Jiarui Fang 107b99ddb1
[zero] dump memory stats for sharded model (#548)
3 years ago
Ziyue Jiang 763dc325f1
[TP] Add gather_out arg to Linear (#541)
3 years ago
HELSON 8c90d4df54
[zero] add zero context manager to change config during initialization (#546)
3 years ago
Liang Bowen ec5086c49c
Refactored docstring to google style
3 years ago
Jiarui Fang 53b1b6e340
[zero] non model data tracing (#545)
3 years ago
Jie Zhu 73d36618a6
[profiler] add MemProfiler (#356)
3 years ago
ver217 fb841dd5c5
[zero] optimize grad offload (#539)
3 years ago
Jiarui Fang 7d81b5b46e
[logging] polish logger format (#543)
3 years ago
ver217 1f90a3b129
[zero] polish ZeroInitContext (#540)
3 years ago
Jiarui Fang c11ff81b15
[zero] get memory usage of sharded optim v2. (#542)
3 years ago
HELSON a30e2b4c24
[zero] adapt for no-leaf module in zero (#535)
3 years ago
Jiarui Fang 705f56107c
[zero] refactor model data tracing (#537)
3 years ago
Jiarui Fang a590ed0ba3
[zero] improve the accuracy of get_memory_usage of sharded param (#538)
3 years ago
Jiarui Fang 37cb70feec
[zero] get memory usage for sharded param (#536)
3 years ago
ver217 56ad945797
update version (#533)
3 years ago
ver217 ffca99d187
[doc] update apidoc (#530)
3 years ago
Jiarui Fang 05e33b2578
[zero] fix grad offload (#528)
3 years ago