24 Commits (0442f940f021d024ca390485f0cdf0856fe6cb36)

| Author | SHA1 | Message | Date |
|---|---|---|---|
| ver217 | 04c9a86af8 | [zero] ZeroDDP supports controlling outputs' dtype (#1399) | 2 years ago |
| HELSON | 4e98e938ce | [zero] alleviate memory usage in ZeRODDP state_dict (#1398) | 2 years ago |
| ver217 | 83328329dd | [hotfix] fix zero ddp buffer cast (#1376) | 2 years ago |
| ver217 | 5d5031e946 | fix zero ddp state dict (#1378) | 2 years ago |
| HELSON | 87775a0682 | [colotensor] use cpu memory to store state_dict (#1367) | 2 years ago |
| ver217 | d068af81a3 | [doc] update rst and docstring (#1351) | 2 years ago |
| ver217 | 0c51ff2c13 | [hotfix] ZeroDDP use new process group (#1333) | 2 years ago |
| Jiarui Fang | b5f25eb32a | [Tensor] add cpu group to ddp (#1200) | 2 years ago |
| Jiarui Fang | 060b917daf | [refactor] remove gpc dependency in colotensor's _ops (#1189) | 2 years ago |
| Jiarui Fang | 372f791444 | [refactor] move chunk and chunkmgr to directory gemini (#1182) | 2 years ago |
| ver217 | 6b2f2ab9bb | [ddp] ColoDDP uses bucket all-reduce (#1177) | 2 years ago |
| ver217 | 54aabb8da4 | [gemini] refactor gemini mgr (#1151) | 2 years ago |
| ver217 | 8106d7b8c7 | [ddp] refactor ColoDDP and ZeroDDP (#1146) | 2 years ago |
| Frank Lee | 15aab1476e | [zero] avoid zero hook spam by changing log to debug level (#1137) | 2 years ago |
| ver217 | d26902645e | [ddp] add save/load state dict for ColoDDP (#1127) | 2 years ago |
| ver217 | f0a954f16d | [ddp] add set_params_to_ignore for ColoDDP (#1122) | 2 years ago |
| ver217 | e127b4375b | cast colo ddp v2 inputs/outputs (#1120) | 2 years ago |
| ver217 | 7d14b473f0 | [gemini] gemini mgr supports "cpu" placement policy (#1118) | 2 years ago |
| ver217 | 895c1c5ee7 | [tensor] refactor param op hook (#1097) | 3 years ago |
| Frank Lee | cb18922c47 | [doc] added documentation to chunk and chunk manager (#1094) | 3 years ago |
| ver217 | 1f894e033f | [gemini] zero supports gemini (#1093) | 3 years ago |
| ver217 | be01db37c8 | [tensor] refactor chunk mgr and impl MemStatsCollectorV2 (#1077) | 3 years ago |
| Ziyue Jiang | 4fc748f69b | [Tensor] fix optimizer for CPU parallel (#1069) | 3 years ago |
| Jiarui Fang | 49832b2344 | [refactory] add nn.parallel module (#1068) | 3 years ago |