Commit Graph

17 Commits (822c3d4d66d2d74cb7c7080abed6a207602dddfd)

Author SHA1 Message Date
Frank Lee 80eba05b0a
[test] refactor tests with spawn (#3452)
2 years ago
ver217 26b7aac0be
[zero] reorganize zero/gemini folder structure (#3424)
2 years ago
HELSON 707b11d4a0
[gemini] update ddp strict mode (#2518)
2 years ago
HELSON f69f9bf223
[zero] add chunk init function for users (#1729)
2 years ago
HELSON b28991dd0a
[feature] A new ZeRO implementation (#1644)
2 years ago
Jiarui Fang c5d39215f6
Revert "[feature] new zero implementation (#1623)" (#1643)
2 years ago
HELSON 5be118f405
[feature] new zero implementation (#1623)
2 years ago
HELSON 4e98e938ce
[zero] alleviate memory usage in ZeRODDP state_dict (#1398)
2 years ago
ver217 7d5d628e07
[DDP] test ddp state dict uses more strict threshold (#1382)
2 years ago
HELSON 87775a0682
[colotensor] use cpu memory to store state_dict (#1367)
2 years ago
ver217 0c51ff2c13
[hotfix] ZeroDDP use new process group (#1333)
2 years ago
Jiarui Fang 3b500984b1
[tensor] fix some unittests (#1234)
2 years ago
Jiarui Fang 060b917daf
[refactor] remove gpc dependency in colotensor's _ops (#1189)
2 years ago
Jiarui Fang 372f791444
[refactor] move chunk and chunkmgr to directory gemini (#1182)
2 years ago
ver217 6b2f2ab9bb
[ddp] ColoDDP uses bucket all-reduce (#1177)
2 years ago
ver217 8106d7b8c7
[ddp] refactor ColoDDP and ZeroDDP (#1146)
2 years ago
ver217 d26902645e
[ddp] add save/load state dict for ColoDDP (#1127)
2 years ago