Commit Graph

7 Commits (f1e1836218a4ecc6e8acf85c7031492cc4e6b590)

Author SHA1 Message Date
ver217 8dced41ad0
[zero] zero optim state_dict takes only_rank_0 (#1384)
* zero optim state_dict takes only_rank_0

* fix unit test
2022-07-29 13:22:50 +08:00
ver217 828b9e5e0d
[hotfix] fix zero optim save/load state dict (#1381) 2022-07-28 17:19:39 +08:00
HELSON 7a8702c06d
[colotensor] add Tensor.view op and its unit test (#1343)
[colotensor] add megatron initialization for gpt2
2022-07-21 10:53:15 +08:00
ver217 0c51ff2c13
[hotfix] ZeroDDP use new process group (#1333)
* process group supports getting ranks in group

* chunk mgr receives a process group

* update unit test

* fix unit tests
2022-07-18 14:14:52 +08:00
Jiarui Fang 060b917daf
[refactor] remove gpc dependency in colotensor's _ops (#1189) 2022-07-04 18:54:37 +08:00
Jiarui Fang 372f791444
[refactor] move chunk and chunkmgr to directory gemini (#1182) 2022-06-29 13:31:02 +08:00
ver217 561e90493f
[zero] zero optim supports loading local state dict (#1171)
* zero optim supports loading local state dict

* polish code

* add unit test
2022-06-24 17:25:57 +08:00