digger yu
|
0e484e6201
|
[nfc]fix typo colossalai/pipeline tensor nn (#3899)
* fix typo colossalai/autochunk auto_parallel amp
* fix typo colossalai/auto_parallel nn utils etc.
* fix typo colossalai/auto_parallel autochunk fx/passes etc.
* fix typo docs/
* change placememt_policy to placement_policy in docs/ and examples/
* fix typo colossalai/ applications/
* fix typo colossalai/cli fx kernel
* fix typo colossalai/nn
* revert change warmuped
* fix typo colossalai/pipeline tensor nn
|
2 years ago |
HELSON
|
552183bb74
|
[polish] polish ColoTensor and its submodules (#2537)
|
2 years ago |
Jiarui Fang
|
1b491ad7de
|
[doc] update docstring in ProcessGroup (#1468)
|
2 years ago |
Jiarui Fang
|
a1476ea882
|
[NFC] polish doc style for ColoTensor (#1457)
|
2 years ago |
HELSON
|
c7221cb2d4
|
[hotfix] adapt ProcessGroup and Optimizer to ColoTensor (#1388)
|
2 years ago |
ver217
|
828b9e5e0d
|
[hotfix] fix zero optim save/load state dict (#1381)
|
2 years ago |
HELSON
|
f92c100ddd
|
[checkpoint] use gather_tensor in checkpoint and update its unit test (#1339)
|
2 years ago |
ver217
|
0c51ff2c13
|
[hotfix] ZeroDDP use new process group (#1333)
* process group supports getting ranks in group
* chunk mgr receives a process group
* update unit test
* fix unit tests
|
2 years ago |
HELSON
|
d49708ae43
|
[hotfix] fix ddp for unit test test_gpt2 (#1326)
|
2 years ago |
Jiarui Fang
|
9f10524313
|
[Optimizer] polish the init method of ColoOptimizer (#1310)
|
2 years ago |
Jiarui Fang
|
1aad903c15
|
[tensor] redistribute among different process groups (#1247)
* make it faster
* [tensor] rename convert_to_dist -> redistribute
* [tensor] ShardSpec and ReplicaSpec
* [tensor] redistribute among diff pgs
* polish code
|
2 years ago |
Jiarui Fang
|
20da6e48c8
|
[checkpoint] save sharded optimizer states (#1237)
|
2 years ago |
HELSON
|
f071b500b6
|
[polish] polish __repr__ for ColoTensor, DistSpec, ProcessGroup (#1235)
|
2 years ago |
Jiarui Fang
|
a98319f023
|
[tensor] torch function return colotensor (#1229)
|
2 years ago |
HELSON
|
280a81243d
|
[tensor] improve robustness of class 'ProcessGroup' (#1223)
|
2 years ago |
Jiarui Fang
|
15d988f954
|
[tensor] sharded global process group (#1219)
|
2 years ago |
Jiarui Fang
|
ae7d3f4927
|
[refactor] move process group from _DistSpec to ColoTensor. (#1203)
|
2 years ago |
Jiarui Fang
|
b5f25eb32a
|
[Tensor] add cpu group to ddp (#1200)
|
2 years ago |
Jiarui Fang
|
060b917daf
|
[refactor] remove gpc dependency in colotensor's _ops (#1189)
|
2 years ago |
Jiarui Fang
|
c463f8adf9
|
[tensor] remove gpc in tensor tests (#1186)
|
2 years ago |
Jiarui Fang
|
7487215b95
|
[ColoTensor] add independent process group (#1179)
|
2 years ago |