Jiarui Fang
|
556b9b7e1a
|
[hotfix] Dist Mgr gather torch version (#1284)
* make it faster
* [hotfix] torchvision fx tests
* [hotfix] rename duplicated named test_gpt.py
* [hotfix] dist mgr torch version
|
2022-07-13 00:18:56 +08:00 |
Jiarui Fang
|
ae7d3f4927
|
[refactor] move process group from _DistSpec to ColoTensor. (#1203)
|
2022-07-06 16:15:16 +08:00 |
Jiarui Fang
|
b5f25eb32a
|
[Tensor] add cpu group to ddp (#1200)
|
2022-07-05 14:58:28 +08:00 |
Jiarui Fang
|
060b917daf
|
[refactor] remove gpc dependency in colotensor's _ops (#1189)
|
2022-07-04 18:54:37 +08:00 |
Jiarui Fang
|
aa7bef73d4
|
[Tensor] distributed view supports inter-process hybrid parallel (#1169)
|
2022-06-27 09:45:26 +08:00 |
ver217
|
634eecb98e
|
mark sanity_check of dist_spec_mgr as staticmethod (#1161)
|
2022-06-23 11:35:25 +08:00 |
ver217
|
ffa025e120
|
[tensor] dist spec s2s uses all-to-all (#1136)
* dist spec s2s uses all-to-all
* update unit test
* add sanity check
* polish unit test with titans
* add sanity check for DistMgr
* add sanity check
Co-authored-by: jiaruifang <fangjiarui123@gmail.com>
|
2022-06-22 11:32:38 +08:00 |
Jiarui Fang
|
8cdce0399c
|
[ColoTensor] improves init functions. (#1150)
|
2022-06-21 18:28:38 +08:00 |
Jiarui Fang
|
a00644079e
|
reorganize colotensor directory (#1062)
* reorganize colotensor directory
* polish code
|
2022-06-03 18:04:22 +08:00 |
ver217
|
7faef93326
|
fix dist spec mgr (#1045)
|
2022-05-31 12:14:39 +08:00 |
ver217
|
ad536e308e
|
[tensor] refactor colo-tensor (#992)
* refactor colo-tensor and update linear op
* polish code
* polish code
* update ops and unit tests
* update unit tests
* polish code
* rename dist_spec module
* polish code
* polish code
* remove unneeded import
* fix pipelinable
|
2022-05-19 12:44:59 +08:00 |
Jiarui Fang
|
802ac297cc
|
[Tensor] remove useless import in tensor dir (#997)
|
2022-05-18 14:54:51 +08:00 |
Ziyue Jiang
|
797a9dc5a9
|
add DistSpec for loss and test_model (#947)
|
2022-05-13 20:29:50 +08:00 |
ver217
|
67c33f57eb
|
[tensor] design DistSpec and DistSpecManager for ColoTensor (#934)
* add dist spec
* update linear op
* polish code
* polish code
* update embedding op
* polish unit tests
* polish unit tests
* polish comments
* polish code
* add test_dist_spec_mgr
* polish code
* refactor folder structure
* polish unit tests
* add get_process_group() for TensorSpec
* polish code
|
2022-05-13 15:13:52 +08:00 |