Commit Graph

4 Commits (879301d0dad26a056e0e90d8c9c9d6cc4a662c9a)

Author SHA1 Message Date
Hongxin Liu 5e1a9d48dd [cluster] add process group mesh (#4039)
* [cluster] add process group mesh

* [test] add process group mesh test

* force sync
2023-08-15 23:25:14 +08:00
Hongxin Liu afb239bbf8
[devops] update torch version of CI (#3725)
* [test] fix flop tensor test

* [test] fix autochunk test

* [test] fix lazyinit test

* [devops] update torch version of CI

* [devops] enable testmon

* [devops] fix ci

* [devops] fix ci

* [test] fix checkpoint io test

* [test] fix cluster test

* [test] fix timm test

* [devops] fix ci

* [devops] fix ci

* [devops] fix ci

* [devops] fix ci

* [devops] force sync to test ci

* [test] skip fsdp test
2023-05-15 17:20:56 +08:00
Frank Lee 80eba05b0a
[test] refactor tests with spawn (#3452)
* [test] added spawn decorator

* polish code

* polish code

* polish code

* polish code

* polish code

* polish code
2023-04-06 14:51:35 +08:00
YuliangLiu0306 4d5d8f98a4
[API] implement device mesh manager (#3221)
* [API] implement  device mesh manager

* polish
2023-03-24 13:39:12 +08:00