19 Commits (f8b9aaef47d5a2b66db87b5d2b093639a66a131f)

Author SHA1 Message Date
Jiarui Fang c92f84fcdb
[tensor] distributed checkpointing for parameters (#1240) 2022-07-12 15:51:06 +08:00
Jiarui Fang 1aad903c15
[tensor] redistribute among different process groups (#1247)
* make it faster

* [tensor] rename convert_to_dist -> redistribute

* [tensor] ShardSpec and ReplicaSpec

* [tensor] redistribute among diff pgs

* polish code
2022-07-12 10:24:05 +08:00
Jiarui Fang 9bcd2fd4af
[tensor] a shorter shard and replicate spec (#1245) 2022-07-11 15:51:48 +08:00
HELSON f6add9b720
[tensor] redirect .data.__get__ to a tensor instance (#1239) 2022-07-11 11:41:29 +08:00
Jiarui Fang 4a76084dc9
[tensor] add zero_like colo op, important for Optimizer (#1236) 2022-07-08 14:55:27 +08:00
Jiarui Fang a98319f023
[tensor] torch function return colotensor (#1229) 2022-07-07 18:09:18 +08:00
Jiarui Fang 15d988f954
[tensor] sharded global process group (#1219) 2022-07-07 13:38:48 +08:00
Jiarui Fang ae7d3f4927
[refactor] move process group from _DistSpec to ColoTensor. (#1203) 2022-07-06 16:15:16 +08:00
Jiarui Fang 060b917daf
[refactor] remove gpc dependency in colotensor's _ops (#1189) 2022-07-04 18:54:37 +08:00
Jiarui Fang 7487215b95
[ColoTensor] add independent process group (#1179) 2022-06-29 10:03:09 +08:00
Jiarui Fang 1b657f9ce1
[tensor] revert local view back (#1178) 2022-06-27 18:38:34 +08:00
Jiarui Fang 0dd4e2bbfb
[Tensor] rename some APIs in TensorSpec and Polish view unittest (#1176) 2022-06-27 15:56:11 +08:00
Jiarui Fang aa7bef73d4
[Tensor] distributed view supports inter-process hybrid parallel (#1169) 2022-06-27 09:45:26 +08:00
Jiarui Fang 4b9bba8116
[ColoTensor] rename APIs and add output_replicate to ComputeSpec (#1168) 2022-06-24 13:08:54 +08:00
Jiarui Fang 8cdce0399c
[ColoTensor] improves init functions. (#1150) 2022-06-21 18:28:38 +08:00
Ziyue Jiang 7c530b9de2
[Tensor] add Parameter inheritance for ColoParameter (#1041)
* add Parameter inheritance for ColoParameter

* remove tricks

* remove tricks

* polish

* polish
2022-05-30 17:23:44 +08:00
ver217 ad536e308e
[tensor] refactor colo-tensor (#992)
* refactor colo-tensor and update linear op

* polish code

* polish code

* update ops and unit tests

* update unit tests

* polish code

* rename dist_spec module

* polish code

* polish code

* remove unneeded import

* fix pipelinable
2022-05-19 12:44:59 +08:00
Jiarui Fang 72cdc06875
[Tensor] make ColoTensor more robust for getattr (#886)
* [Tensor] make ColoTensor more robust for getattr

* polish

* polish
2022-04-27 10:57:49 +08:00
Jiarui Fang 909211453b
[Tensor] Add some attributes to ColoTensor (#877)
* [Tensor] add some function to ColoTensor

* torch.allclose

* rm torch.add
2022-04-26 15:10:47 +08:00