Commit Graph

22 Commits (ec18fc7340f99693f2436e91e1dea99342f476d5)

Author SHA1 Message Date
HELSON 552183bb74
[polish] polish ColoTensor and its submodules (#2537) 2023-02-03 11:44:10 +08:00
HELSON 2458659919
[zero] fix error for BEiT models (#2169)
* [zero] fix error for BEiT models

* [ColoParameter] add unpack operation for tuple arguments

* fix bugs

* fix chunkv2 unit testing

* add assertion for gradient state
2022-12-26 15:03:54 +08:00
Jiarui Fang b3b89865e2
[Gemini] ParamOpHook -> ColoParamOpHook (#2080) 2022-12-05 17:11:06 +08:00
YuliangLiu0306 49216d7ab1
[autoparallel] fix bugs caused by negative dim key (#1808)
* [autoparallel] fix bugs caused by negative dim key

* fix import error

* fix matmul test issue

* fix unit test issue
2022-11-08 17:03:50 +08:00
Jiarui Fang 85f933b58b
[Optimizer] Remove useless ColoOptimizer (#1312) 2022-07-14 16:57:48 +08:00
Jiarui Fang 4a76084dc9
[tensor] add zero_like colo op, important for Optimizer (#1236) 2022-07-08 14:55:27 +08:00
Jiarui Fang ae7d3f4927
[refactor] move process group from _DistSpec to ColoTensor. (#1203) 2022-07-06 16:15:16 +08:00
Jiarui Fang 1b657f9ce1
[tensor] revert local view back (#1178) 2022-06-27 18:38:34 +08:00
Jiarui Fang aa7bef73d4
[Tensor] distributed view supports inter-process hybrid parallel (#1169) 2022-06-27 09:45:26 +08:00
Jiarui Fang 4b9bba8116
[ColoTensor] rename APIs and add output_replicate to ComputeSpec (#1168) 2022-06-24 13:08:54 +08:00
Jiarui Fang f4ef224358
[Tensor] remove ParallelAction, use ComputeSpec instread (#1166) 2022-06-23 17:34:59 +08:00
Jiarui Fang 8cdce0399c
[ColoTensor] improves init functions. (#1150) 2022-06-21 18:28:38 +08:00
ver217 789cad301b
[hotfix] fix param op hook (#1131)
* fix param op hook

* update zero tp test

* fix bugs
2022-06-17 16:12:05 +08:00
ver217 f99f56dff4
fix colo parameter torch function (#1117) 2022-06-15 14:23:27 +08:00
ver217 895c1c5ee7
[tensor] refactor param op hook (#1097)
* refactor param op hook

* add docstr

* fix bug
2022-06-13 16:11:53 +08:00
Jiarui Fang a00644079e
reorgnize colotensor directory (#1062)
* reorgnize colotensor directory

* polish code
2022-06-03 18:04:22 +08:00
ver217 9492a561c3
[tensor] ColoTensor supports ZeRo (#1015)
* impl chunk manager

* impl param op hook

* add reduce_chunk

* add zero hook v2

* add zero dp

* fix TensorInfo

* impl load balancing when using zero without chunk

* fix zero hook

* polish chunk

* fix bugs

* ddp ok

* zero ok

* polish code

* fix bugs about load balancing

* polish code

* polish code

* add ene-to-end test

* polish code

* polish code

* polish code

* fix typo

* add test_chunk

* fix bugs

* fix bugs

* polish code
2022-05-31 12:00:12 +08:00
Ziyue Jiang 7c530b9de2
[Tensor] add Parameter inheritance for ColoParameter (#1041)
* add Parameter inheritance for ColoParameter

* remove tricks

* remove tricks

* polish

* polish
2022-05-30 17:23:44 +08:00
Ziyue Jiang 6c5996a56e
[Tensor] add module check and bert test (#1031)
* add Embedding

* Add bert test

* polish

* add check module test

* polish

* polish

* polish

* polish
2022-05-26 18:15:42 +08:00
ver217 a3b66f6def
[tensor] refactor parallel action (#1007)
* refactor parallel action

* polish unit tests
2022-05-20 20:19:58 +08:00
ver217 ad536e308e
[tensor] refactor colo-tensor (#992)
* refactor colo-tensor and update linear op

* polish code

* polish code

* update ops and unit tests

* update unit tests

* polish code

* rename dist_spec module

* polish code

* polish code

* remove unneeded import

* fix pipelinable
2022-05-19 12:44:59 +08:00
Jiarui Fang ab95ec9aea
[Tensor] init ColoParameter (#914) 2022-05-06 12:57:14 +08:00