Commit Graph

9 Commits (e1620ddac25184aeef816a261ca9581893e34072)

Author SHA1 Message Date
Frank Lee cb18922c47
[doc] added documentation to chunk and chunk manager (#1094)
* [doc] added documentation to chunk and chunk manager

* polish code

* polish code

* polish code
2022-06-10 15:33:06 +08:00
ver217 1f894e033f
[gemini] zero supports gemini (#1093)
* add placement policy

* add gemini mgr

* update mem stats collector

* update zero

* update zero optim

* fix bugs

* zero optim monitor os

* polish unit test

* polish unit test

* add assert
2022-06-10 14:48:28 +08:00
ver217 be01db37c8
[tensor] refactor chunk mgr and impl MemStatsCollectorV2 (#1077)
* polish chunk manager

* polish unit test

* impl add_extern_static_tensor for chunk mgr

* add mem stats collector v2

* polish code

* polish unit test

* polish code

* polish get chunks
2022-06-09 20:56:34 +08:00
ver217 1b17859328
[tensor] chunk manager monitor mem usage (#1076) 2022-06-07 15:00:00 +08:00
ver217 98cdbf49c6
[hotfix] fix chunk comm src rank (#1072) 2022-06-07 11:54:56 +08:00
ver217 c5cd3b0f35
[zero] zero optim copy chunk rather than copy tensor (#1070) 2022-06-07 10:30:46 +08:00
ver217 e1922ea4f6
[zero] add chunk size search for chunk manager (#1052) 2022-06-02 13:20:20 +08:00
ver217 51b9a49655
[zero] add zero optimizer for ColoTensor (#1046)
* add zero optimizer

* torch ok

* unit test ok

* polish code

* fix bugs

* polish unit test

* polish zero optim

* polish colo ddp v2

* refactor folder structure

* add comment

* polish unit test

* polish zero optim

* polish unit test
2022-06-02 12:13:15 +08:00
ver217 9492a561c3
[tensor] ColoTensor supports ZeRo (#1015)
* impl chunk manager

* impl param op hook

* add reduce_chunk

* add zero hook v2

* add zero dp

* fix TensorInfo

* impl load balancing when using zero without chunk

* fix zero hook

* polish chunk

* fix bugs

* ddp ok

* zero ok

* polish code

* fix bugs about load balancing

* polish code

* polish code

* add ene-to-end test

* polish code

* polish code

* polish code

* fix typo

* add test_chunk

* fix bugs

* fix bugs

* polish code
2022-05-31 12:00:12 +08:00