Frank Lee
14e5b11d7f
[zero] fixed api consistency ( #1098 )
2022-06-10 16:59:59 +08:00
ver217
1f894e033f
[gemini] zero supports gemini ( #1093 )
...
* add placement policy
* add gemini mgr
* update mem stats collector
* update zero
* update zero optim
* fix bugs
* zero optim monitor os
* polish unit test
* polish unit test
* add assert
2022-06-10 14:48:28 +08:00
ver217
be01db37c8
[tensor] refactor chunk mgr and impl MemStatsCollectorV2 ( #1077 )
...
* polish chunk manager
* polish unit test
* impl add_extern_static_tensor for chunk mgr
* add mem stats collector v2
* polish code
* polish unit test
* polish code
* polish get chunks
2022-06-09 20:56:34 +08:00
ver217
c4d903e64a
[gemini] accelerate adjust_layout() ( #878 )
...
* add lru cache
* polish code
* update unit test
* fix sharded optim
2022-04-26 18:08:31 +08:00
HELSON
425b4a96b8
[gemini] polish stateful_tensor_mgr ( #876 )
2022-04-26 15:05:03 +08:00
HELSON
3107817172
[gemini] add stateful tensor container ( #867 )
2022-04-25 14:58:16 +08:00
HELSON
f0e654558f
[gemini] polish code ( #855 )
2022-04-25 10:40:14 +08:00
ver217
d7e0303d1e
[zero] use GeminiMemoryManager when sampling model data ( #850 )
2022-04-24 17:17:22 +08:00
ver217
0dea140760
[hotfix] add deconstructor for stateful tensor ( #848 )
...
* add deconstructor for stateful tensor
* fix colo init context
2022-04-24 15:03:04 +08:00
HELSON
e5ea3fdeef
[gemini] add GeminiMemoryManager ( #832 )
...
* refactor StatefulTensor, tensor utilities
* add unit test for GeminiMemoryManager
2022-04-24 13:08:48 +08:00
Jiarui Fang
0ce8924ceb
[tensor] reorganize files ( #820 )
2022-04-21 14:15:48 +08:00
Jiarui Fang
ab962b9735
[gemini] a new tensor structure ( #818 )
...
* Revert "[zero] add ZeroTensorShardStrategy (#793 )"
This reverts commit 88759e289e.
* [gemini] set cpu memory capacity
* [log] local throughput collecting
* polish
* polish
* polish
* polish code
* polish
* polish code
* add a new tensor structure and override linear for it
* polish
* polish
* polish
* polish
* polish
* polish
* polish
* polish
* polish
* polish
* polish
2022-04-21 11:42:37 +08:00
Jiarui Fang
3ddbd1bce1
[gemini] collect cpu-gpu moving volume in each iteration ( #813 )
2022-04-20 11:29:48 +08:00
Jiarui Fang
681addb512
[refactor] moving grad acc logic to engine ( #804 )
2022-04-19 14:03:21 +08:00
Jiarui Fang
4d9332b4c5
[refactor] moving memtracer to gemini ( #801 )
2022-04-19 10:13:08 +08:00
ver217
846406a07a
[gemini] fix auto tensor placement policy ( #775 )
2022-04-16 21:29:31 +08:00
Jiarui Fang
10ef8afdd2
[gemini] init gemini individual directory ( #754 )
2022-04-14 16:40:26 +08:00