Frank Lee
|
65ee6dcc20
|
[test] ignore 8 gpu test (#1080)
* [test] ignore 8 gpu test
* polish code
* polish workflow
* polish workflow
|
2022-06-08 23:14:18 +08:00 |
HELSON
|
e5ea3fdeef
|
[gemini] add GeminiMemoryManger (#832)
* refactor StatefulTensor, tensor utilities
* add unitest for GeminiMemoryManager
|
2022-04-24 13:08:48 +08:00 |
Jiarui Fang
|
e761ad2cd7
|
Revert "[zero] add ZeroTensorShardStrategy (#793)" (#806)
|
2022-04-19 14:40:02 +08:00 |
HELSON
|
88759e289e
|
[zero] add ZeroTensorShardStrategy (#793)
|
2022-04-19 14:32:45 +08:00 |
Jiarui Fang
|
4d9332b4c5
|
[refactor] moving memtracer to gemini (#801)
|
2022-04-19 10:13:08 +08:00 |
HELSON
|
4c4388c46e
|
[hotfix] fix memory leak in zero (#781)
|
2022-04-18 13:57:03 +08:00 |
Frank Lee
|
5a1a095b92
|
[test] refactored with the new rerun decorator (#763)
* [test] refactored with the new rerun decorator
* polish test case
|
2022-04-15 00:33:04 +08:00 |
Jiarui Fang
|
10ef8afdd2
|
[gemini] init genimi individual directory (#754)
|
2022-04-14 16:40:26 +08:00 |
ver217
|
dcca614eee
|
[hotfix] fix test_stateful_tensor_mgr (#762)
|
2022-04-14 15:50:09 +08:00 |
ver217
|
a93a7d7364
|
[hotfix] fix reuse_fp16_shard of sharded model (#756)
* fix reuse_fp16_shard
* disable test stm
* polish code
|
2022-04-14 14:56:46 +08:00 |
HELSON
|
84c6700b2a
|
[zero] refactor memstats_collector (#746)
|
2022-04-14 12:01:12 +08:00 |
ver217
|
e396bb71f2
|
[zero] add tensor placement policies (#743)
* add tensor placement policies
* polish comments
* polish comments
* update moe unit tests
|
2022-04-13 15:00:48 +08:00 |
HELSON
|
22c4b88d56
|
[zero] refactor ShardedParamV2 for convenience (#742)
|
2022-04-13 14:54:26 +08:00 |
Frank Lee
|
f4f42d4c3c
|
[bug] fixed DDP compatibility with torch 1.8 (#739)
|
2022-04-13 00:08:46 +08:00 |
Jiarui Fang
|
53cb584808
|
[utils] correct cpu memory used and capacity in the context of multi-process (#726)
|
2022-04-12 14:57:54 +08:00 |