CsRic
a389ac4ec9
[embedding] cache_embedding small improvement ( #1564 )
2022-09-08 16:41:19 +08:00
Jiarui Fang
64169f3e8f
[embedding] polish parallel embedding tablewise ( #1545 )
2022-09-06 10:41:20 +08:00
CsRic
964123ae0f
[embedding] freq_aware_embedding: add small functions for caller application ( #1537 )
2022-09-05 15:12:53 +08:00
Jiarui Fang
521078ffc9
[embedding] fix a bug in table wise sharding ( #1538 )
2022-09-02 15:48:35 +08:00
Jiarui Fang
87134524fd
[embedding] tablewise sharding polish ( #1535 )
2022-09-02 11:09:37 +08:00
CsRic
5156d5b4f8
[embedding] add tablewise sharding for FAW ( #1526 )
2022-09-01 17:55:41 +08:00
Jiarui Fang
4537d39df9
[doc] docstring for FreqAwareEmbeddingBag ( #1525 )
2022-08-31 13:52:30 +08:00
Jiarui Fang
9a9ef65313
[FAW] cpu caching operations ( #1520 )
2022-08-30 14:50:02 +08:00
Jiarui Fang
af5438caa2
[FAW] refactor reorder() for CachedParamMgr ( #1514 )
2022-08-29 14:22:07 +08:00
Jiarui Fang
9feee6d06b
[FAW] LFU initialize with dataset freq ( #1513 )
2022-08-29 12:52:53 +08:00
CsRic
1b8fee8e9c
[FAW] shrink freq_cnter size ( #1509 )
2022-08-29 11:44:55 +08:00
Jiarui Fang
ba61109b6c
[FAW] remove code related to chunk ( #1501 )
2022-08-26 14:23:30 +08:00
Jiarui Fang
d5085bb317
[FAW] add more docs and fix a warning ( #1500 )
2022-08-26 14:10:21 +08:00
CsRic
0ed2f46131
[FAW] FAW embedding use LRU as eviction strategy initialized with dataset stats ( #1494 )
2022-08-26 11:24:12 +08:00
CsRic
b8d0e39eaf
[FAW] LFU cache for the FAW
2022-08-25 13:08:46 +08:00
Jiarui Fang
cde7b8a5b8
[FAW] init an LFU implementation for FAW ( #1488 )
2022-08-24 17:37:22 +08:00
Geng Zhang
0aad53c62b
[FCE] update interface for frequency statistics in FreqCacheEmbedding ( #1462 )
2022-08-23 17:38:24 +08:00
Jiarui Fang
a1476ea882
[NFC] polish doc style for ColoTensor ( #1457 )
2022-08-16 09:21:05 +08:00
Geng Zhang
9f3eed66eb
[FAW] reorganize the inheritance struct of FreqCacheEmbedding ( #1448 )
2022-08-12 15:55:46 +08:00
Jiarui Fang
30b4dd17c0
[FAW] export FAW in _ops ( #1438 )
2022-08-11 13:43:24 +08:00
ver217
04c9a86af8
[zero] ZeroDDP supports controlling outputs' dtype ( #1399 )
2022-08-02 17:49:11 +08:00
HELSON
4e98e938ce
[zero] alleviate memory usage in ZeRODDP state_dict ( #1398 )
2022-08-02 15:49:13 +08:00
ver217
83328329dd
[hotfix] fix zero ddp buffer cast ( #1376 )
* fix zero ddp buffer cast
* fix zero ddp ignore params
2022-07-28 10:54:44 +08:00
ver217
5d5031e946
fix zero ddp state dict ( #1378 )
2022-07-28 09:31:42 +08:00
HELSON
87775a0682
[colotensor] use cpu memory to store state_dict ( #1367 )
2022-07-26 14:13:38 +08:00
ver217
d068af81a3
[doc] update rst and docstring ( #1351 )
* update rst
* add zero docstr
* fix docstr
* remove fx.tracer.meta_patch
* fix docstr
* fix docstr
* update fx rst
* fix fx docstr
* remove useless rst
2022-07-21 15:54:53 +08:00
ver217
0c51ff2c13
[hotfix] ZeroDDP use new process group ( #1333 )
* process group supports getting ranks in group
* chunk mgr receives a process group
* update unit test
* fix unit tests
2022-07-18 14:14:52 +08:00
HELSON
1b41686461
[hotfix] fix unit test test_module_spec ( #1321 )
2022-07-15 14:02:32 +08:00
Jiarui Fang
9bcd2fd4af
[tensor] a shorter shard and replicate spec ( #1245 )
2022-07-11 15:51:48 +08:00
Jiarui Fang
ae7d3f4927
[refactor] move process group from _DistSpec to ColoTensor. ( #1203 )
2022-07-06 16:15:16 +08:00
Jiarui Fang
b5f25eb32a
[Tensor] add cpu group to ddp ( #1200 )
2022-07-05 14:58:28 +08:00
Jiarui Fang
060b917daf
[refactor] remove gpc dependency in colotensor's _ops ( #1189 )
2022-07-04 18:54:37 +08:00
Jiarui Fang
372f791444
[refactor] move chunk and chunkmgr to directory gemini ( #1182 )
2022-06-29 13:31:02 +08:00
ver217
6b2f2ab9bb
[ddp] ColoDDP uses bucket all-reduce ( #1177 )
* add reducer
* update colo ddp with reducer
* polish unit test
* polish unit test
2022-06-29 10:34:13 +08:00
Ziyue Jiang
dd0420909f
[Tensor] rename parallel_action ( #1174 )
* rename parallel_action
* polish
2022-06-27 10:04:45 +08:00
Jiarui Fang
4b9bba8116
[ColoTensor] rename APIs and add output_replicate to ComputeSpec ( #1168 )
2022-06-24 13:08:54 +08:00
Jiarui Fang
f4ef224358
[Tensor] remove ParallelAction, use ComputeSpec instead ( #1166 )
2022-06-23 17:34:59 +08:00
ver217
54aabb8da4
[gemini] refactor gemini mgr ( #1151 )
* refactor gemini mgr
* update __init__
2022-06-22 11:54:36 +08:00
ver217
8106d7b8c7
[ddp] refactor ColoDDP and ZeroDDP ( #1146 )
* ColoDDP supports overwriting default process group
* rename ColoDDPV2 to ZeroDDP
* add docstr for ZeroDDP
* polish docstr
2022-06-21 16:35:23 +08:00
Frank Lee
15aab1476e
[zero] avoid zero hook spam by changing log to debug level ( #1137 )
2022-06-21 10:44:01 +08:00
ver217
d26902645e
[ddp] add save/load state dict for ColoDDP ( #1127 )
* add save/load state dict for ColoDDP
* add unit test
* refactor unit test folder
* polish unit test
* rename unit test
2022-06-20 10:51:47 +08:00
ver217
f0a954f16d
[ddp] add set_params_to_ignore for ColoDDP ( #1122 )
* add set_params_to_ignore for ColoDDP
* polish code
* fix zero hook v2
* add unit test
* polish docstr
2022-06-16 12:54:46 +08:00
ver217
e127b4375b
cast colo ddp v2 inputs/outputs ( #1120 )
2022-06-15 15:57:04 +08:00
ver217
7d14b473f0
[gemini] gemini mgr supports "cpu" placement policy ( #1118 )
* update gemini mgr
* update chunk
* add docstr
* polish placement policy
* update test chunk
* update test zero
* polish unit test
* remove useless unit test
2022-06-15 15:05:19 +08:00
ver217
895c1c5ee7
[tensor] refactor param op hook ( #1097 )
* refactor param op hook
* add docstr
* fix bug
2022-06-13 16:11:53 +08:00
Frank Lee
cb18922c47
[doc] added documentation to chunk and chunk manager ( #1094 )
* [doc] added documentation to chunk and chunk manager
* polish code
* polish code
* polish code
2022-06-10 15:33:06 +08:00
ver217
1f894e033f
[gemini] zero supports gemini ( #1093 )
* add placement policy
* add gemini mgr
* update mem stats collector
* update zero
* update zero optim
* fix bugs
* zero optim monitor os
* polish unit test
* polish unit test
* add assert
2022-06-10 14:48:28 +08:00
ver217
be01db37c8
[tensor] refactor chunk mgr and impl MemStatsCollectorV2 ( #1077 )
* polish chunk manager
* polish unit test
* impl add_extern_static_tensor for chunk mgr
* add mem stats collector v2
* polish code
* polish unit test
* polish code
* polish get chunks
2022-06-09 20:56:34 +08:00
Ziyue Jiang
4fc748f69b
[Tensor] fix optimizer for CPU parallel ( #1069 )
2022-06-06 17:36:11 +08:00
Jiarui Fang
49832b2344
[refactor] add nn.parallel module ( #1068 )
2022-06-06 15:34:41 +08:00