Jiarui Fang | 21962e1593 | [embedding] rename FreqAwareEmbedding -> CachedEmbedding (#1699) | 2 years ago
Jiarui Fang | 363fc2861a | [embeddings] more detailed timer (#1692) | 2 years ago
Jiarui Fang | c638bec028 | [embedding] polish async copy (#1657) | 2 years ago
Jiarui Fang | 988570e4a6 | [embedding] add more detail profiling (#1656) | 2 years ago
Jiarui Fang | e1f97fd2b8 | [embedding] print profiling results (#1654) | 2 years ago
Jiarui Fang | 04443605a5 | [embedding] non-blocking cpu-gpu copy (#1647) | 2 years ago
CsRic | 0767f67a0f | [embedding] isolate cache_op from forward (#1645); Co-authored-by: ric <mkkt_bkkt@mail.ustc.edu.cn> | 2 years ago
Jiarui Fang | e57df80325 | [embeddings] cache option (#1635) | 2 years ago
Jiarui Fang | 38c68b5b9a | [embedding] rollback for better FAW performance (#1625) | 2 years ago
Jiarui Fang | 504ff1d101 | [embeddings] use cache_ratio instead of cuda_row_num (#1611) | 2 years ago
Jiarui Fang | a19eb80998 | [embedding] updates some default parameters | 2 years ago
CsRic | f3403ff98e | [embeddings] add already_split_along_rank flag for tablewise mode (#1584) | 2 years ago
CsRic | a389ac4ec9 | [embedding] cache_embedding small improvement (#1564) | 2 years ago
Jiarui Fang | 64169f3e8f | [embedding] polish parallel embedding tablewise (#1545) | 2 years ago
CsRic | 964123ae0f | [embedding] freq_aware_embedding: add small functions for caller application (#1537) | 2 years ago
Jiarui Fang | 521078ffc9 | [embedding] fix a bug in table wise sharding (#1538) | 2 years ago
Jiarui Fang | 87134524fd | [embedding] tablewise sharding polish (#1535) | 2 years ago
CsRic | 5156d5b4f8 | [embedding] add tablewise sharding for FAW (#1526) | 2 years ago
Jiarui Fang | 4537d39df9 | [doc] docstring for FreqAwareEmbeddingBag (#1525) | 2 years ago
Jiarui Fang | 9a9ef65313 | [FAW] cpu caching operations (#1520) | 2 years ago
Jiarui Fang | af5438caa2 | [FAW] refactor reorder() for CachedParamMgr (#1514) | 2 years ago
Jiarui Fang | 9feee6d06b | [FAW] LFU initialize with dataset freq (#1513) | 2 years ago
CsRic | 1b8fee8e9c | [FAW] shrink freq_cnter size (#1509) | 2 years ago
Jiarui Fang | ba61109b6c | [FAW] remove code related to chunk (#1501) | 2 years ago
Jiarui Fang | d5085bb317 | [FAW] add more docs and fix a warning (#1500) | 2 years ago
CsRic | 0ed2f46131 | [FAW] FAW embedding use LRU as eviction strategy intialized with dataset stats (#1494) | 2 years ago
CsRic | b8d0e39eaf | [FAW] LFU cache for the FAW | 2 years ago
Jiarui Fang | cde7b8a5b8 | [FAW] init an LFU implementation for FAW (#1488) | 2 years ago
Geng Zhang | 0aad53c62b | [FCE] update interface for frequency statistics in FreqCacheEmbedding (#1462) | 2 years ago
Jiarui Fang | a1476ea882 | [NFC] polish doc style for ColoTensor (#1457) | 2 years ago
Geng Zhang | 9f3eed66eb | [FAW] reorganize the inheritance struct of FreqCacheEmbedding (#1448) | 2 years ago
Jiarui Fang | 30b4dd17c0 | [FAW] export FAW in _ops (#1438) | 2 years ago
HELSON | 1b41686461 | [hotfix] fix unit test test_module_spec (#1321) | 2 years ago
Jiarui Fang | 9bcd2fd4af | [tensor] a shorter shard and replicate spec (#1245) | 2 years ago
Jiarui Fang | ae7d3f4927 | [refactor] move process group from _DistSpec to ColoTensor. (#1203) | 2 years ago
Jiarui Fang | 060b917daf | [refactor] remove gpc dependency in colotensor's _ops (#1189) | 2 years ago
Ziyue Jiang | dd0420909f | [Tensor] rename parallel_action (#1174); * rename parallel_action; * polish | 2 years ago
Jiarui Fang | 4b9bba8116 | [ColoTensor] rename APIs and add output_replicate to ComputeSpec (#1168) | 2 years ago
Jiarui Fang | f4ef224358 | [Tensor] remove ParallelAction, use ComputeSpec instread (#1166) | 2 years ago
Jiarui Fang | 49832b2344 | [refactory] add nn.parallel module (#1068) | 3 years ago