HELSON
|
f7f2248771
|
[moe] fix MoE bugs (#1628)
* remove forced FP32 modules
* correct no_shard-contexts' positions
|
2022-09-22 13:56:30 +08:00 |
Jiarui Fang
|
38c68b5b9a
|
[embedding] rollback for better FAW performance (#1625)
|
2022-09-22 11:16:25 +08:00 |
Jiarui Fang
|
504ff1d101
|
[embeddings] use cache_ratio instead of cuda_row_num (#1611)
|
2022-09-20 14:33:04 +08:00 |
Jiarui Fang
|
a19eb80998
|
[embedding] updates some default parameters
|
2022-09-15 15:45:17 +08:00 |
CsRic
|
f3403ff98e
|
[embeddings] add already_split_along_rank flag for tablewise mode (#1584)
|
2022-09-13 10:50:34 +08:00 |
Sze-qq
|
2144cbae8c
|
[NFC] polish colossalai/nn/lr_scheduler/multistep.py code style (#1572)
|
2022-09-08 22:11:04 +08:00 |
superhao1995
|
e4bf7ae667
|
[NFC] polish colossalai/nn/lr_scheduler/torch.py code style (#1571)
Co-authored-by: Research <research@soccf-snr3-017.comp.nus.edu.sg>
|
2022-09-08 22:11:04 +08:00 |
Jiatong Han
|
3263cdf57f
|
[NFC] polish colossalai/nn/parallel/data_parallel.py code style (#1570)
Co-authored-by: JThh <jiatong.han@u.nus.edu>
|
2022-09-08 22:11:04 +08:00 |
DouJS
|
f586887a90
|
[NFC] polish colossalai/nn/layer/colossalai_layer/dropout.py code style (#1568)
|
2022-09-08 22:11:04 +08:00 |
BigOneLiXiaoMing
|
0c4c9aa6e0
|
[NFC] polish colossalai/nn/_ops/embedding.py code style (#1561)
|
2022-09-08 22:11:04 +08:00 |
Ofey Chan
|
7cc052f6c0
|
[NFC] polish colossalai/nn/layer/colossalai_layer/linear.py (#1556)
|
2022-09-08 22:11:04 +08:00 |
yuxuan-lou
|
413f9c19f4
|
[NFC] polish colossalai/nn/_ops/layernorm.py code style (#1555)
|
2022-09-08 22:11:04 +08:00 |
shenggan
|
8edb777cc2
|
[NFC] polish colossalai/nn/loss/loss_2p5d.py code style (#1553)
|
2022-09-08 22:11:04 +08:00 |
Maruyama_Aya
|
bd2d789832
|
[NFC] polish colossalai/nn/_ops/embedding_bag.py code style (#1552)
|
2022-09-08 22:11:04 +08:00 |
binmakeswell
|
73e9eb13b7
|
[NFC] polish colossalai/nn/lr_scheduler/cosine.py code style
|
2022-09-08 22:11:04 +08:00 |
CsRic
|
a389ac4ec9
|
[embedding] cache_embedding small improvement (#1564)
|
2022-09-08 16:41:19 +08:00 |
ver217
|
10dd8226b1
|
add gather_output for VocabParallelClassifier1D (#1569)
|
2022-09-08 16:40:56 +08:00 |
ver217
|
ae71036cd2
|
[utils] refactor parallel layers checkpoint and bcast model on loading checkpoint (#1548)
* refactor parallel layer
* broadcast rank0 model after load ckpt
|
2022-09-06 20:18:35 +08:00 |
Jiarui Fang
|
64169f3e8f
|
[embedding] polish parallel embedding tablewise (#1545)
|
2022-09-06 10:41:20 +08:00 |
CsRic
|
964123ae0f
|
[embedding] freq_aware_embedding: add small functions for caller application (#1537)
|
2022-09-05 15:12:53 +08:00 |
Jiarui Fang
|
521078ffc9
|
[embedding] fix a bug in table wise sharding (#1538)
|
2022-09-02 15:48:35 +08:00 |
Jiarui Fang
|
87134524fd
|
[embedding] tablewise sharding polish (#1535)
|
2022-09-02 11:09:37 +08:00 |
CsRic
|
5156d5b4f8
|
[embedding] add tablewise sharding for FAW (#1526)
|
2022-09-01 17:55:41 +08:00 |
Jiarui Fang
|
4537d39df9
|
[doc] docstring for FreqAwareEmbeddingBag (#1525)
|
2022-08-31 13:52:30 +08:00 |
Jiarui Fang
|
9a9ef65313
|
[FAW] cpu caching operations (#1520)
|
2022-08-30 14:50:02 +08:00 |
Jiarui Fang
|
af5438caa2
|
[FAW] refactor reorder() for CachedParamMgr (#1514)
|
2022-08-29 14:22:07 +08:00 |
Jiarui Fang
|
9feee6d06b
|
[FAW] LFU initialize with dataset freq (#1513)
|
2022-08-29 12:52:53 +08:00 |
CsRic
|
1b8fee8e9c
|
[FAW] shrink freq_cnter size (#1509)
|
2022-08-29 11:44:55 +08:00 |
Jiarui Fang
|
ba61109b6c
|
[FAW] remove code related to chunk (#1501)
|
2022-08-26 14:23:30 +08:00 |
Jiarui Fang
|
d5085bb317
|
[FAW] add more docs and fix a warning (#1500)
|
2022-08-26 14:10:21 +08:00 |
CsRic
|
0ed2f46131
|
[FAW] FAW embedding use LRU as eviction strategy initialized with dataset stats (#1494)
|
2022-08-26 11:24:12 +08:00 |
CsRic
|
b8d0e39eaf
|
[FAW] LFU cache for the FAW
|
2022-08-25 13:08:46 +08:00 |
Jiarui Fang
|
cde7b8a5b8
|
[FAW] init an LFU implementation for FAW (#1488)
|
2022-08-24 17:37:22 +08:00 |
Geng Zhang
|
0aad53c62b
|
[FCE] update interface for frequency statistics in FreqCacheEmbedding (#1462)
|
2022-08-23 17:38:24 +08:00 |
Jiarui Fang
|
a1476ea882
|
[NFC] polish doc style for ColoTensor (#1457)
|
2022-08-16 09:21:05 +08:00 |
ver217
|
367c615818
|
fix nvme docstring (#1450)
|
2022-08-12 18:01:02 +08:00 |
Geng Zhang
|
9f3eed66eb
|
[FAW] reorganize the inheritance struct of FreqCacheEmbedding (#1448)
|
2022-08-12 15:55:46 +08:00 |
Frank Lee
|
ae1b58cd16
|
[tensor] added linear implementation for the new sharding spec (#1416)
* [tensor] added linear implementation for the new sharding spec
* polish code
|
2022-08-12 11:33:09 +08:00 |
Jiarui Fang
|
30b4dd17c0
|
[FAW] export FAW in _ops (#1438)
|
2022-08-11 13:43:24 +08:00 |
Jiarui Fang
|
c9427a323f
|
hotfix #1434 (#1437)
|
2022-08-11 13:14:25 +08:00 |
Jiarui Fang
|
10b3df65c8
|
[FAW] move coloparam setting in test code. (#1429)
|
2022-08-10 14:31:53 +08:00 |
Jiarui Fang
|
cb98cf5558
|
[FAW] parallel FreqAwareEmbedding (#1424)
|
2022-08-10 13:44:30 +08:00 |
Jiarui Fang
|
d209aff684
|
Add FreqAwareEmbeddingBag (#1421)
|
2022-08-09 16:26:12 +08:00 |
Jiarui Fang
|
504419d261
|
[FAW] add cache manager for the cached embedding (#1419)
|
2022-08-09 15:17:17 +08:00 |
ver217
|
12b4887097
|
[hotfix] fix CPUAdam kernel nullptr (#1410)
|
2022-08-05 19:45:45 +08:00 |
ver217
|
04c9a86af8
|
[zero] ZeroDDP supports controlling outputs' dtype (#1399)
|
2022-08-02 17:49:11 +08:00 |
HELSON
|
4e98e938ce
|
[zero] alleviate memory usage in ZeRODDP state_dict (#1398)
|
2022-08-02 15:49:13 +08:00 |
HELSON
|
c7221cb2d4
|
[hotfix] adapt ProcessGroup and Optimizer to ColoTensor (#1388)
|
2022-07-29 19:33:24 +08:00 |
ver217
|
83328329dd
|
[hotfix] fix zero ddp buffer cast (#1376)
* fix zero ddp buffer cast
* fix zero ddp ignore params
|
2022-07-28 10:54:44 +08:00 |
ver217
|
5d5031e946
|
fix zero ddp state dict (#1378)
|
2022-07-28 09:31:42 +08:00 |