ver217
ae71036cd2
[utils] refactor parallel layers checkpoint and bcast model on loading checkpoint ( #1548 )
* refactor parallel layer
* broadcast rank0 model after load ckpt
2 years ago
Jiarui Fang
64169f3e8f
[embedding] polish parallel embedding tablewise ( #1545 )
2 years ago
CsRic
964123ae0f
[embedding] freq_aware_embedding: add small functions for caller application ( #1537 )
2 years ago
Jiarui Fang
521078ffc9
[embedding] fix a bug in table wise sharding ( #1538 )
2 years ago
Jiarui Fang
87134524fd
[embedding] tablewise sharding polish ( #1535 )
2 years ago
CsRic
5156d5b4f8
[embedding] add tablewise sharding for FAW ( #1526 )
2 years ago
Jiarui Fang
4537d39df9
[doc] docstring for FreqAwareEmbeddingBag ( #1525 )
2 years ago
Jiarui Fang
9a9ef65313
[FAW] cpu caching operations ( #1520 )
2 years ago
Jiarui Fang
af5438caa2
[FAW] refactor reorder() for CachedParamMgr ( #1514 )
2 years ago
Jiarui Fang
9feee6d06b
[FAW] LFU initialize with dataset freq ( #1513 )
2 years ago
CsRic
1b8fee8e9c
[FAW] shrink freq_cnter size ( #1509 )
2 years ago
Jiarui Fang
ba61109b6c
[FAW] remove code related to chunk ( #1501 )
2 years ago
Jiarui Fang
d5085bb317
[FAW] add more docs and fix a warning ( #1500 )
2 years ago
CsRic
0ed2f46131
[FAW] FAW embedding use LRU as eviction strategy initialized with dataset stats ( #1494 )
2 years ago
CsRic
b8d0e39eaf
[FAW] LFU cache for the FAW
2 years ago
Jiarui Fang
cde7b8a5b8
[FAW] init an LFU implementation for FAW ( #1488 )
2 years ago
Geng Zhang
0aad53c62b
[FCE] update interface for frequency statistics in FreqCacheEmbedding ( #1462 )
2 years ago
Jiarui Fang
a1476ea882
[NFC] polish doc style for ColoTensor ( #1457 )
2 years ago
ver217
367c615818
fix nvme docstring ( #1450 )
2 years ago
Geng Zhang
9f3eed66eb
[FAW] reorganize the inheritance struct of FreqCacheEmbedding ( #1448 )
2 years ago
Frank Lee
ae1b58cd16
[tensor] added linear implementation for the new sharding spec ( #1416 )
* [tensor] added linear implementation for the new sharding spec
* polish code
2 years ago
Jiarui Fang
30b4dd17c0
[FAW] export FAW in _ops ( #1438 )
2 years ago
Jiarui Fang
c9427a323f
hotfix #1434 ( #1437 )
2 years ago
Jiarui Fang
10b3df65c8
[FAW] move coloparam setting in test code. ( #1429 )
2 years ago
Jiarui Fang
cb98cf5558
[FAW] parallel FreqAwareEmbedding ( #1424 )
2 years ago
Jiarui Fang
d209aff684
Add FreqAwareEmbeddingBag ( #1421 )
2 years ago
Jiarui Fang
504419d261
[FAW] add cache manager for the cached embedding ( #1419 )
2 years ago
ver217
12b4887097
[hotfix] fix CPUAdam kernel nullptr ( #1410 )
2 years ago
ver217
04c9a86af8
[zero] ZeroDDP supports controlling outputs' dtype ( #1399 )
2 years ago
HELSON
4e98e938ce
[zero] alleviate memory usage in ZeRODDP state_dict ( #1398 )
2 years ago
HELSON
c7221cb2d4
[hotfix] adapt ProcessGroup and Optimizer to ColoTensor ( #1388 )
2 years ago
ver217
83328329dd
[hotfix] fix zero ddp buffer cast ( #1376 )
* fix zero ddp buffer cast
* fix zero ddp ignore params
2 years ago
ver217
5d5031e946
fix zero ddp state dict ( #1378 )
2 years ago
ver217
c415240db6
[nvme] CPUAdam and HybridAdam support NVMe offload ( #1360 )
* impl nvme optimizer
* update cpu adam
* add unit test
* update hybrid adam
* update docstr
* add TODOs
* update CI
* fix CI
* fix CI
* fix CI path
* fix CI path
* fix CI path
* fix install tensornvme
* fix CI
* fix CI path
* fix CI env variables
* test CI
* test CI
* fix CI
* fix nvme optim __del__
* fix adam __del__
* fix nvme optim
* fix CI env variables
* fix nvme optim import
* test CI
* test CI
* fix CI
2 years ago
HELSON
87775a0682
[colotensor] use cpu memory to store state_dict ( #1367 )
2 years ago
ver217
d068af81a3
[doc] update rst and docstring ( #1351 )
* update rst
* add zero docstr
* fix docstr
* remove fx.tracer.meta_patch
* fix docstr
* fix docstr
* update fx rst
* fix fx docstr
* remove useless rst
2 years ago
HELSON
7a8702c06d
[colotensor] add Tensor.view op and its unit test ( #1343 )
[colotensor] add megatron initialization for gpt2
2 years ago
ver217
0c51ff2c13
[hotfix] ZeroDDP use new process group ( #1333 )
* process group supports getting ranks in group
* chunk mgr receives a process group
* update unit test
* fix unit tests
2 years ago
HELSON
1b41686461
[hotfix] fix unit test test_module_spec ( #1321 )
2 years ago
Jiarui Fang
9e4c6449b0
[checkpoint] add ColoOptimizer checkpointing ( #1316 )
2 years ago
Jiarui Fang
85f933b58b
[Optimizer] Remove useless ColoOptimizer ( #1312 )
2 years ago
Jiarui Fang
9f10524313
[Optimizer] polish the init method of ColoOptimizer ( #1310 )
2 years ago
HELSON
260a55804a
[hotfix] fix shape error in backward when using ColoTensor ( #1298 )
2 years ago
runluo
f83c4d6597
[NFC] polish colossalai/nn/layer/wrapper/pipeline_wrapper.py code style ( #1303 )
2 years ago
XYE
e83b2ce853
[NFC] polish colossalai/nn/layer/vanilla/layers.py code style ( #1295 )
2 years ago
Liping233
1000a41fd5
[NFC] polish colossalai/nn/layer/vanilla/__init__.py code style ( #1293 )
2 years ago
Wangbo Zhao(黑色枷锁)
552667825b
[NFC] polish colossalai/nn/layer/parallel_1d/layers.py code style ( #1290 )
2 years ago
Jiatong Han
38e3ccd1e9
[NFC] polish colossalai/nn/layer/parallel_sequence/layers.py code style ( #1280 )
Co-authored-by: JThh <jiatong.han@u.nus.edu>
2 years ago
Boyuan Yao
b414eaa5db
[NFC] polish colossalai/nn/optimizer/lamb.py code style ( #1275 )
2 years ago
Super Daniel
52d145a342
[NFC] polish colossalai/nn/lr_scheduler/onecycle.py code style ( #1269 )
2 years ago