Frank Lee
ae1b58cd16
[tensor] added linear implementation for the new sharding spec ( #1416 )
...
* [tensor] added linear implementation for the new sharding spec
* polish code
2 years ago
Jiarui Fang
30b4dd17c0
[FAW] export FAW in _ops ( #1438 )
2 years ago
Jiarui Fang
c9427a323f
hotfix #1434 ( #1437 )
2 years ago
Jiarui Fang
10b3df65c8
[FAW] move coloparam setting in test code. ( #1429 )
2 years ago
Jiarui Fang
cb98cf5558
[FAW] parallel FreqAwareEmbedding ( #1424 )
2 years ago
Jiarui Fang
d209aff684
Add FreqAwareEmbeddingBag ( #1421 )
2 years ago
Jiarui Fang
504419d261
[FAW] add cache manager for the cached embedding ( #1419 )
2 years ago
ver217
12b4887097
[hotfix] fix CPUAdam kernel nullptr ( #1410 )
2 years ago
ver217
04c9a86af8
[zero] ZeroDDP supports controlling outputs' dtype ( #1399 )
2 years ago
HELSON
4e98e938ce
[zero] alleviate memory usage in ZeRODDP state_dict ( #1398 )
2 years ago
HELSON
c7221cb2d4
[hotfix] adapt ProcessGroup and Optimizer to ColoTensor ( #1388 )
2 years ago
ver217
83328329dd
[hotfix] fix zero ddp buffer cast ( #1376 )
...
* fix zero ddp buffer cast
* fix zero ddp ignore params
2 years ago
ver217
5d5031e946
fix zero ddp state dict ( #1378 )
2 years ago
ver217
c415240db6
[nvme] CPUAdam and HybridAdam support NVMe offload ( #1360 )
...
* impl nvme optimizer
* update cpu adam
* add unit test
* update hybrid adam
* update docstr
* add TODOs
* update CI
* fix CI
* fix CI
* fix CI path
* fix CI path
* fix CI path
* fix install tensornvme
* fix CI
* fix CI path
* fix CI env variables
* test CI
* test CI
* fix CI
* fix nvme optim __del__
* fix adam __del__
* fix nvme optim
* fix CI env variables
* fix nvme optim import
* test CI
* test CI
* fix CI
2 years ago
HELSON
87775a0682
[colotensor] use cpu memory to store state_dict ( #1367 )
2 years ago
ver217
d068af81a3
[doc] update rst and docstring ( #1351 )
...
* update rst
* add zero docstr
* fix docstr
* remove fx.tracer.meta_patch
* fix docstr
* fix docstr
* update fx rst
* fix fx docstr
* remove useless rst
2 years ago
HELSON
7a8702c06d
[colotensor] add Tensor.view op and its unit test ( #1343 )
...
[colotensor] add megatron initialization for gpt2
2 years ago
ver217
0c51ff2c13
[hotfix] ZeroDDP use new process group ( #1333 )
...
* process group supports getting ranks in group
* chunk mgr receives a process group
* update unit test
* fix unit tests
2 years ago
HELSON
1b41686461
[hotfix] fix unit test test_module_spec ( #1321 )
2 years ago
Jiarui Fang
9e4c6449b0
[checkpoint] add ColoOptimizer checkpointing ( #1316 )
2 years ago
Jiarui Fang
85f933b58b
[Optimizer] Remove useless ColoOptimizer ( #1312 )
2 years ago
Jiarui Fang
9f10524313
[Optimizer] polish the init method of ColoOptimizer ( #1310 )
2 years ago
HELSON
260a55804a
[hotfix] fix shape error in backward when using ColoTensor ( #1298 )
2 years ago
runluo
f83c4d6597
[NFC] polish colossalai/nn/layer/wrapper/pipeline_wrapper.py code style ( #1303 )
2 years ago
XYE
e83b2ce853
[NFC] polish colossalai/nn/layer/vanilla/layers.py code style ( #1295 )
2 years ago
Liping233
1000a41fd5
[NFC] polish colossalai/nn/layer/vanilla/__init__.py code style ( #1293 )
2 years ago
Wangbo Zhao(黑色枷锁)
552667825b
[NFC] polish colossalai/nn/layer/parallel_1d/layers.py code style ( #1290 )
2 years ago
Jiatong Han
38e3ccd1e9
[NFC] polish colossalai/nn/layer/parallel_sequence/layers.py code style ( #1280 )
...
Co-authored-by: JThh <jiatong.han@u.nus.edu>
2 years ago
Boyuan Yao
b414eaa5db
[NFC] polish colossalai/nn/optimizer/lamb.py code style ( #1275 )
2 years ago
Super Daniel
52d145a342
[NFC] polish colossalai/nn/lr_scheduler/onecycle.py code style ( #1269 )
2 years ago
Geng Zhang
0e06f62160
[NFC] polish colossalai/nn/layer/parallel_sequence/_operation.py code style ( #1266 )
2 years ago
superhao1995
f660152c73
[NFC] polish colossalai/nn/layer/parallel_3d/_operation.py code style ( #1258 )
...
Co-authored-by: Research <research@soccf-snr3-017.comp.nus.edu.sg>
2 years ago
Thunderbeee
9738fb0f78
[NFC] polish colossalai/nn/lr_scheduler/__init__.py ( #1255 )
...
code style
2 years ago
Ofey Chan
2dd4d556fb
[NFC] polish colossalai/nn/init.py code style ( #1292 )
2 years ago
HELSON
abba4d84e1
[hotfix] fix bert model test in unitests ( #1272 )
2 years ago
oahzxl
0cf8e8e91c
[NFC] polish <colossalai/nn/lr_scheduler/poly.py> code style ( #1267 )
2 years ago
Jiarui Fang
1aad903c15
[tensor] redistribute among different process groups ( #1247 )
...
* make it faster
* [tensor] rename convert_to_dist -> redistribute
* [tensor] ShardSpec and ReplicaSpec
* [tensor] redistribute among diff pgs
* polish code
2 years ago
Jiarui Fang
9bcd2fd4af
[tensor] a shorter shard and replicate spec ( #1245 )
2 years ago
Jiarui Fang
2699dfbbfd
[rename] convert_to_dist -> redistribute ( #1243 )
2 years ago
Jiarui Fang
4a76084dc9
[tensor] add zero_like colo op, important for Optimizer ( #1236 )
2 years ago
Jiarui Fang
3b500984b1
[tensor] fix some unittests ( #1234 )
2 years ago
HELSON
0453776def
[tensor] fix an assertion in colo_tensor cross_entropy ( #1232 )
2 years ago
HELSON
42ab36b762
[tensor] add unitest for colo_tensor 1DTP cross_entropy ( #1230 )
2 years ago
Yi Zhao
04537bf83e
[checkpoint] support generalized scheduler ( #1222 )
2 years ago
Jiarui Fang
a98319f023
[tensor] torch function return colotensor ( #1229 )
2 years ago
Jiarui Fang
ae7d3f4927
[refactor] move process group from _DistSpec to ColoTensor. ( #1203 )
2 years ago
Jiarui Fang
b5f25eb32a
[Tensor] add cpu group to ddp ( #1200 )
2 years ago
Jiarui Fang
060b917daf
[refactor] remove gpc dependency in colotensor's _ops ( #1189 )
2 years ago
Jiarui Fang
372f791444
[refactor] move chunk and chunkmgr to directory gemini ( #1182 )
2 years ago
ver217
6b2f2ab9bb
[ddp] ColoDDP uses bucket all-reduce ( #1177 )
...
* add reducer
* update colo ddp with reducer
* polish unit test
* polish unit test
2 years ago