613 Commits (d49708ae432f1d38ec806bf7ecea7d0f332a20b1)

Author SHA1 Message Date
HELSON d49708ae43
[hotfix] fix ddp for unit test test_gpt2 (#1326)
2 years ago
Frank Lee 250be4d31e
[utils] integrated colotensor with lazy init context (#1324)
2 years ago
YuliangLiu0306 e8acf55e8b
[fx] add balanced policy v2 (#1251)
2 years ago
XYE ca2d3f284f
[fx] Add unit test and fix bugs for transform_mlp_pass (#1299)
2 years ago
HELSON 1b41686461
[hotfix] fix unit test test_module_spec (#1321)
2 years ago
Jiarui Fang 9e4c6449b0
[checkpoint] add ColoOptimizer checkpointing (#1316)
2 years ago
ver217 7c70bfbefa
[hotfix] fix PipelineSharedModuleGradientHandler (#1314)
2 years ago
Jiarui Fang 85f933b58b
[Optimizer] Remove useless ColoOptimizer (#1312)
2 years ago
Jiarui Fang 9f10524313
[Optimizer] polish the init method of ColoOptimizer (#1310)
2 years ago
Jiarui Fang 3ef3791a3b
[checkpoint] add test for bert and hotfix save bugs (#1297)
2 years ago
Frank Lee 4f4d8c3656
[fx] added apex normalization to patched modules (#1300)
2 years ago
Jiarui Fang 4165eabb1e
[hotfix] remove potiential circle import (#1307)
2 years ago
HELSON 260a55804a
[hotfix] fix shape error in backward when using ColoTensor (#1298)
2 years ago
runluo f83c4d6597
[NFC] polish colossalai/nn/layer/wrapper/pipeline_wrapper.py code style (#1303)
2 years ago
binmakeswell 7696cead8d
Recover kernal files
2 years ago
XYE e83b2ce853
[NFC] polish colossalai/nn/layer/vanilla/layers.py code style (#1295)
2 years ago
Liping233 1000a41fd5
[NFC] polish colossalai/nn/layer/vanilla/__init__.py code style (#1293)
2 years ago
Maruyama_Aya 87f679aeae
[NFC] polish colossalai/kernel/cuda_native/csrc/kernels/include/kernels.h code style (#1291)
2 years ago
Wangbo Zhao(黑色枷锁) 552667825b
[NFC] polish colossalai/nn/layer/parallel_1d/layers.py code style (#1290)
2 years ago
doubleHU d6f5ef8860
[NFC] polish colossalai/kernel/cuda_native/csrc/kernels/transform_kernels.cu code style (#1286)
2 years ago
Ziheng Qin 6d6c01e94d
[NFC] polish colossalai/__init__.py code style (#1285)
2 years ago
Jiatong Han 38e3ccd1e9
[NFC] polish colossalai/nn/layer/parallel_sequence/layers.py code style (#1280)
2 years ago
Boyuan Yao b414eaa5db
[NFC] polish colossalai/nn/optimizer/lamb.py code style (#1275)
2 years ago
yuxuan-lou 5f6ab35d25
Hotfix/format (#1274)
2 years ago
Super Daniel 52d145a342
[NFC] polish colossalai/nn/lr_scheduler/onecycle.py code style (#1269)
2 years ago
Geng Zhang 0e06f62160
[NFC] polish colossalai/nn/layer/parallel_sequence/_operation.py code style (#1266)
2 years ago
binmakeswell c95e18cdb9
[NFC] polish colossalai/kernel/cuda_native/csrc/scaled_upper_triang_masked_softmax.h code style (#1270)
2 years ago
xyupeng 94bfd35184
[NFC] polish colossalai/builder/builder.py code style (#1265)
2 years ago
DouJS db13f96333
[NFC] polish colossalai/kernel/cuda_native/csrc/multi_tensor_apply.cuh code style (#1264)
2 years ago
shenggan 5d7366b144
[NFC] polish colossalai/kernel/cuda_native/csrc/scaled_masked_softmax.h code style (#1263)
2 years ago
Zangwei Zheng 197a2c89e2
[NFC] polish colossalai/communication/collective.py (#1262)
2 years ago
ziyu huang f1cafcc73a
[NFC] polish colossalai/kernel/cuda_native/csrc/kernels/dropout_kernels.cu code style (#1261)
2 years ago
Sze-qq f8b9aaef47
[NFC] polish colossalai/kernel/cuda_native/csrc/type_shim.h code style (#1260)
2 years ago
superhao1995 f660152c73
[NFC] polish colossalai/nn/layer/parallel_3d/_operation.py code style (#1258)
2 years ago
Thunderbeee 9738fb0f78
[NFC] polish colossalai/nn/lr_scheduler/__init__.py (#1255)
2 years ago
Kai Wang (Victor Kai) 50f2ad213f
[NFC] polish colossalai/engine/ophooks/utils.py code style (#1256)
2 years ago
Ofey Chan 2dd4d556fb
[NFC] polish colossalai/nn/init.py code style (#1292)
2 years ago
Jiarui Fang 556b9b7e1a
[hotfix] Dist Mgr gather torch version (#1284)
2 years ago
HELSON abba4d84e1
[hotfix] fix bert model test in unitests (#1272)
2 years ago
ver217 7aadcbd070
hotfix colotensor _scan_for_pg_from_args (#1276)
2 years ago
oahzxl 0cf8e8e91c
[NFC] polish <colossalai/nn/lr_scheduler/poly.py> code style (#1267)
2 years ago
Jiarui Fang c92f84fcdb
[tensor] distributed checkpointing for parameters (#1240)
2 years ago
Frank Lee fb35460595
[fx] added ndim property to proxy (#1253)
2 years ago
Frank Lee 4a09fc0947
[fx] fixed tracing with apex-based T5 model (#1252)
2 years ago
Frank Lee 7531c6271f
[fx] refactored the file structure of patched function and module (#1238)
2 years ago
YuliangLiu0306 17ed33350b
[hotfix] fix an assertion bug in base schedule. (#1250)
2 years ago
YuliangLiu0306 97d713855a
[fx] methods to get fx graph property. (#1246)
2 years ago
YuliangLiu0306 30b4fc0eb0
[fx]add split module pass and unit test from pipeline passes (#1242)
2 years ago
Jiarui Fang 1aad903c15
[tensor] redistribute among different process groups (#1247)
2 years ago
Jiarui Fang 9bcd2fd4af
[tensor] a shorter shard and replicate spec (#1245)
2 years ago