Commit Graph

413 Commits (c7d4932956175bb42461d63b00e31ecca113a4d7)

Author SHA1 Message Date
アマデウス e615cfc3a8
[NFC] polish test component gpt code style (#1567)
2 years ago
Kirigaya Kazuto 6159d45417
[pipeline/tuning] improve dispatch performance both time and space cost (#1544)
2 years ago
Super Daniel 4f59693207
[fx] provide a stable but not accurate enough version of profiler. (#1547)
2 years ago
YuliangLiu0306 0908d0fc61
[autoparallel]add backward cost info into strategies (#1524)
2 years ago
YuliangLiu0306 44c866a3e3
[autoparallel] change the merge node logic (#1533)
2 years ago
Jiarui Fang 64169f3e8f
[embedding] polish parallel embedding tablewise (#1545)
2 years ago
CsRic 964123ae0f
[embedding] freq_aware_embedding: add small functions for caller application (#1537)
2 years ago
Boyuan Yao 56159049e8
[fx] Modify solver linearize and add corresponding test (#1531)
2 years ago
Super Daniel 7dc53237c3
[fx] add test for meta tensor. (#1527)
2 years ago
YuliangLiu0306 4b3d6caeb3
[fx]patch nn.functional convolution (#1528)
2 years ago
CsRic 5156d5b4f8
[embedding] add tablewise sharding for FAW (#1526)
2 years ago
Kirigaya Kazuto f1e1836218
[pipeline/pipleline_process_group] finish PipelineProcessGroup to manage local abd global rank in TP,DP and PP (#1508)
2 years ago
Boyuan Yao b231430bcb
[fx] Fix wrong index in annotation and minimal flops in ckpt solver (#1521)
2 years ago
YuliangLiu0306 3345c6d352
[autoparellel]add strategies constructor (#1505)
2 years ago
Frank Lee a0436a62ee
[autoparallel] added liveness analysis (#1516)
2 years ago
Jiarui Fang 9a9ef65313
[FAW] cpu caching operations (#1520)
2 years ago
Jiarui Fang af5438caa2
[FAW] refactor reorder() for CachedParamMgr (#1514)
2 years ago
CsRic 1b8fee8e9c
[FAW] shrink freq_cnter size (#1509)
2 years ago
Boyuan Yao 4acc58ee20
[fx] Fix activation codegen dealing with checkpointing first op (#1510)
2 years ago
Kirigaya Kazuto 5a6fd71f90
[pipeline/rpc] update outstanding mechanism | optimize dispatching strategy (#1497)
2 years ago
CsRic 0ed2f46131
[FAW] FAW embedding use LRU as eviction strategy intialized with dataset stats (#1494)
2 years ago
YuliangLiu0306 8b7d6bd5be
[autoparallel] add more sharding strategies to conv (#1487)
2 years ago
Boyuan Yao de1e716dc4
[fx] Add activation checkpoint solver rotor (#1496)
2 years ago
YuliangLiu0306 413c053453
[autoparallel] add cost graph class (#1481)
2 years ago
YuliangLiu0306 4b03c25f85
[tensor]add 1D device mesh (#1492)
2 years ago
CsRic b8d0e39eaf
[FAW] LFU cache for the FAW
2 years ago
Kirigaya Kazuto 9145aef2b4
[pipeline/rpc] implement distributed optimizer | test with assert_close (#1486)
2 years ago
Frank Lee 3da68d6b1b
[fx] fixed adapative pooling size concatenation error (#1489)
2 years ago
Jiarui Fang cde7b8a5b8
[FAW] init an LFU implementation for FAW (#1488)
2 years ago
Super Daniel 32efe8e740
[fx] add profiler for fx nodes. (#1480)
2 years ago
Kirigaya Kazuto a6c8749198
[pipeline/rpc] support interleaving | fix checkpoint bug | change logic when dispatch data in work_list to ensure steady 1F1B (#1483)
2 years ago
Geng Zhang 0aad53c62b
[FCE] update interface for frequency statistics in FreqCacheEmbedding (#1462)
2 years ago
Frank Lee ede326298b
[autoparallel] integrate auto parallel with torch fx (#1479)
2 years ago
Boyuan Yao 1f2e547f7a
[fx] Fix ckpt functions' definitions in forward (#1476)
2 years ago
Kirigaya Kazuto bb5f5289e0
[pipeline/rpc] implement a demo for PP with cuda rpc framework (#1470)
2 years ago
Frank Lee 628c7e3fc8
[autoparallel] added dot handler (#1475)
2 years ago
YuliangLiu0306 26a37b5cd5
[autoparallel] Add conv handler to generate strategies and costs info for conv (#1467)
2 years ago
YuliangLiu0306 b73fb7a077
[tensor] support runtime ShardingSpec apply (#1453)
2 years ago
Super Daniel e7383f578b
[fx] add rules to linearize computation graphs for searching. (#1461)
2 years ago
Boyuan Yao 092b9c8f49
[fx] Add use_reentrant=False to checkpoint in codegen (#1463)
2 years ago
Boyuan Yao 47fd8e4a02
[utils] Add use_reetrant=False in utils.activation_checkpoint (#1460)
2 years ago
Jiarui Fang 36824a304c
[Doc] add more doc for ColoTensor. (#1458)
2 years ago
Super Daniel 0dbd61c29b
[fx] fix test and algorithm bugs in activation checkpointing. (#1451)
2 years ago
Geng Zhang 9f3eed66eb
[FAW] reorganize the inheritance struct of FreqCacheEmbedding (#1448)
2 years ago
Frank Lee 5a52e21fe3
[test] fixed the activation codegen test (#1447)
2 years ago
YuliangLiu0306 0f3042363c
[tensor] shape consistency generate transform path and communication cost (#1435)
2 years ago
Boyuan Yao 5774fe0270
[fx] Use colossalai checkpoint and add offload recognition in codegen (#1439)
2 years ago
Kirigaya Kazuto e9460b45c8
[engin/schedule] use p2p_v2 to recontruct pipeline_schedule (#1408)
2 years ago
Frank Lee ae1b58cd16
[tensor] added linear implementation for the new sharding spec (#1416)
2 years ago
Super Daniel d40a9392ba
[fx] fix the false interpretation of algorithm 3 in https://arxiv.org/abs/1604.06174. (#1446)
2 years ago