Commit Graph

1235 Commits (e8a9bebc8770b9430f4150a400e6fef43cf02d4f)
 

Author SHA1 Message Date
Super Daniel 32efe8e740
[fx] add profiler for fx nodes. (#1480)
2 years ago
Frank Lee d39e11dffb
[autoparallel] added namespace constraints (#1490)
2 years ago
Kirigaya Kazuto a6c8749198
[pipeline/rpc] support interleaving | fix checkpoint bug | change logic when dispatch data in work_list to ensure steady 1F1B (#1483)
2 years ago
github-actions[bot] d6e3dca436
Automated submodule synchronization (#1484)
2 years ago
Geng Zhang 0aad53c62b
[FCE] update interface for frequency statistics in FreqCacheEmbedding (#1462)
2 years ago
Frank Lee ede326298b
[autoparallel] integrate auto parallel with torch fx (#1479)
2 years ago
github-actions[bot] 8fb09a950a
Automated submodule synchronization (#1478)
2 years ago
fastalgo 0f438d15ee
Update README.md
2 years ago
Sze-qq 1750d6f573
[doc] update readme with the new xTrimoMultimer project (#1477)
2 years ago
Boyuan Yao 1f2e547f7a
[fx] Fix ckpt functions' definitions in forward (#1476)
2 years ago
Kirigaya Kazuto bb5f5289e0
[pipeline/rpc] implement a demo for PP with cuda rpc framework (#1470)
2 years ago
Frank Lee 628c7e3fc8
[autoparallel] added dot handler (#1475)
2 years ago
github-actions[bot] d08566fb61
Automated submodule synchronization (#1472)
2 years ago
Frank Lee 9dae9bb2bc
[autoparallel] introduced baseclass for op handler and reduced code redundancy (#1471)
2 years ago
Frank Lee 3a54e1c9b7
[autoparallel] standardize the code structure (#1469)
2 years ago
YuliangLiu0306 26a37b5cd5
[autoparallel] Add conv handler to generate strategies and costs info for conv (#1467)
2 years ago
Jiarui Fang 1b491ad7de
[doc] update docstring in ProcessGroup (#1468)
2 years ago
YuliangLiu0306 b73fb7a077
[tensor] support runtime ShardingSpec apply (#1453)
2 years ago
github-actions[bot] 177d3f5718
Automated submodule synchronization (#1465)
2 years ago
Super Daniel bbc58d881b
[fx] fix MetaInfoProp for incorrect calculations and add detections for inplace op. (#1466)
2 years ago
Super Daniel e7383f578b
[fx] add rules to linearize computation graphs for searching. (#1461)
2 years ago
github-actions[bot] a7a3d55114
Automated submodule synchronization (#1452)
2 years ago
Boyuan Yao 092b9c8f49
[fx] Add use_reentrant=False to checkpoint in codegen (#1463)
2 years ago
Boyuan Yao 47fd8e4a02
[utils] Add use_reetrant=False in utils.activation_checkpoint (#1460)
2 years ago
Jiarui Fang 36824a304c
[Doc] add more doc for ColoTensor. (#1458)
2 years ago
Jiarui Fang a1476ea882
[NFC] polish doc style for ColoTensor (#1457)
2 years ago
Super Daniel 0dbd61c29b
[fx] fix test and algorithm bugs in activation checkpointing. (#1451)
2 years ago
Jiarui Fang b1553fdf96
[NFC] global vars should be upper case (#1456)
2 years ago
ver217 367c615818
fix nvme docstring (#1450)
2 years ago
Frank Lee 6474e31556
[workflow] added TensorNVMe to compatibility test (#1449)
2 years ago
Geng Zhang 9f3eed66eb
[FAW] reorganize the inheritance struct of FreqCacheEmbedding (#1448)
2 years ago
Frank Lee 5a52e21fe3
[test] fixed the activation codegen test (#1447)
2 years ago
YuliangLiu0306 0f3042363c
[tensor] shape consistency generate transform path and communication cost (#1435)
2 years ago
Boyuan Yao 5774fe0270
[fx] Use colossalai checkpoint and add offload recognition in codegen (#1439)
2 years ago
Kirigaya Kazuto e9460b45c8
[engin/schedule] use p2p_v2 to recontruct pipeline_schedule (#1408)
2 years ago
Frank Lee ae1b58cd16
[tensor] added linear implementation for the new sharding spec (#1416)
2 years ago
Super Daniel d40a9392ba
[fx] fix the false interpretation of algorithm 3 in https://arxiv.org/abs/1604.06174. (#1446)
2 years ago
ver217 821c6172e2
[utils] Impl clip_grad_norm for ColoTensor and ZeroOptimizer (#1442)
2 years ago
Frank Lee 74bee5f7e8
[release] update version.txt (#1444)
2 years ago
HELSON b80340168e
[zero] add chunk_managerV2 for all-gather chunk (#1441)
2 years ago
Super Daniel 3b26516c69
[fx] add vanilla activation checkpoint search with test on resnet and densenet (#1433)
2 years ago
Jiarui Fang 30b4dd17c0
[FAW] export FAW in _ops (#1438)
2 years ago
HELSON 9056677b13
[zero] add chunk size searching algorithm for parameters in different groups (#1436)
2 years ago
Jiarui Fang c9427a323f
hotfix #1434 (#1437)
2 years ago
HELSON 039b7ed3bc
[polish] add update directory in gemini; rename AgChunk to ChunkV2 (#1432)
2 years ago
Super Daniel f20cb4e893
[fx] modify the calculation of node_size in MetaInfoProp for activation checkpointing usages (#1425)
2 years ago
Jiarui Fang 89c434a0a6
[polish] add test_ops directory (#1431)
2 years ago
Jiarui Fang 10b3df65c8
[FAW] move coloparam setting in test code. (#1429)
2 years ago
Jiarui Fang cb98cf5558
[FAW] parallel FreqAwareEmbedding (#1424)
2 years ago
HELSON 0d212183c4
[zero] add has_inf_or_nan in AgChunk; enhance the unit test of AgChunk (#1426)
2 years ago