* [fx] modify the calculation of node_size in MetaInfoProp for activation checkpointing usages
* [fx] merge development into main (#1)
* [fx] activation checkpointing using Chen strategies.
* [fx] add test for ckpt_solver_chen
* [fx] add vanilla activation checkpoint search with test on resnet and densenet
* [fx] add a namespace code for solver_chen.
* [fx] fix the false interpretation of algorithm 3 in https://arxiv.org/abs/1604.06174.
* [fx] fix lowercase naming conventions.
* [fx] simplify test for ckpt.
* [fx] add rules to linearize computation graphs for searching. (#2)
* [fx] fix test and algorithm bugs in activation checkpointing.
* [fx] polish ckpt_test.
* [fx] add rules to linearize computation graphs for searching.
* [fx] remove chen_sqrt for sake of simplicity
* [fx] fix inconsistencies.
* [fx] fix MetaInfoProp.
* [fx] consider MetaInfoProp for inplace operands.
* [fx] add profiler for fx nodes.
* [fx] fix error in tests.
* [fx] unfix bug.
* [utils] Add use_reentrant=False into colossalai checkpoint
* [utils] add some annotations in utils.activation_checkpoint
* [test] add reset_seed at the beginning of tests in test_activation_checkpointing.py
* [test] modify test_activation_checkpoint.py
* [test] modify test for reentrant=False
* [fx] Add use_reentrant=False of checkpoint into codegen
* [fx] Use colossalai.utils.checkpoint to replace torch.utils.checkpoint for offload activation and add offload annotation recognition in codegen
* [fx] modify test and add TODO in codegen
* [fx] modify colossal ckpt usage
* [fx] add gpc.destroy() to test_codegen
* [CLI] add CLI launcher
* Revert "[CLI] add CLI launcher"
This reverts commit df7e6506d4.
* manipulation
* [fx] add graph manipulation methods.
* [fx] methods to get fx graph property.
* add unit test
* add docstring to explain top node and leaf node in this context
* init a checkpoint dir
* [checkpoint] support resume for cosinewarmuplr
* [checkpoint] add unit test
* fix some bugs but still not OK
* fix bugs
* make it faster
* [checkpoint] support generalized scheduler
* polish
* [tensor] torch function return colotensor
* polish
* fix bugs
* remove debug info
* polish
* [tensor] test_model pass unittests
* polish
* [hotfix] fx get comm size bug
Co-authored-by: ZhaoYi1222 <zhaoyi9499@gmail.com>