ColossalAI/colossalai
YuliangLiu0306 0908d0fc61
[autoparallel]add backward cost info into strategies (#1524)
2 years ago
amp [doc] update rst and docstring (#1351) 2 years ago
auto_parallel [autoparallel]add backward cost info into strategies (#1524) 2 years ago
builder [NFC] polish colossalai/builder/builder.py code style (#1265) 2 years ago
cli [hotfix] fix some bugs caused by size mismatch. (#1011) 3 years ago
communication [communication] add p2p_v2.py to support communication with List[Any] (#1407) 2 years ago
context [doc] update rst and docstring (#1351) 2 years ago
device [tensor]add 1D device mesh (#1492) 2 years ago
engine [engin/schedule] use p2p_v2 to reconstruct pipeline_schedule (#1408) 2 years ago
fx [hotfix] change namespace for meta_trace. (#1541) 2 years ago
gemini [zero] add chunk_managerV2 for all-gather chunk (#1441) 2 years ago
kernel [hotfix] fix CPUAdam kernel nullptr (#1410) 2 years ago
logging [doc] improved docstring in the logging module (#861) 3 years ago
nn [utils] refactor parallel layers checkpoint and bcast model on loading checkpoint (#1548) 2 years ago
pipeline [pipeline/pipleline_process_group] finish PipelineProcessGroup to manage local and global rank in TP, DP and PP (#1508) 2 years ago
registry Remove duplication registry (#1078) 3 years ago
tensor [autoparallel] change the merge node logic (#1533) 2 years ago
testing [test] skip tests when not enough GPUs are detected (#1090) 3 years ago
trainer fix issue #1080 (#1071) 3 years ago
utils [utils] refactor parallel layers checkpoint and bcast model on loading checkpoint (#1548) 2 years ago
zero [utils] Impl clip_grad_norm for ColoTensor and ZeroOptimizer (#1442) 2 years ago
__init__.py [fx] support meta tracing for aten level computation graphs like functorch. (#1536) 2 years ago
_meta_registrations.py [fx] support meta tracing for aten level computation graphs like functorch. (#1536) 2 years ago
constants.py fix typo in constants (#1027) 3 years ago
core.py [Tensor] distributed view supports inter-process hybrid parallel (#1169) 2 years ago
global_variables.py [MOE] add unit test for MOE experts layout, gradient handler and kernel (#469) 3 years ago
initialize.py [hotfix] remove potential circular import (#1307) 2 years ago