21 Commits (81ea66d25d9dc10fcd4d7331e7a2274e849f0909)

Author SHA1 Message Date
Kirigaya Kazuto e9460b45c8
[engin/schedule] use p2p_v2 to recontruct pipeline_schedule (#1408) 2 years ago
Jiarui Fang ff644ee5e4
polish unitest test with titans (#1152) 2 years ago
Frank Lee 53297330c0
[test] fixed hybrid parallel test case on 8 GPUs (#1106) 2 years ago
Frank Lee 2b2dc1c86b
[pipeline] refactor the pipeline module (#1087) 2 years ago
Frank Lee 50ec3a7e06
[test] skip tests when not enough GPUs are detected (#1090) 2 years ago
Frank Lee 65ee6dcc20
[test] ignore 8 gpu test (#1080) 2 years ago
YuliangLiu0306 9feff0f760
[titans]remove model zoo (#1042) 3 years ago
Jiarui Fang 681addb512
[refactor] moving grad acc logic to engine (#804) 3 years ago
Frank Lee 5a1a095b92
[test] refactored with the new rerun decorator (#763) 3 years ago
YuliangLiu0306 ade05a5d83
[refactor] pipeline, put runtime schedule into engine. (#627) 3 years ago
Frank Lee 3601b2bad0
[test] fixed rerun_on_exception and adapted test cases (#487) 3 years ago
Frank Lee bb2790cf0b
optimize engine and trainer test (#448) 3 years ago
Jiarui Fang 496cbb0760
[hotfix] fix initialize bug with zero (#442) 3 years ago
ver217 96780e6ee4
Optimize pipeline schedule (#94) 3 years ago
アマデウス 01a80cd86d
Hotfix/Colossalai layers (#92) 3 years ago
アマデウス 0fedef4f3c
Layer integration (#83) 3 years ago
ver217 8f02a88db2
add interleaved pipeline, fix naive amp and update pipeline model initializer (#80) 3 years ago
Frank Lee cd9c28e055
added CI for unit testing (#69) 3 years ago
Frank Lee da01c234e1
Develop/experiments (#59) 3 years ago
Frank Lee 3defa32aee
Support TP-compatible Torch AMP and Update trainer API (#27) 3 years ago
zbian 404ecbdcc6
Migrated project 3 years ago