Commit Graph

3 Commits (ccdaf8ec45ad45dffb23e37a4cdf89d3b4842469)

Author SHA1 Message Date
zhanglei ccdaf8ec45
fix the moe_loss for ci and val
2023-09-22 15:45:36 +08:00
huangting4201 1ed36754df
feat(.github/workflows): update ci e2e tests and add ci unit tests (#324)
* feat(.github/workflows/e2e_test.yaml): update e2e yaml

* feat(.github/workflows/e2e_test.yaml): update e2e yaml

* test e2e

* test e2e

* test e2e

* test e2e

* test e2e

* fix(ci): test ci

* fix(ci): test ci

* fix(ci): test ci

* fix(ci): test ci

* fix(ci): test ci

* fix(ci): add weekly tests

---------

Co-authored-by: huangting4201 <huangting3@sensetime.com>
2023-09-22 14:07:14 +08:00
huangting4201 025ca55dfe
test(tests/test_training): add training e2e tests for loss spike and loss accuracy (#304)
* tests(test_training): add test case for loss accuracy

* tests(test_training): update test cases

* ci(.github/workflows/e2e_test.yaml): remove pull submodule

* ci(.github/workflows/e2e_test.yaml): update ci env and remove useless env var

* test(tests/test_training): add 16 GPUs test cases

* test(tests/test_training): fix training_16GPU_8DP2PP test case error

* test(tests/test_training): add new case for interleaved pp

* test(tests/test_training): remove redundant code

* test(tests/test_training): update ci job timeout minutes to 30m

* feat(initialize/launch.py): check num_chunks and interleaved_overlap

---------

Co-authored-by: huangting4201 <huangting3@sensetime.com>
2023-09-19 14:55:40 +08:00