Commit Graph

37 Commits (a04337bfc30a81f3d6bc687d933edb92a124228c)

Author SHA1 Message Date
Hongxin Liu 079bf3cb26
[misc] update pre-commit and run all files (#4752)
* [misc] update pre-commit

* [misc] run pre-commit

* [misc] remove useless configuration files

* [misc] ignore cuda for clang-format
2023-09-19 14:20:26 +08:00
Hongxin Liu b5f9e37c70
[legacy] clean up legacy code (#4743)
* [legacy] remove outdated codes of pipeline (#4692)

* [legacy] remove cli of benchmark and update optim (#4690)

* [legacy] remove cli of benchmark and update optim

* [doc] fix cli doc test

* [legacy] fix engine clip grad norm

* [legacy] remove outdated colo tensor (#4694)

* [legacy] remove outdated colo tensor

* [test] fix test import

* [legacy] move outdated zero to legacy (#4696)

* [legacy] clean up utils (#4700)

* [legacy] clean up utils

* [example] update examples

* [legacy] clean up amp

* [legacy] fix amp module

* [legacy] clean up gpc (#4742)

* [legacy] clean up context

* [legacy] clean core, constants and global vars

* [legacy] refactor initialize

* [example] fix examples ci

* [example] fix examples ci

* [legacy] fix tests

* [example] fix gpt example

* [example] fix examples ci

* [devops] fix ci installation

* [example] fix examples ci
2023-09-18 16:31:06 +08:00
Hongxin Liu 554aa9592e
[legacy] move communication and nn to legacy and refactor logger (#4671)
* [legacy] move communication to legacy (#4640)

* [legacy] refactor logger and clean up legacy codes (#4654)

* [legacy] make logger independent to gpc

* [legacy] make optim independent to registry

* [legacy] move test engine to legacy

* [legacy] move nn to legacy (#4656)

* [legacy] move nn to legacy

* [checkpointio] fix save hf config

* [test] remove useledd rpc pp test

* [legacy] fix nn init

* [example] skip tutorial hybriad parallel example

* [devops] test doc check

* [devops] test doc check
2023-09-11 16:24:28 +08:00
digger-yu 1f73609adb
[CI] fix typo with tests/ etc. (#3727)
* fix spelling error with examples/comminity/

* fix spelling error with tests/

* fix some spelling error with tests/ colossalai/ etc.

* fix spelling error with tests/ etc. date:2023.5.10
2023-05-11 16:30:58 +08:00
digger-yu b7141c36dd
[CI] fix some spelling errors (#3707)
* fix spelling error with examples/comminity/

* fix spelling error with tests/

* fix some spelling error with tests/ colossalai/ etc.
2023-05-10 17:12:03 +08:00
Hongxin Liu 50793b35f4
[gemini] accelerate inference (#3641)
* [gemini] support don't scatter after inference

* [chat] update colossalai strategy

* [chat] fix opt benchmark

* [chat] update opt benchmark

* [gemini] optimize inference

* [test] add gemini inference test

* [chat] fix unit test ci

* [chat] fix ci

* [chat] fix ci

* [chat] skip checkpoint test
2023-04-26 16:32:40 +08:00
HELSON a3100bd50d
[testing] add beit model for unit testings (#2196)
* [testing] add beit model

* [beit] fix bugs

* [beit] fix bugs

* [testing] fix bugs
2022-12-26 17:35:36 +08:00
Jiarui Fang deee317b0f
[Gemini] test step-tensor mapping using repeated_computed_layers.py (#2127) 2022-12-13 16:34:10 +08:00
Jiarui Fang 40b7d55bf3
[Gemini] add albert in test models. (#2075) 2022-12-05 14:09:34 +08:00
Jiarui Fang 616ed91ecd
[test] bert test in non-distributed way (#2074) 2022-12-05 13:32:16 +08:00
Jiarui Fang 1e885329f4
[test] align model name with the file name. (#2045) 2022-11-30 15:45:26 +08:00
HELSON 17a3c685b0
[zero] fix unit-tests (#2039) 2022-11-30 10:40:31 +08:00
Jiarui Fang eb7742a4bb
[Gemini] more tests for Gemini (#2038)
* [Gemini] more tests for Gemini

* polish code
2022-11-29 17:13:10 +08:00
HELSON 537e181705
[testing] fix testing models (#2036)
* [testing] fix testing models

* roll back
2022-11-29 13:42:06 +08:00
Jiarui Fang 28aa9a4294
[Gemini] more rigorous unit tests for run_fwd_bwd (#2034) 2022-11-29 09:26:06 +08:00
Jiarui Fang 8daf1b4db1
[Gemini] patch for supporting orch.add_ function for ColoTensor (#2003) 2022-11-25 20:06:35 +08:00
Jiarui Fang 2e9cbfca12
[Gemini] add unitests to check gemini correctness (#2015) 2022-11-24 16:51:45 +08:00
Jiarui Fang 3d907faede
[Gemini] add an inline_op_module to common test models and polish unitests. (#2004) 2022-11-23 16:55:54 +08:00
アマデウス e615cfc3a8
[NFC] polish test component gpt code style (#1567) 2022-09-08 16:34:09 +08:00
Frank Lee 65ee6dcc20
[test] ignore 8 gpu test (#1080)
* [test] ignore 8 gpu test

* polish code

* polish workflow

* polish workflow
2022-06-08 23:14:18 +08:00
ver217 8e3d0ad8f1
[unit test] refactor test tensor (#1005)
* polish test_gpt

* update op unit tests

* update test model
2022-05-19 18:57:56 +08:00
Ziyue Jiang 75d221918a
[Tensor] add 1d vocab loss (#918)
* add 1d vocab loss

* polish
2022-05-07 15:49:14 +08:00
Ziyue Jiang dfaff4e243
[Tensor] fix test_model (#916)
* polish test_model

* polish
2022-05-06 18:06:22 +08:00
Ziyue Jiang f593a5637e
[Tensor] add embedding tp1d row (#904) 2022-04-29 14:10:05 +08:00
Jiarui Fang 1190b2c4a4
[tensor] add cross_entrophy_loss (#868) 2022-04-25 16:01:52 +08:00
Frank Lee 1258af71cc
[ci] cache cuda extension (#860) 2022-04-25 10:03:47 +08:00
Ziyue Jiang bcc8655021
[Tensor ] Add 1Drow weight reshard by spec (#854) 2022-04-24 18:30:20 +08:00
Jiarui Fang 62f059251b
[Tensor] init a tp network training unittest (#849) 2022-04-24 16:43:44 +08:00
HELSON 055fbf5be6
[zero] adapt zero for unsharded paramters (Optimizer part) (#601) 2022-04-01 20:10:47 +08:00
HELSON a30e2b4c24
[zero] adapt for no-leaf module in zero (#535)
only process module's own parameters in Zero context

add zero hooks for all modules that contrain parameters

gather parameters only belonging to module itself
2022-03-28 17:42:18 +08:00
Jiarui Fang 370f567e7d
[zero] new interface for ShardedOptimv2 (#406) 2022-03-14 20:48:41 +08:00
Frank Lee 526a318032 [unit test] Refactored test cases with component func (#339)
* refactored test with component func

* fixed bug
2022-03-11 15:50:28 +08:00
ver217 f5f0ad266e fix bert unit test 2022-03-11 15:50:28 +08:00
jiaruifang 4d94cd513e adapting bert unitest interface 2022-03-11 15:50:28 +08:00
jiaruifang 7977422aeb add bert for unitest and sharded model is not able to pass the bert case 2022-03-11 15:50:28 +08:00
Jiarui Fang 11bddb6e55 [zero] update zero context init with the updated test utils (#327) 2022-03-11 15:50:28 +08:00
Frank Lee 6268446b81 [test] refactored testing components (#324) 2022-03-11 15:50:28 +08:00