82 Commits (c94a33579b7c70d96905ea8b2c3a4baf28451cb0)

Author SHA1 Message Date
jiangmingyan 5f79008c4a
[example] update gemini examples (#3868)
* [example]update gemini examples

* [example]update gemini examples
2023-05-30 18:41:41 +08:00
digger yu 518b31c059
[docs] change placememt_policy to placement_policy (#3829)
* fix typo colossalai/autochunk auto_parallel amp

* fix typo colossalai/auto_parallel nn utils etc.

* fix typo colossalai/auto_parallel autochunk fx/passes  etc.

* fix typo docs/

* change placememt_policy to placement_policy in docs/ and examples/
2023-05-24 14:51:49 +08:00
binmakeswell 15024e40d9
[auto] fix install cmd (#3772) 2023-05-18 13:33:01 +08:00
digger-yu b9a8dff7e5
[doc] Fix typo under colossalai and doc(#3618)
* Fixed several spelling errors under colossalai

* Fix the spelling error in colossalai and docs directory

* Cautious Changed the spelling error under the example folder

* Update runtime_preparation_pass.py

revert autograft to autograd

* Update search_chunk.py

utile to until

* Update check_installation.py

change misteach to mismatch in line 91

* Update 1D_tensor_parallel.md

revert to perceptron

* Update 2D_tensor_parallel.md

revert to perceptron in line 73

* Update 2p5D_tensor_parallel.md

revert to perceptron in line 71

* Update 3D_tensor_parallel.md

revert to perceptron in line 80

* Update README.md

revert to resnet in line 42

* Update reorder_graph.py

revert to indice in line 7

* Update p2p.py

revert to megatron in line 94

* Update initialize.py

revert to torchrun in line 198

* Update routers.py

change to detailed in line 63

* Update routers.py

change to detailed in line 146

* Update README.md

revert  random number in line 402
2023-04-26 11:38:43 +08:00
binmakeswell f1b3d60cae
[example] reorganize for community examples (#3557) 2023-04-14 16:27:48 +08:00
mandoxzhang 8f2c55f9c9
[example] remove redundant texts & update roberta (#3493)
* update roberta example

* update roberta example

* modify conflict & update roberta
2023-04-07 11:33:32 +08:00
mandoxzhang ab5fd127e3
[example] update roberta with newer ColossalAI (#3472)
* update roberta example

* update roberta example
2023-04-07 10:34:51 +08:00
Frank Lee 80eba05b0a
[test] refactor tests with spawn (#3452)
* [test] added spawn decorator

* polish code

* polish code

* polish code

* polish code

* polish code

* polish code
2023-04-06 14:51:35 +08:00
ver217 573af84184
[example] update examples related to zero/gemini (#3431)
* [zero] update legacy import

* [zero] update examples

* [example] fix opt tutorial

* [example] fix opt tutorial

* [example] fix opt tutorial

* [example] fix opt tutorial

* [example] fix import
2023-04-04 17:32:51 +08:00
ver217 26b7aac0be
[zero] reorganize zero/gemini folder structure (#3424)
* [zero] refactor low-level zero folder structure

* [zero] fix legacy zero import path

* [zero] fix legacy zero import path

* [zero] remove useless import

* [zero] refactor gemini folder structure

* [zero] refactor gemini folder structure

* [zero] refactor legacy zero import path

* [zero] refactor gemini folder structure

* [zero] refactor gemini folder structure

* [zero] refactor gemini folder structure

* [zero] refactor legacy zero import path

* [zero] fix test import path

* [zero] fix test

* [zero] fix circular import

* [zero] update import
2023-04-04 13:48:16 +08:00
Yan Fang 189347963a
[auto] fix requirements typo for issue #3125 (#3209) 2023-03-23 10:22:08 +08:00
Zihao 18dbe76cae
[auto-parallel] add auto-offload feature (#3154)
* add auto-offload feature

* polish code

* fix syn offload runtime pass bug

* add offload example

* fix offload testing bug

* fix example testing bug
2023-03-21 14:17:41 +08:00
binmakeswell 360674283d
[example] fix redundant note (#3065) 2023-03-09 10:59:28 +08:00
Tomek af3888481d
[example] fixed opt model downloading from huggingface 2023-03-09 10:47:41 +08:00
ramos 2ef855c798
support shardinit option to avoid OPT OOM initializing problem (#3037)
Co-authored-by: poe <poe@nemoramo>
2023-03-08 13:45:15 +08:00
Ziyue Jiang 400f63012e
[pipeline] Add Simplified Alpa DP Partition (#2507)
* add alpa dp split

* add alpa dp split

* use fwd+bwd instead of fwd only

---------

Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2023-03-07 10:34:31 +08:00
github-actions[bot] da056285f2
[format] applied code formatting on changed files in pull request 2922 (#2923)
Co-authored-by: github-actions <github-actions@github.com>
2023-02-27 19:29:06 +08:00
binmakeswell 12bafe057f
[doc] update installation for GPT (#2922) 2023-02-27 18:28:34 +08:00
Alex_996 a4fc125c34
Fix typos (#2863)
Fix typos, `6.7 -> 6.7b`
2023-02-22 10:59:48 +08:00
dawei-wang 55424a16a5
[doc] fix GPT tutorial (#2860)
Fix hpcaitech/ColossalAI#2851
2023-02-22 10:58:52 +08:00
Jiarui Fang bf0204604f
[exmaple] add bert and albert (#2824) 2023-02-20 10:35:55 +08:00
cloudhuang 43dffdaba5
[doc] fixed a typo in GPT readme (#2736) 2023-02-15 22:24:45 +08:00
Jiatong (Julius) Han a255a38f7f
[example] Polish README.md (#2658)
* [tutorial] polish readme.md

* [example] Update README.md
2023-02-09 20:43:55 +08:00
HELSON 6e0faa70e0
[gemini] add profiler in the demo (#2534) 2023-01-31 14:21:22 +08:00
HELSON 66dfcf5281
[gemini] update the gpt example (#2527) 2023-01-30 17:58:05 +08:00
HELSON 707b11d4a0
[gemini] update ddp strict mode (#2518)
* [zero] add strict ddp mode for chunk init

* [gemini] update gpt example
2023-01-28 14:35:25 +08:00
HELSON 2d1a7dfe5f
[zero] add strict ddp mode (#2508)
* [zero] add strict ddp mode

* [polish] add comments for strict ddp mode

* [zero] fix test error
2023-01-20 14:04:38 +08:00
Jiarui Fang e327e95144
[hotfix] gpt example titans bug #2493 (#2494) 2023-01-18 12:04:18 +08:00
binmakeswell fcc6d61d92
[example] fix requirements (#2488) 2023-01-17 13:07:25 +08:00
Jiarui Fang 3a21485ead
[example] titans for gpt (#2484) 2023-01-16 15:55:41 +08:00
Jiarui Fang 7c31706227
[CI] add test_ci.sh for palm, opt and gpt (#2475) 2023-01-16 14:44:29 +08:00
ver217 f525d1f528
[example] update gpt gemini example ci test (#2477) 2023-01-13 22:37:31 +08:00
Ziyue Jiang fef5c949c3
polish pp middleware (#2476)
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2023-01-13 16:56:01 +08:00
Jiarui Fang 867c8c2d3a
[zero] low level optim supports ProcessGroup (#2464) 2023-01-13 10:05:58 +08:00
YuliangLiu0306 2731531bc2
[autoparallel] integrate device mesh initialization into autoparallelize (#2393)
* [autoparallel] integrate device mesh initialization into autoparallelize

* add megatron solution

* update gpt autoparallel examples with latest api

* adapt beta value to fit the current computation cost
2023-01-11 14:03:49 +08:00
ZijianYY fe0f7970a2
[examples] adding tflops to PaLM (#2365) 2023-01-10 16:18:56 +08:00
HELSON d84e747975
[hotfix] add DISTPAN argument for benchmark (#2412)
* change the benchmark config file

* change config

* revert config file

* rename distpan to distplan
2023-01-10 11:39:25 +08:00
HELSON 498b5ca993
[hotfix] fix gpt gemini example (#2404)
* [hotfix] fix gpt gemini example

* [example] add new assertions
2023-01-09 15:52:17 +08:00
Jiarui Fang 12c8bf38d7
[Pipeline] Refine GPT PP Example 2023-01-06 18:03:45 +08:00
Ziyue Jiang ad00894f7f
polish 2023-01-06 16:03:16 +08:00
Jiarui Fang 1aaeb596c6
[example] gpt, shard init on all processes (#2366) 2023-01-06 15:44:50 +08:00
Ziyue Jiang 3a15b20421
Move GPT PP Example 2023-01-06 14:48:58 +08:00
YuliangLiu0306 8b1e0dfd80
[example] upload auto parallel gpt2 demo (#2354) 2023-01-06 11:38:38 +08:00
Jiarui Fang 00a9c781fd
[example] add google doc for benchmark results of GPT (#2355) 2023-01-06 11:38:15 +08:00
Jiarui Fang 509a87f3ff
[example] make gpt example directory more clear (#2353) 2023-01-06 11:11:26 +08:00
Ikko Eltociear Ashimine 5e4bced0a3
[NFC] Update roberta/README.md (#2350) 2023-01-06 10:09:14 +08:00
Jiarui Fang 35e22be2f6
[example] simplify opt example (#2344) 2023-01-06 10:08:41 +08:00
ziyuhuang123 7080a8edb0
[workflow]New version: Create workflow files for examples' auto check (#2298)
* [workflows]bug_repair

* [workflow]new_pr_fixing_bugs

Co-authored-by: binmakeswell <binmakeswell@gmail.com>
2023-01-06 09:26:49 +08:00
binmakeswell d7352bef2c
[example] add example requirement (#2345) 2023-01-06 09:03:29 +08:00
ZijianYY f7fd592bf4
[examples]adding tp to PaLM (#2319) 2023-01-05 17:57:50 +08:00