Commit Graph

20 Commits (d8ceeac14e54c5c568e916c061b86d9a53a54f30)

Author SHA1 Message Date
flybird11111 7486ed7d3a
[shardformer] update llama2/opt finetune example and fix llama2 policy (#4645)
* [shardformer] update shardformer readme

[shardformer] update shardformer readme

[shardformer] update shardformer readme

* [shardformer] update llama2/opt finetune example and shardformer update to llama2

* [shardformer] update llama2/opt finetune example and shardformer update to llama2

* [shardformer] update llama2/opt finetune example and shardformer update to llama2

* [shardformer] change dataset

* [shardformer] change dataset

* [shardformer] fix CI

* [shardformer] fix

* [shardformer] fix

* [shardformer] fix

* [shardformer] fix

* [shardformer] fix

[example] update opt example

[example] resolve comments

fix

fix
2023-09-09 22:45:36 +08:00
Hongxin Liu 27061426f7
[gemini] improve compatibility and add static placement policy (#4479)
* [gemini] remove distributed-related part from colotensor (#4379)

* [gemini] remove process group dependency

* [gemini] remove tp part from colo tensor

* [gemini] patch inplace op

* [gemini] fix param op hook and update tests

* [test] remove useless tests

* [test] remove useless tests

* [misc] fix requirements

* [test] fix model zoo

* [test] fix model zoo

* [test] fix model zoo

* [test] fix model zoo

* [test] fix model zoo

* [misc] update requirements

* [gemini] refactor gemini optimizer and gemini ddp (#4398)

* [gemini] update optimizer interface

* [gemini] renaming gemini optimizer

* [gemini] refactor gemini ddp class

* [example] update gemini related example

* [example] update gemini related example

* [plugin] fix gemini plugin args

* [test] update gemini ckpt tests

* [gemini] fix checkpoint io

* [example] fix opt example requirements

* [example] fix opt example

* [example] fix opt example

* [example] fix opt example

* [gemini] add static placement policy (#4443)

* [gemini] add static placement policy

* [gemini] fix param offload

* [test] update gemini tests

* [plugin] update gemini plugin

* [plugin] update gemini plugin docstr

* [misc] fix flash attn requirement

* [test] fix gemini checkpoint io test

* [example] update resnet example result (#4457)

* [example] update bert example result (#4458)

* [doc] update gemini doc (#4468)

* [example] update gemini related examples (#4473)

* [example] update gpt example

* [example] update dreambooth example

* [example] update vit

* [example] update opt

* [example] update palm

* [example] update vit and opt benchmark

* [hotfix] fix bert in model zoo (#4480)

* [hotfix] fix bert in model zoo

* [test] remove chatglm gemini test

* [test] remove sam gemini test

* [test] remove vit gemini test

* [hotfix] fix opt tutorial example (#4497)

* [hotfix] fix opt tutorial example

* [hotfix] fix opt tutorial example
2023-08-24 09:29:25 +08:00
Baizhou Zhang b3ab7fbabf
[example] update ViT example using booster api (#3940) 2023-06-12 15:02:27 +08:00
digger yu 33eef714db
fix typo examples and docs (#3932) 2023-06-08 16:09:32 +08:00
Baizhou Zhang e417dd004e
[example] update opt example using booster api (#3918) 2023-06-08 11:27:05 +08:00
digger-yu b9a8dff7e5
[doc] Fix typo under colossalai and doc(#3618)
* Fixed several spelling errors under colossalai

* Fix the spelling error in colossalai and docs directory

* Cautious Changed the spelling error under the example folder

* Update runtime_preparation_pass.py

revert autograft to autograd

* Update search_chunk.py

utile to until

* Update check_installation.py

change misteach to mismatch in line 91

* Update 1D_tensor_parallel.md

revert to perceptron

* Update 2D_tensor_parallel.md

revert to perceptron in line 73

* Update 2p5D_tensor_parallel.md

revert to perceptron in line 71

* Update 3D_tensor_parallel.md

revert to perceptron in line 80

* Update README.md

revert to resnet in line 42

* Update reorder_graph.py

revert to indice in line 7

* Update p2p.py

revert to megatron in line 94

* Update initialize.py

revert to torchrun in line 198

* Update routers.py

change to detailed in line 63

* Update routers.py

change to detailed in line 146

* Update README.md

revert  random number in line 402
2023-04-26 11:38:43 +08:00
ver217 26b7aac0be
[zero] reorganize zero/gemini folder structure (#3424)
* [zero] refactor low-level zero folder structure

* [zero] fix legacy zero import path

* [zero] fix legacy zero import path

* [zero] remove useless import

* [zero] refactor gemini folder structure

* [zero] refactor gemini folder structure

* [zero] refactor legacy zero import path

* [zero] refactor gemini folder structure

* [zero] refactor gemini folder structure

* [zero] refactor gemini folder structure

* [zero] refactor legacy zero import path

* [zero] fix test import path

* [zero] fix test

* [zero] fix circular import

* [zero] update import
2023-04-04 13:48:16 +08:00
binmakeswell 360674283d
[example] fix redundant note (#3065) 2023-03-09 10:59:28 +08:00
Tomek af3888481d
[example] fixed opt model downloading from huggingface 2023-03-09 10:47:41 +08:00
ramos 2ef855c798
support shardinit option to avoid OPT OOM initializing problem (#3037)
Co-authored-by: poe <poe@nemoramo>
2023-03-08 13:45:15 +08:00
Alex_996 a4fc125c34
Fix typos (#2863)
Fix typos, `6.7 -> 6.7b`
2023-02-22 10:59:48 +08:00
binmakeswell fcc6d61d92
[example] fix requirements (#2488) 2023-01-17 13:07:25 +08:00
Jiarui Fang 7c31706227
[CI] add test_ci.sh for palm, opt and gpt (#2475) 2023-01-16 14:44:29 +08:00
Jiarui Fang 35e22be2f6
[example] simplify opt example (#2344) 2023-01-06 10:08:41 +08:00
Jiarui Fang f7e276fa71
[Gemini] add GeminiAdamOptimizer (#1960) 2022-11-16 14:44:28 +08:00
Jiarui Fang a25f755331
[example] add TP to GPT example (#1828) 2022-11-08 17:17:19 +08:00
Jiarui Fang b1263d32ba
[example] simplify the GPT2 huggingface example (#1826) 2022-11-08 16:14:07 +08:00
Jiarui Fang cd5a0d56fa
[Gemini] make gemini usage simple (#1821) 2022-11-08 15:53:13 +08:00
Jiarui Fang 350ccc0481
[example] opt does not depend on Titans (#1811) 2022-11-08 12:02:20 +08:00
Jiarui Fang fd2c8d8156
[example] add opt model in lauguage (#1809) 2022-11-08 10:39:13 +08:00