Commit Graph

36 Commits (d8ceeac14e54c5c568e916c061b86d9a53a54f30)

Author SHA1 Message Date
Hongxin Liu 27061426f7
[gemini] improve compatibility and add static placement policy (#4479)
* [gemini] remove distributed-related part from colotensor (#4379)

* [gemini] remove process group dependency

* [gemini] remove tp part from colo tensor

* [gemini] patch inplace op

* [gemini] fix param op hook and update tests

* [test] remove useless tests

* [test] remove useless tests

* [misc] fix requirements

* [test] fix model zoo

* [test] fix model zoo

* [test] fix model zoo

* [test] fix model zoo

* [test] fix model zoo

* [misc] update requirements

* [gemini] refactor gemini optimizer and gemini ddp (#4398)

* [gemini] update optimizer interface

* [gemini] renaming gemini optimizer

* [gemini] refactor gemini ddp class

* [example] update gemini related example

* [example] update gemini related example

* [plugin] fix gemini plugin args

* [test] update gemini ckpt tests

* [gemini] fix checkpoint io

* [example] fix opt example requirements

* [example] fix opt example

* [example] fix opt example

* [example] fix opt example

* [gemini] add static placement policy (#4443)

* [gemini] add static placement policy

* [gemini] fix param offload

* [test] update gemini tests

* [plugin] update gemini plugin

* [plugin] update gemini plugin docstr

* [misc] fix flash attn requirement

* [test] fix gemini checkpoint io test

* [example] update resnet example result (#4457)

* [example] update bert example result (#4458)

* [doc] update gemini doc (#4468)

* [example] update gemini related examples (#4473)

* [example] update gpt example

* [example] update dreambooth example

* [example] update vit

* [example] update opt

* [example] update palm

* [example] update vit and opt benchmark

* [hotfix] fix bert in model zoo (#4480)

* [hotfix] fix bert in model zoo

* [test] remove chatglm gemini test

* [test] remove sam gemini test

* [test] remove vit gemini test

* [hotfix] fix opt tutorial example (#4497)

* [hotfix] fix opt tutorial example

* [hotfix] fix opt tutorial example
2023-08-24 09:29:25 +08:00
Liu Ziming e277534a18
Merge pull request #3905 from MaruyamaAya/dreambooth
[example] Adding an example of training dreambooth with the new booster API
2023-06-09 08:44:18 +08:00
digger yu 33eef714db
fix typo examples and docs (#3932) 2023-06-08 16:09:32 +08:00
Maruyama_Aya 9b5e7ce21f modify shell for check 2023-06-08 14:56:56 +08:00
Maruyama_Aya 730a092ba2 modify shell for check 2023-06-08 13:38:18 +08:00
Maruyama_Aya 49567d56d1 modify shell for check 2023-06-08 13:36:05 +08:00
Maruyama_Aya 039854b391 modify shell for check 2023-06-08 13:17:58 +08:00
Maruyama_Aya cf4792c975 modify shell for check 2023-06-08 11:15:10 +08:00
Maruyama_Aya c94a33579b modify shell for check 2023-06-07 17:23:01 +08:00
Maruyama_Aya 4fc8bc68ac modify file path 2023-06-07 11:02:19 +08:00
Maruyama_Aya b4437e88c3 fixed port 2023-06-06 16:21:38 +08:00
Maruyama_Aya 79c9f776a9 fixed port 2023-06-06 16:20:45 +08:00
Maruyama_Aya d3379f0be7 fixed model saving bugs 2023-06-06 16:07:34 +08:00
Maruyama_Aya b29e1f0722 change directory 2023-06-06 15:50:03 +08:00
digger yu 518b31c059
[docs] change placememt_policy to placement_policy (#3829)
* fix typo colossalai/autochunk auto_parallel amp

* fix typo colossalai/auto_parallel nn utils etc.

* fix typo colossalai/auto_parallel autochunk fx/passes  etc.

* fix typo docs/

* change placememt_policy to placement_policy in docs/ and examples/
2023-05-24 14:51:49 +08:00
digger-yu b9a8dff7e5
[doc] Fix typo under colossalai and doc(#3618)
* Fixed several spelling errors under colossalai

* Fix the spelling error in colossalai and docs directory

* Cautious Changed the spelling error under the example folder

* Update runtime_preparation_pass.py

revert autograft to autograd

* Update search_chunk.py

utile to until

* Update check_installation.py

change misteach to mismatch in line 91

* Update 1D_tensor_parallel.md

revert to perceptron

* Update 2D_tensor_parallel.md

revert to perceptron in line 73

* Update 2p5D_tensor_parallel.md

revert to perceptron in line 71

* Update 3D_tensor_parallel.md

revert to perceptron in line 80

* Update README.md

revert to resnet in line 42

* Update reorder_graph.py

revert to indice in line 7

* Update p2p.py

revert to megatron in line 94

* Update initialize.py

revert to torchrun in line 198

* Update routers.py

change to detailed in line 63

* Update routers.py

change to detailed in line 146

* Update README.md

revert  random number in line 402
2023-04-26 11:38:43 +08:00
ver217 26b7aac0be
[zero] reorganize zero/gemini folder structure (#3424)
* [zero] refactor low-level zero folder structure

* [zero] fix legacy zero import path

* [zero] fix legacy zero import path

* [zero] remove useless import

* [zero] refactor gemini folder structure

* [zero] refactor gemini folder structure

* [zero] refactor legacy zero import path

* [zero] refactor gemini folder structure

* [zero] refactor gemini folder structure

* [zero] refactor gemini folder structure

* [zero] refactor legacy zero import path

* [zero] fix test import path

* [zero] fix test

* [zero] fix circular import

* [zero] update import
2023-04-04 13:48:16 +08:00
NatalieC323 e5f668f280
[dreambooth] fixing the incompatibity in requirements.txt (#3190)
* Update requirements.txt

* Update environment.yaml

* Update README.md

* Update environment.yaml

* Update README.md

* Update README.md

* Delete requirements_colossalai.txt

* Update requirements.txt

* Update README.md
2023-03-21 16:01:13 +08:00
binmakeswell 3c01280a56
[doc] add community contribution guide (#3153)
* [doc] update contribution guide

* [doc] update contribution guide

* [doc] add community contribution guide
2023-03-17 11:07:24 +08:00
Haofan Wang 47ecb22387
[example] add LoRA support (#2821)
* add lora

* format
2023-02-20 16:23:12 +08:00
Fazzie-Maqianli ba84cd80b2
fix pip install colossal (#2764) 2023-02-17 09:54:21 +08:00
Fazzie-Maqianli 292c81ed7c
fix/transformer-verison (#2581) 2023-02-08 13:50:27 +08:00
jiaruifang 32390cbe8f add test_ci.sh to dreambooth 2023-01-19 09:46:28 +08:00
jiaruifang 025b482dc1 [example] dreambooth example 2023-01-18 18:42:56 +08:00
Haofan Wang cfd1d5ee49
[example] fixed seed error in train_dreambooth_colossalai.py (#2445) 2023-01-11 16:56:15 +08:00
jiaruifang b2e0d502b8 [doc] hotfix #2377 2023-01-07 19:44:50 +08:00
HELSON 48d33b1b17
[gemini] add get static torch model (#2356) 2023-01-06 13:41:19 +08:00
Fazzie-Maqianli 7a332b1734
Merge pull request #2338 from haofanwang/patch-1
Fix a typo in train_dreambooth_colossalai.py
2023-01-06 11:50:18 +08:00
binmakeswell d7352bef2c
[example] add example requirement (#2345) 2023-01-06 09:03:29 +08:00
Haofan Wang 7ce965c7cc
Update requirement_colossalai.txt (#2348) 2023-01-05 21:16:42 +08:00
Haofan Wang 9edd0aa75e
Update train_dreambooth_colossalai.py
accelerator.num_processes -> gpc.get_world_size(ParallelMode.DATA)
2023-01-05 15:49:57 +08:00
Fazzie-Maqianli 89f26331e9
[example] diffusion update diffusion,Dreamblooth (#2329) 2023-01-05 11:23:26 +08:00
Fazzie-Maqianli a9b27b9265
[exmaple] fix dreamblooth format (#2315) 2023-01-04 16:20:00 +08:00
BlueRum 1405b4381e
[example] fix save_load bug for dreambooth (#2280) 2023-01-03 17:13:29 +08:00
Fazzie-Maqianli 89f048a88a
[example] clear diffuser image (#2262) 2023-01-03 10:57:02 +08:00
Fazzie-Maqianli ce3c4eca7b
[example] support Dreamblooth (#2188) 2022-12-23 16:47:30 +08:00