Hongxin Liu
7f8b16635b
[misc] refactor launch API and tensor constructor ( #5666 )
...
* [misc] remove config arg from initialize
* [misc] remove old tensor contrusctor
* [plugin] add npu support for ddp
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* [devops] fix doc test ci
* [test] fix test launch
* [doc] update launch doc
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
7 months ago
Hongxin Liu
d202cc28c0
[npu] change device to accelerator api ( #5239 )
...
* update accelerator
* fix timer
* fix amp
* update
* fix
* update bug
* add error raise
* fix autocast
* fix set device
* remove doc accelerator
* update doc
* update doc
* update doc
* use nullcontext
* update cpu
* update null context
* change time limit for example
* udpate
* update
* update
* update
* [npu] polish accelerator code
---------
Co-authored-by: Xuanlei Zhao <xuanlei.zhao@gmail.com>
Co-authored-by: zxl <43881818+oahzxl@users.noreply.github.com>
11 months ago
binmakeswell
822051d888
[doc] update slack link ( #4823 )
1 year ago
Hongxin Liu
079bf3cb26
[misc] update pre-commit and run all files ( #4752 )
...
* [misc] update pre-commit
* [misc] run pre-commit
* [misc] remove useless configuration files
* [misc] ignore cuda for clang-format
1 year ago
Hongxin Liu
b5f9e37c70
[legacy] clean up legacy code ( #4743 )
...
* [legacy] remove outdated codes of pipeline (#4692 )
* [legacy] remove cli of benchmark and update optim (#4690 )
* [legacy] remove cli of benchmark and update optim
* [doc] fix cli doc test
* [legacy] fix engine clip grad norm
* [legacy] remove outdated colo tensor (#4694 )
* [legacy] remove outdated colo tensor
* [test] fix test import
* [legacy] move outdated zero to legacy (#4696 )
* [legacy] clean up utils (#4700 )
* [legacy] clean up utils
* [example] update examples
* [legacy] clean up amp
* [legacy] fix amp module
* [legacy] clean up gpc (#4742 )
* [legacy] clean up context
* [legacy] clean core, constants and global vars
* [legacy] refactor initialize
* [example] fix examples ci
* [example] fix examples ci
* [legacy] fix tests
* [example] fix gpt example
* [example] fix examples ci
* [devops] fix ci installation
* [example] fix examples ci
1 year ago
Hongxin Liu
27061426f7
[gemini] improve compatibility and add static placement policy ( #4479 )
...
* [gemini] remove distributed-related part from colotensor (#4379 )
* [gemini] remove process group dependency
* [gemini] remove tp part from colo tensor
* [gemini] patch inplace op
* [gemini] fix param op hook and update tests
* [test] remove useless tests
* [test] remove useless tests
* [misc] fix requirements
* [test] fix model zoo
* [test] fix model zoo
* [test] fix model zoo
* [test] fix model zoo
* [test] fix model zoo
* [misc] update requirements
* [gemini] refactor gemini optimizer and gemini ddp (#4398 )
* [gemini] update optimizer interface
* [gemini] renaming gemini optimizer
* [gemini] refactor gemini ddp class
* [example] update gemini related example
* [example] update gemini related example
* [plugin] fix gemini plugin args
* [test] update gemini ckpt tests
* [gemini] fix checkpoint io
* [example] fix opt example requirements
* [example] fix opt example
* [example] fix opt example
* [example] fix opt example
* [gemini] add static placement policy (#4443 )
* [gemini] add static placement policy
* [gemini] fix param offload
* [test] update gemini tests
* [plugin] update gemini plugin
* [plugin] update gemini plugin docstr
* [misc] fix flash attn requirement
* [test] fix gemini checkpoint io test
* [example] update resnet example result (#4457 )
* [example] update bert example result (#4458 )
* [doc] update gemini doc (#4468 )
* [example] update gemini related examples (#4473 )
* [example] update gpt example
* [example] update dreambooth example
* [example] update vit
* [example] update opt
* [example] update palm
* [example] update vit and opt benchmark
* [hotfix] fix bert in model zoo (#4480 )
* [hotfix] fix bert in model zoo
* [test] remove chatglm gemini test
* [test] remove sam gemini test
* [test] remove vit gemini test
* [hotfix] fix opt tutorial example (#4497 )
* [hotfix] fix opt tutorial example
* [hotfix] fix opt tutorial example
1 year ago
digger yu
33eef714db
fix typo examples and docs ( #3932 )
1 year ago
Maruyama_Aya
9b5e7ce21f
modify shell for check
1 year ago
Maruyama_Aya
730a092ba2
modify shell for check
1 year ago
Maruyama_Aya
49567d56d1
modify shell for check
1 year ago
Maruyama_Aya
039854b391
modify shell for check
1 year ago
Maruyama_Aya
cf4792c975
modify shell for check
1 year ago
Maruyama_Aya
c94a33579b
modify shell for check
1 year ago
Maruyama_Aya
4fc8bc68ac
modify file path
1 year ago
Maruyama_Aya
b4437e88c3
fixed port
1 year ago
Maruyama_Aya
79c9f776a9
fixed port
1 year ago
Maruyama_Aya
d3379f0be7
fixed model saving bugs
1 year ago
Maruyama_Aya
b29e1f0722
change directory
1 year ago
digger yu
518b31c059
[docs] change placememt_policy to placement_policy ( #3829 )
...
* fix typo colossalai/autochunk auto_parallel amp
* fix typo colossalai/auto_parallel nn utils etc.
* fix typo colossalai/auto_parallel autochunk fx/passes etc.
* fix typo docs/
* change placememt_policy to placement_policy in docs/ and examples/
2 years ago
digger-yu
b9a8dff7e5
[doc] Fix typo under colossalai and doc( #3618 )
...
* Fixed several spelling errors under colossalai
* Fix the spelling error in colossalai and docs directory
* Cautious Changed the spelling error under the example folder
* Update runtime_preparation_pass.py
revert autograft to autograd
* Update search_chunk.py
utile to until
* Update check_installation.py
change misteach to mismatch in line 91
* Update 1D_tensor_parallel.md
revert to perceptron
* Update 2D_tensor_parallel.md
revert to perceptron in line 73
* Update 2p5D_tensor_parallel.md
revert to perceptron in line 71
* Update 3D_tensor_parallel.md
revert to perceptron in line 80
* Update README.md
revert to resnet in line 42
* Update reorder_graph.py
revert to indice in line 7
* Update p2p.py
revert to megatron in line 94
* Update initialize.py
revert to torchrun in line 198
* Update routers.py
change to detailed in line 63
* Update routers.py
change to detailed in line 146
* Update README.md
revert random number in line 402
2 years ago
ver217
26b7aac0be
[zero] reorganize zero/gemini folder structure ( #3424 )
...
* [zero] refactor low-level zero folder structure
* [zero] fix legacy zero import path
* [zero] fix legacy zero import path
* [zero] remove useless import
* [zero] refactor gemini folder structure
* [zero] refactor gemini folder structure
* [zero] refactor legacy zero import path
* [zero] refactor gemini folder structure
* [zero] refactor gemini folder structure
* [zero] refactor gemini folder structure
* [zero] refactor legacy zero import path
* [zero] fix test import path
* [zero] fix test
* [zero] fix circular import
* [zero] update import
2 years ago
NatalieC323
e5f668f280
[dreambooth] fixing the incompatibity in requirements.txt ( #3190 )
...
* Update requirements.txt
* Update environment.yaml
* Update README.md
* Update environment.yaml
* Update README.md
* Update README.md
* Delete requirements_colossalai.txt
* Update requirements.txt
* Update README.md
2 years ago
binmakeswell
3c01280a56
[doc] add community contribution guide ( #3153 )
...
* [doc] update contribution guide
* [doc] update contribution guide
* [doc] add community contribution guide
2 years ago
Haofan Wang
47ecb22387
[example] add LoRA support ( #2821 )
...
* add lora
* format
2 years ago
Fazzie-Maqianli
ba84cd80b2
fix pip install colossal ( #2764 )
2 years ago
Fazzie-Maqianli
292c81ed7c
fix/transformer-verison ( #2581 )
2 years ago
jiaruifang
32390cbe8f
add test_ci.sh to dreambooth
2 years ago
jiaruifang
025b482dc1
[example] dreambooth example
2 years ago
Haofan Wang
cfd1d5ee49
[example] fixed seed error in train_dreambooth_colossalai.py ( #2445 )
2 years ago
jiaruifang
b2e0d502b8
[doc] hotfix #2377
2 years ago
HELSON
48d33b1b17
[gemini] add get static torch model ( #2356 )
2 years ago
binmakeswell
d7352bef2c
[example] add example requirement ( #2345 )
2 years ago
Haofan Wang
7ce965c7cc
Update requirement_colossalai.txt ( #2348 )
2 years ago
Haofan Wang
9edd0aa75e
Update train_dreambooth_colossalai.py
...
accelerator.num_processes -> gpc.get_world_size(ParallelMode.DATA)
2 years ago
Fazzie-Maqianli
89f26331e9
[example] diffusion update diffusion,Dreamblooth ( #2329 )
2 years ago
Fazzie-Maqianli
a9b27b9265
[exmaple] fix dreamblooth format ( #2315 )
2 years ago
BlueRum
1405b4381e
[example] fix save_load bug for dreambooth ( #2280 )
2 years ago
Fazzie-Maqianli
89f048a88a
[example] clear diffuser image ( #2262 )
2 years ago
Fazzie-Maqianli
ce3c4eca7b
[example] support Dreamblooth ( #2188 )
2 years ago