Hongxin Liu
|
d202cc28c0
|
[npu] change device to accelerator api (#5239)
* update accelerator
* fix timer
* fix amp
* update
* fix
* update bug
* add error raise
* fix autocast
* fix set device
* remove doc accelerator
* update doc
* update doc
* update doc
* use nullcontext
* update cpu
* update null context
* change time limit for example
* update
* update
* update
* update
* [npu] polish accelerator code
---------
Co-authored-by: Xuanlei Zhao <xuanlei.zhao@gmail.com>
Co-authored-by: zxl <43881818+oahzxl@users.noreply.github.com>
|
11 months ago |
Hongxin Liu
|
079bf3cb26
|
[misc] update pre-commit and run all files (#4752)
* [misc] update pre-commit
* [misc] run pre-commit
* [misc] remove useless configuration files
* [misc] ignore cuda for clang-format
|
1 year ago |
Hongxin Liu
|
b5f9e37c70
|
[legacy] clean up legacy code (#4743)
* [legacy] remove outdated codes of pipeline (#4692)
* [legacy] remove cli of benchmark and update optim (#4690)
* [legacy] remove cli of benchmark and update optim
* [doc] fix cli doc test
* [legacy] fix engine clip grad norm
* [legacy] remove outdated colo tensor (#4694)
* [legacy] remove outdated colo tensor
* [test] fix test import
* [legacy] move outdated zero to legacy (#4696)
* [legacy] clean up utils (#4700)
* [legacy] clean up utils
* [example] update examples
* [legacy] clean up amp
* [legacy] fix amp module
* [legacy] clean up gpc (#4742)
* [legacy] clean up context
* [legacy] clean core, constants and global vars
* [legacy] refactor initialize
* [example] fix examples ci
* [example] fix examples ci
* [legacy] fix tests
* [example] fix gpt example
* [example] fix examples ci
* [devops] fix ci installation
* [example] fix examples ci
|
1 year ago |
Hongxin Liu
|
27061426f7
|
[gemini] improve compatibility and add static placement policy (#4479)
* [gemini] remove distributed-related part from colotensor (#4379)
* [gemini] remove process group dependency
* [gemini] remove tp part from colo tensor
* [gemini] patch inplace op
* [gemini] fix param op hook and update tests
* [test] remove useless tests
* [test] remove useless tests
* [misc] fix requirements
* [test] fix model zoo
* [test] fix model zoo
* [test] fix model zoo
* [test] fix model zoo
* [test] fix model zoo
* [misc] update requirements
* [gemini] refactor gemini optimizer and gemini ddp (#4398)
* [gemini] update optimizer interface
* [gemini] renaming gemini optimizer
* [gemini] refactor gemini ddp class
* [example] update gemini related example
* [example] update gemini related example
* [plugin] fix gemini plugin args
* [test] update gemini ckpt tests
* [gemini] fix checkpoint io
* [example] fix opt example requirements
* [example] fix opt example
* [example] fix opt example
* [example] fix opt example
* [gemini] add static placement policy (#4443)
* [gemini] add static placement policy
* [gemini] fix param offload
* [test] update gemini tests
* [plugin] update gemini plugin
* [plugin] update gemini plugin docstr
* [misc] fix flash attn requirement
* [test] fix gemini checkpoint io test
* [example] update resnet example result (#4457)
* [example] update bert example result (#4458)
* [doc] update gemini doc (#4468)
* [example] update gemini related examples (#4473)
* [example] update gpt example
* [example] update dreambooth example
* [example] update vit
* [example] update opt
* [example] update palm
* [example] update vit and opt benchmark
* [hotfix] fix bert in model zoo (#4480)
* [hotfix] fix bert in model zoo
* [test] remove chatglm gemini test
* [test] remove sam gemini test
* [test] remove vit gemini test
* [hotfix] fix opt tutorial example (#4497)
* [hotfix] fix opt tutorial example
* [hotfix] fix opt tutorial example
|
1 year ago |
Baizhou Zhang
|
4da324cd60
|
[hotfix]fix argument naming in docs and examples (#4083)
|
1 year ago |
LuGY
|
160c64c645
|
[example] fix bucket size in example of gpt gemini (#4028)
|
1 year ago |
digger yu
|
33eef714db
|
fix typo examples and docs (#3932)
|
1 year ago |
jiangmingyan
|
5f79008c4a
|
[example] update gemini examples (#3868)
* [example]update gemini examples
* [example]update gemini examples
|
2 years ago |
ver217
|
26b7aac0be
|
[zero] reorganize zero/gemini folder structure (#3424)
* [zero] refactor low-level zero folder structure
* [zero] fix legacy zero import path
* [zero] fix legacy zero import path
* [zero] remove useless import
* [zero] refactor gemini folder structure
* [zero] refactor gemini folder structure
* [zero] refactor legacy zero import path
* [zero] refactor gemini folder structure
* [zero] refactor gemini folder structure
* [zero] refactor gemini folder structure
* [zero] refactor legacy zero import path
* [zero] fix test import path
* [zero] fix test
* [zero] fix circular import
* [zero] update import
|
2 years ago |
HELSON
|
6e0faa70e0
|
[gemini] add profiler in the demo (#2534)
|
2 years ago |
HELSON
|
66dfcf5281
|
[gemini] update the gpt example (#2527)
|
2 years ago |
HELSON
|
707b11d4a0
|
[gemini] update ddp strict mode (#2518)
* [zero] add strict ddp mode for chunk init
* [gemini] update gpt example
|
2 years ago |
HELSON
|
2d1a7dfe5f
|
[zero] add strict ddp mode (#2508)
* [zero] add strict ddp mode
* [polish] add comments for strict ddp mode
* [zero] fix test error
|
2 years ago |
binmakeswell
|
fcc6d61d92
|
[example] fix requirements (#2488)
|
2 years ago |
Jiarui Fang
|
7c31706227
|
[CI] add test_ci.sh for palm, opt and gpt (#2475)
|
2 years ago |
ver217
|
f525d1f528
|
[example] update gpt gemini example ci test (#2477)
|
2 years ago |
Jiarui Fang
|
867c8c2d3a
|
[zero] low level optim supports ProcessGroup (#2464)
|
2 years ago |
HELSON
|
d84e747975
|
[hotfix] add DISTPAN argument for benchmark (#2412)
* change the benchmark config file
* change config
* revert config file
* rename distpan to distplan
|
2 years ago |
HELSON
|
498b5ca993
|
[hotfix] fix gpt gemini example (#2404)
* [hotfix] fix gpt gemini example
* [example] add new assertions
|
2 years ago |
Jiarui Fang
|
1aaeb596c6
|
[example] gpt, shard init on all processes (#2366)
|
2 years ago |
Jiarui Fang
|
509a87f3ff
|
[example] make gpt example directory more clear (#2353)
|
2 years ago |