Making large AI models cheaper, faster and more accessible
Latest commit 0d0a582033 by Wang Binluo (7 months ago):
[shardformer] update transformers (#5583)
* flash_attention forward upgrade
* llama_model_forward
* remove useless comment
* update the requirements.txt
* add the transformers version requirements
* remove the LATEST VERSION try
* [shardformer] update bloom model (#5518)
* update bloom model
* remove the version restriction
* [shardformer] update_falcon (#5520)
* [shardformer] update mistral model (#5511)
* [shardformer] update gpt2 (#5502)
* [shardformer] update gptj model (#5503)
* [shardformer] update opt (#5522)
* [shardformer] update t5 model (#5524)
* [shardformer] update whisper model (#5529)
* [shardformer] update vit model (#5530)
* update vit model
* remove the output_hidden_states
* [shardformer] fix llama modeling
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* [zero] support multiple (partial) backward passes (#5596)
* [zero] support multiple (partial) backward passes
* [misc] update requirements
* fix conflicts
* [doc] fix ColossalMoE readme (#5599)
* fix readme
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* merge with main
* llama_model_forward
* remove useless comment
* remove the LATEST VERSION try
* [shardformer] update bloom model (#5518)
* update bloom model
* remove the version restriction
* [shardformer] update mistral model (#5511)
* [shardformer] update opt (#5522)
* [shardformer] update whisper model (#5529)
* [shardformer] fix llama modeling
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* [hotfix] Fix examples no pad token & auto parallel codegen bug; (#5606)
* fix no pad token bug
* fixed some auto parallel codegen bugs, but might not run on torch 2.1
---------
Co-authored-by: Edenzzzz <wtan45@wisc.edu>
* [shardformer] fix pipeline grad ckpt (#5620)
* [shardformer] fix pipeline grad ckpt
* [shardformer] fix whisper (#5628)
* [test] fix llama model test
* fix the opt upgrade (#5634)
* [shardformer] fix attn replacement (#5636)
* [shardformer] update flashattention replacement (#5637)
* update transformers
update transformers
fix
fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* [test] fix llama test (#5638)
* [gemini] fix buffer cast (#5639)
* Fix shardformer upgrade (#5640)
* fix llama model
* fix the mistral
* fix the shardformer model
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* [shardformer]support pipeline parallelism for mistral. (#5642)
* [shardformer] fix attn replacement (#5636)
* [shardformer] update flashattention replacement (#5637)
* update transformers
update transformers
fix
fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* [Feature] Support LLaMA-3 CPT and ST (#5619)
* support LLaMA-3
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Run pre-commit
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* [example] update llama example (#5626)
* [plugin] support dp inside for hybrid parallel
* [example] update llama benchmark
* [example] update llama benchmark
* [example] update llama readme
* [example] update llama readme
* [example] llama3 (#5631)
* release llama3
* [release] llama3
* [release] llama3
* [release] llama3
* [release] llama3
* [test] fix llama test (#5638)
* [gemini] fix buffer cast (#5639)
* support pp for mistral
* fix
* fix
fix
fix
* fix
---------
Co-authored-by: Hongxin Liu <lhx0217@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Tong Li <tong.li352711588@gmail.com>
Co-authored-by: binmakeswell <binmakeswell@gmail.com>
---------
Co-authored-by: Hongxin Liu <lhx0217@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Camille Zhong <44392324+Camille7777@users.noreply.github.com>
Co-authored-by: Edenzzzz <wenxuan.tan@wisc.edu>
Co-authored-by: Edenzzzz <wtan45@wisc.edu>
Co-authored-by: flybird11111 <1829166702@qq.com>
Co-authored-by: Tong Li <tong.li352711588@gmail.com>
Co-authored-by: binmakeswell <binmakeswell@gmail.com>
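Among the bundled changes, the "[zero] support multiple (partial) backward passes (#5596)" entries refer to a training pattern that is easier to read with a concrete picture: calling backward() more than once (for example on separate partial losses or micro-batches) before a single optimizer step. The sketch below shows that call sequence in plain PyTorch; the toy model and variable names are hypothetical, and it only illustrates the pattern, not ColossalAI's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical toy model, used only to show the call sequence.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

x1, y1 = torch.randn(4, 8), torch.randn(4, 1)
x2, y2 = torch.randn(4, 8), torch.randn(4, 1)

optimizer.zero_grad()

# First (partial) backward pass: gradients accumulate into .grad buffers.
loss1 = F.mse_loss(model(x1), y1)
loss1.backward()

# Second (partial) backward pass on a different loss term / micro-batch.
loss2 = F.mse_loss(model(x2), y2)
loss2.backward()

# One optimizer step consumes the gradients accumulated by both passes.
optimizer.step()
```

With a ZeRO-style sharded optimizer, gradients are typically bucketed, reduced, and partitioned during backward, so repeating backward before step() needs explicit support in the wrapper; that is the behavior the #5596 entries describe adding.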
Files in this directory:

| File | Last commit | Last updated |
|---|---|---|
| chunk | [feat] refactored extension module (#5298) | 10 months ago |
| memory_tracer | [npu] change device to accelerator api (#5239) | 11 months ago |
| __init__.py | [shardformer]: support gpt-j, falcon, Mistral and add interleaved pipeline for bert (#5088) | 1 year ago |
| gemini_ddp.py | [shardformer] update transformers (#5583) | 7 months ago |
| gemini_hook.py | [misc] update pre-commit and run all files (#4752) | 1 year ago |
| gemini_mgr.py | [npu] add npu support for gemini and zero (#5067) | 1 year ago |
| gemini_optimizer.py | [shardformer] refactor embedding resize (#5603) | 7 months ago |
| placement_policy.py | [npu] change device to accelerator api (#5239) | 11 months ago |
| utils.py | [npu] change device to accelerator api (#5239) | 11 months ago |