flybird11111
cabc1286ca
[LowLevelZero] low level zero support lora ( #5153 )
...
* low level zero support lora
low level zero support lora
* add checkpoint test
* add checkpoint test
* fix
* fix
* fix
* fix
fix
fix
fix
* fix
* fix
fix
fix
fix
fix
fix
fix
* fix
* fix
fix
fix
fix
fix
fix
fix
* fix
* test ci
* git # This is a combination of 3 commits.
Update low_level_zero_plugin.py
Update low_level_zero_plugin.py
fix
fix
fix
* fix naming
fix naming
fix naming
fix
11 months ago
Baizhou Zhang
c5fd4aa6e8
[lora] add lora APIs for booster, support lora for TorchDDP ( #4981 )
...
* add apis and peft requirement
* add license and implement apis
* add checkpointio apis
* add torchddp fwd_bwd test
* add support_lora methods
* add checkpointio test and debug
* delete unneeded codes
* remove peft from LICENSE
* add concrete methods for enable_lora
* simplify enable_lora api
* fix requirements
1 year ago
Xu Kai
785802e809
[inference] add reference and fix some bugs ( #4937 )
...
* add reference and fix some bugs
* update gptq init
---------
Co-authored-by: Xu Kai <xukai16@foxamil.com>
1 year ago
Hongxin Liu
b8e770c832
[test] merge old components to test to model zoo ( #4945 )
...
* [test] add custom models in model zoo
* [test] update legacy test
* [test] update model zoo
* [test] update gemini test
* [test] remove components to test
1 year ago
Cuiqing Li
3a41e8304e
[Refactor] Integrated some lightllm kernels into token-attention ( #4946 )
...
* add some req for inference
* clean codes
* add codes
* add some lightllm deps
* clean codes
* hello
* delete rms files
* add some comments
* add comments
* add doc
* add lightllm deps
* add lightllm chatglm2 kernels
* add lightllm chatglm2 kernels
* replace rotary embedding with lightllm kernel
* add some comments
* add some comments
* add some comments
* add
* replace fwd kernel att1
* fix an arg
* add
* add
* fix token attention
* add some comments
* clean codes
* modify comments
* fix readme
* fix bug
* fix bug
---------
Co-authored-by: cuiqing.li <lixx336@gmail.com>
Co-authored-by: CjhHa1 <cjh18671720497@outlook.com>
1 year ago
digger yu
11009103be
[nfc] fix some typo with colossalai/ docs/ etc. ( #4920 )
1 year ago
github-actions[bot]
486d06a2d5
[format] applied code formatting on changed files in pull request 4820 ( #4886 )
...
Co-authored-by: github-actions <github-actions@github.com>
1 year ago
Zhongkai Zhao
c7aa319ba0
[test] add no master test for low level zero plugin ( #4934 )
1 year ago
Hongxin Liu
1f5d2e8062
[hotfix] fix torch 2.0 compatibility ( #4936 )
...
* [hotfix] fix launch
* [test] fix test gemini optim
* [shardformer] fix vit
1 year ago
Baizhou Zhang
21ba89cab6
[gemini] support gradient accumulation ( #4869 )
...
* add test
* fix no_sync bug in low level zero plugin
* fix test
* add argument for grad accum
* add grad accum in backward hook for gemini
* finish implementation, rewrite tests
* fix test
* skip stuck model in low level zero test
* update doc
* optimize communication & fix gradient checkpoint
* modify doc
* cleaning codes
* update cpu adam fp16 case
1 year ago
github-actions[bot]
a41cf88e9b
[format] applied code formatting on changed files in pull request 4908 ( #4918 )
...
Co-authored-by: github-actions <github-actions@github.com>
1 year ago
Hongxin Liu
4f68b3f10c
[kernel] support pure fp16 for cpu adam and update gemini optim tests ( #4921 )
...
* [kernel] support pure fp16 for cpu adam (#4896 )
* [kernel] fix cpu adam kernel for pure fp16 and update tests (#4919 )
* [kernel] fix cpu adam
* [test] update gemini optim test
1 year ago
Zian(Andy) Zheng
7768afbad0
Update flash_attention_patch.py
...
To be compatible with a recent change in the Transformers library, which added a new argument 'padding_mask' to the forward function of the attention layer.
https://github.com/huggingface/transformers/pull/25598
1 year ago
Xu Kai
611a5a80ca
[inference] Add smoothquant for llama ( #4904 )
...
* [inference] add int8 rotary embedding kernel for smoothquant (#4843 )
* [inference] add smoothquant llama attention (#4850 )
* add smoothquant llama attention
* remove useless code
* remove useless code
* fix import error
* rename file name
* [inference] add silu linear fusion for smoothquant llama mlp (#4853 )
* add silu linear
* update skip condition
* catch smoothquant cuda lib exception
* process exception for tests
* [inference] add llama mlp for smoothquant (#4854 )
* add llama mlp for smoothquant
* fix down out scale
* remove duplicate lines
* add llama mlp check
* delete useless code
* [inference] add smoothquant llama (#4861 )
* add smoothquant llama
* fix attention accuracy
* fix accuracy
* add kv cache and save pretrained
* refactor example
* delete smooth
* refactor code
* [inference] add smooth function and delete useless code for smoothquant (#4895 )
* add smooth function and delete useless code
* update datasets
* remove duplicate import
* delete useless file
* refactor codes (#4902 )
* refactor code
* add license
* add torch-int and smoothquant license
1 year ago
Zhongkai Zhao
a0684e7bd6
[feature] support no master weights option for low level zero plugin ( #4816 )
...
* [feature] support no master weights for low level zero plugin
* [feature] support no master weights for low level zero plugin, remove data copy when no master weights
* remove data copy and typecasting when no master weights
* not load weights to cpu when using no master weights
* fix grad: use fp16 grad when no master weights
* only do not update working param when no master weights
* fix: only do not update working param when no master weights
* fix: passing params in dict format in hybrid plugin
* fix: remove extra params (tp_process_group) in hybrid_parallel_plugin
1 year ago
Xu Kai
77a9328304
[inference] add llama2 support ( #4898 )
...
* add llama2 support
* fix multi group bug
1 year ago
Baizhou Zhang
39f2582e98
[hotfix] fix lr scheduler bug in torch 2.0 ( #4864 )
1 year ago
littsk
83b52c56cd
[feature] Add clip_grad_norm for hybrid_parallel_plugin ( #4837 )
...
* Add clip_grad_norm for hybrid_parallel_plugin
* polish code
* add unittests
* Move tp to a higher-level optimizer interface.
* bug fix
* polish code
1 year ago
Hongxin Liu
df63564184
[gemini] support amp o3 for gemini ( #4872 )
...
* [gemini] support no reuse fp16 chunk
* [gemini] support no master weight for optim
* [gemini] support no master weight for gemini ddp
* [test] update gemini tests
* [test] update gemini tests
* [plugin] update gemini plugin
* [test] fix gemini checkpointio test
* [test] fix gemini checkpoint io
1 year ago
ppt0011
c1fab951e7
Merge pull request #4889 from ppt0011/main
...
[doc] add reminder for issue encountered with hybrid adam
1 year ago
littsk
ffd9a3cbc9
[hotfix] fix bug in sequence parallel test ( #4887 )
1 year ago
ppt0011
1dcaf249bd
[doc] add reminder for issue encountered with hybrid adam
1 year ago
Xu Kai
fdec650bb4
fix test llama ( #4884 )
1 year ago
Bin Jia
08a9f76b2f
[Pipeline Inference] Sync pipeline inference branch to main ( #4820 )
...
* [pipeline inference] pipeline inference (#4492 )
* add pp stage manager as circle stage
* fix a bug when create process group
* add ppinfer basic framework
* add micro batch manager and support kvcache-pp gpt2 fwd
* add generate schedule
* use mb size to control mb number
* support generate with kv cache
* add output, remove unused code
* add test
* reuse shardformer to build model
* refactor some code and use the same attribute name of hf
* fix review and add test for generation
* remove unused file
* fix CI
* add cache clear
* fix code error
* fix typo
* [Pipeline inference] Modify to tieweight (#4599 )
* add pp stage manager as circle stage
* fix a bug when create process group
* add ppinfer basic framework
* add micro batch manager and support kvcache-pp gpt2 fwd
* add generate schedule
* use mb size to control mb number
* support generate with kv cache
* add output, remove unused code
* add test
* reuse shardformer to build model
* refactor some code and use the same attribute name of hf
* fix review and add test for generation
* remove unused file
* modify the way of saving newtokens
* modify to tieweight
* modify test
* remove unused file
* solve review
* add docstring
* [Pipeline inference] support llama pipeline inference (#4647 )
* support llama pipeline inference
* remove tie weight operation
* [pipeline inference] Fix the blocking of communication when ppsize is 2 (#4708 )
* add benchmark verbose
* fix export tokens
* fix benchmark verbose
* add P2POp style to do p2p communication
* modify schedule as p2p type when ppsize is 2
* remove unused code and add docstring
* [Pipeline inference] Refactor code, add docstring, fix bug (#4790 )
* add benchmark script
* update argparse
* fix fp16 load
* refactor code style
* add docstring
* polish code
* fix test bug
* [Pipeline inference] Add pipeline inference docs (#4817 )
* add readme doc
* add a ico
* Add performance
* update table of contents
* refactor code (#4873 )
1 year ago
Camille Zhong
652adc2215
Update README.md
1 year ago
Camille Zhong
afe10a85fd
Update README.md
1 year ago
Camille Zhong
d6c4b9b370
Update main README.md
...
add modelscope model link
1 year ago
Camille Zhong
3043d5d676
Update modelscope link in README.md
...
add modelscope link
1 year ago
flybird11111
6a21f96a87
[doc] update advanced tutorials, training gpt with hybrid parallelism ( #4866 )
...
* [doc]update advanced tutorials, training gpt with hybrid parallelism
* [doc]update advanced tutorials, training gpt with hybrid parallelism
* update vit tutorials
* update vit tutorials
* update vit tutorials
* update vit tutorials
* update en/train_vit_with_hybrid_parallel.py
* fix
* resolve comments
* fix
1 year ago
Blagoy Simandoff
8aed02b957
[nfc] fix minor typo in README ( #4846 )
1 year ago
Camille Zhong
cd6a962e66
[NFC] polish code style ( #4799 )
1 year ago
Michelle
07ed155e86
[NFC] polish colossalai/inference/quant/gptq/cai_gptq/__init__.py code style ( #4792 )
1 year ago
littsk
eef96e0877
polish code for gptq ( #4793 )
1 year ago
Hongxin Liu
cb3a25a062
[checkpointio] hotfix torch 2.0 compatibility ( #4824 )
1 year ago
ppt0011
ad23460cf8
Merge pull request #4856 from KKZ20/test/model_support_for_low_level_zero
...
[test] remove the redundant code of model output transformation in torchrec
1 year ago
ppt0011
81ee91f2ca
Merge pull request #4858 from Shawlleyw/main
...
[doc]: typo in document of booster low_level_zero plugin
1 year ago
shaoyuw
c97a3523db
fix: typo in comment of low_level_zero plugin
1 year ago
Zhongkai Zhao
db40e086c8
[test] modify model supporting part of low_level_zero plugin (including corresponding docs)
1 year ago
Xu Kai
d1fcc0fa4d
[infer] fix test bug ( #4838 )
...
* fix test bug
* delete useless code
* fix typo
1 year ago
Jianghai
013a4bedf0
[inference]fix import bug and delete down useless init ( #4830 )
...
* fix import bug and release useless init
* fix
* fix
* fix
1 year ago
Yuanheng Zhao
573f270537
[Infer] Serving example w/ ray-serve (multiple GPU case) ( #4841 )
...
* fix imports
* add ray-serve with Colossal-Infer tp
* trivial: send requests script
* add README
* fix worker port
* fix readme
* use app builder and autoscaling
* trivial: input args
* clean code; revise readme
* testci (skip example test)
* use auto model/tokenizer
* revert imports fix (fixed in other PRs)
1 year ago
Yuanheng Zhao
3a74eb4b3a
[Infer] Colossal-Inference serving example w/ TorchServe (single GPU case) ( #4771 )
...
* add Colossal-Inference serving example w/ TorchServe
* add dockerfile
* fix dockerfile
* fix dockerfile: fix commit hash, install curl
* refactor file structure
* revise readme
* trivial
* trivial: dockerfile format
* clean dir; revise readme
* fix comments: fix imports and configs
* fix formats
* remove unused requirements
1 year ago
Tong Li
ed06731e00
update Colossal ( #4832 )
1 year ago
Xu Kai
c3bef20478
add autotune ( #4822 )
1 year ago
binmakeswell
822051d888
[doc] update slack link ( #4823 )
1 year ago
Yuanchen
1fa8c5e09f
Update Qwen-7B results ( #4821 )
...
Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
1 year ago
flybird11111
be400a0936
[chat] fix gemini strategy ( #4698 )
...
* [chat] fix gemini strategy
* [chat] fix gemini strategy
* [chat] fix gemini strategy
* [chat] fix gemini strategy
* g# This is a combination of 2 commits.
[chat] fix gemini strategy
fox
* [chat] fix gemini strategy
update llama2 example
[chat] fix gemini strategy
* [fix] fix gemini strategy
* [fix] fix gemini strategy
* [fix] fix gemini strategy
* [fix] fix gemini strategy
* [fix] fix gemini strategy
* [fix] fix gemini strategy
* [fix] fix gemini strategy
* [fix] fix gemini strategy
* [fix] fix gemini strategy
* [fix] fix gemini strategy
* fix
* fix
* fix
* fix
* fix
* Update train_prompts.py
1 year ago
Tong Li
bbbcac26e8
fix format ( #4815 )
1 year ago
github-actions[bot]
fb46d05cdf
[format] applied code formatting on changed files in pull request 4595 ( #4602 )
...
Co-authored-by: github-actions <github-actions@github.com>
1 year ago
littsk
11f1e426fe
[hotfix] Correct several erroneous code comments ( #4794 )
1 year ago