ppt0011
c1fab951e7
Merge pull request #4889 from ppt0011/main
...
[doc] add reminder for issue encountered with hybrid adam
2023-10-12 10:27:10 +08:00
littsk
ffd9a3cbc9
[hotfix] fix bug in sequence parallel test ( #4887 )
2023-10-11 19:30:41 +08:00
ppt0011
1dcaf249bd
[doc] add reminder for issue encountered with hybrid adam
2023-10-11 17:51:14 +08:00
Xu Kai
fdec650bb4
fix test llama ( #4884 )
2023-10-11 17:43:01 +08:00
Bin Jia
08a9f76b2f
[Pipeline Inference] Sync pipeline inference branch to main ( #4820 )
...
* [pipeline inference] pipeline inference (#4492 )
* add pp stage manager as circle stage
* fix a bug when create process group
* add ppinfer basic framework
* add micro batch manager and support kvcache-pp gpt2 fwd
* add generate schedule
* use mb size to control mb number
* support generate with kv cache
* add output, remove unused code
* add test
* reuse shardformer to build model
* refactor some code and use the same attribute name of hf
* fix review and add test for generation
* remove unused file
* fix CI
* add cache clear
* fix code error
* fix typo
* [Pipeline inference] Modify to tie weight (#4599 )
* add pp stage manager as circle stage
* fix a bug when create process group
* add ppinfer basic framework
* add micro batch manager and support kvcache-pp gpt2 fwd
* add generate schedule
* use mb size to control mb number
* support generate with kv cache
* add output, remove unused code
* add test
* reuse shardformer to build model
* refactor some code and use the same attribute name of hf
* fix review and add test for generation
* remove unused file
* modify the way of saving new tokens
* modify to tie weight
* modify test
* remove unused file
* solve review
* add docstring
* [Pipeline inference] support llama pipeline inference (#4647 )
* support llama pipeline inference
* remove tie weight operation
* [pipeline inference] Fix the blocking of communication when ppsize is 2 (#4708 )
* add benchmark verbose
* fix export tokens
* fix benchmark verbose
* add P2POp style to do p2p communication
* modify schedule as p2p type when ppsize is 2
* remove unused code and add docstring
* [Pipeline inference] Refactor code, add docsting, fix bug (#4790 )
* add benchmark script
* update argparse
* fix fp16 load
* refactor code style
* add docstring
* polish code
* fix test bug
* [Pipeline inference] Add pipeline inference docs (#4817 )
* add readme doc
* add an icon
* Add performance
* update table of contents
* refactor code (#4873 )
2023-10-11 11:40:06 +08:00
Camille Zhong
652adc2215
Update README.md
2023-10-10 23:19:34 +08:00
Camille Zhong
afe10a85fd
Update README.md
2023-10-10 23:19:34 +08:00
Camille Zhong
d6c4b9b370
Update main README.md
...
add modelscope model link
2023-10-10 23:19:34 +08:00
Camille Zhong
3043d5d676
Update modelscope link in README.md
...
add modelscope link
2023-10-10 23:19:34 +08:00
flybird11111
6a21f96a87
[doc] update advanced tutorials, training gpt with hybrid parallelism ( #4866 )
...
* [doc] update advanced tutorials, training gpt with hybrid parallelism
* [doc] update advanced tutorials, training gpt with hybrid parallelism
* update vit tutorials
* update vit tutorials
* update vit tutorials
* update vit tutorials
* update en/train_vit_with_hybrid_parallel.py
* fix
* resolve comments
* fix
2023-10-10 08:18:55 +00:00
Blagoy Simandoff
8aed02b957
[nfc] fix minor typo in README ( #4846 )
2023-10-07 17:51:11 +08:00
Camille Zhong
cd6a962e66
[NFC] polish code style ( #4799 )
2023-10-07 13:36:52 +08:00
Michelle
07ed155e86
[NFC] polish colossalai/inference/quant/gptq/cai_gptq/__init__.py code style ( #4792 )
2023-10-07 13:36:52 +08:00
littsk
eef96e0877
polish code for gptq ( #4793 )
2023-10-07 13:36:52 +08:00
Hongxin Liu
cb3a25a062
[checkpointio] hotfix torch 2.0 compatibility ( #4824 )
2023-10-07 10:45:52 +08:00
ppt0011
ad23460cf8
Merge pull request #4856 from KKZ20/test/model_support_for_low_level_zero
...
[test] remove the redundant code of model output transformation in torchrec
2023-10-06 09:32:33 +08:00
ppt0011
81ee91f2ca
Merge pull request #4858 from Shawlleyw/main
...
[doc] fix typo in documentation of booster low_level_zero plugin
2023-10-06 09:27:54 +08:00
shaoyuw
c97a3523db
fix: typo in comment of low_level_zero plugin
2023-10-05 16:30:34 +00:00
Zhongkai Zhao
db40e086c8
[test] modify model supporting part of low_level_zero plugin (including corresponding docs)
2023-10-05 15:10:31 +08:00
Xu Kai
d1fcc0fa4d
[infer] fix test bug ( #4838 )
...
* fix test bug
* delete useless code
* fix typo
2023-10-04 10:01:03 +08:00
Jianghai
013a4bedf0
[inference] fix import bug and delete useless init ( #4830 )
...
* fix import bug and remove useless init
* fix
* fix
* fix
2023-10-04 09:18:45 +08:00
Yuanheng Zhao
573f270537
[Infer] Serving example w/ ray-serve (multiple GPU case) ( #4841 )
...
* fix imports
* add ray-serve with Colossal-Infer tp
* trivial: send requests script
* add README
* fix worker port
* fix readme
* use app builder and autoscaling
* trivial: input args
* clean code; revise readme
* testci (skip example test)
* use auto model/tokenizer
* revert imports fix (fixed in other PRs)
2023-10-02 17:48:38 +08:00
Yuanheng Zhao
3a74eb4b3a
[Infer] Colossal-Inference serving example w/ TorchServe (single GPU case) ( #4771 )
...
* add Colossal-Inference serving example w/ TorchServe
* add dockerfile
* fix dockerfile
* fix dockerfile: fix commit hash, install curl
* refactor file structure
* revise readme
* trivial
* trivial: dockerfile format
* clean dir; revise readme
* fix comments: fix imports and configs
* fix formats
* remove unused requirements
2023-10-02 17:42:37 +08:00
Tong Li
ed06731e00
update Colossal ( #4832 )
2023-09-28 16:05:05 +08:00
Xu Kai
c3bef20478
add autotune ( #4822 )
2023-09-28 13:47:35 +08:00
binmakeswell
822051d888
[doc] update slack link ( #4823 )
2023-09-27 17:37:39 +08:00
Yuanchen
1fa8c5e09f
Update Qwen-7B results ( #4821 )
...
Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
2023-09-27 17:33:54 +08:00
flybird11111
be400a0936
[chat] fix gemini strategy ( #4698 )
...
* [chat] fix gemini strategy
* [chat] fix gemini strategy
* [chat] fix gemini strategy
* [chat] fix gemini strategy
* [chat] fix gemini strategy (combination of 2 commits)
* [chat] fix gemini strategy; update llama2 example
* [fix] fix gemini strategy
* [fix] fix gemini strategy
* [fix] fix gemini strategy
* [fix] fix gemini strategy
* [fix] fix gemini strategy
* [fix] fix gemini strategy
* [fix] fix gemini strategy
* [fix] fix gemini strategy
* [fix] fix gemini strategy
* [fix] fix gemini strategy
* fix
* fix
* fix
* fix
* fix
* Update train_prompts.py
2023-09-27 13:15:32 +08:00
Tong Li
bbbcac26e8
fix format ( #4815 )
2023-09-27 12:50:22 +08:00
github-actions[bot]
fb46d05cdf
[format] applied code formatting on changed files in pull request 4595 ( #4602 )
...
Co-authored-by: github-actions <github-actions@github.com>
2023-09-27 10:45:03 +08:00
littsk
11f1e426fe
[hotfix] Correct several erroneous code comments ( #4794 )
2023-09-27 10:43:03 +08:00
littsk
54b3ad8924
[hotfix] fix norm type error in zero optimizer ( #4795 )
2023-09-27 10:35:24 +08:00
Hongxin Liu
da15fdb9ca
[doc] add lazy init docs ( #4808 )
2023-09-27 10:24:04 +08:00
Yan haixu
a22706337a
[misc] add last_epoch in CosineAnnealingWarmupLR ( #4778 )
2023-09-26 14:43:46 +08:00
Chandler-Bing
b6cf0aca55
[hotfix] change llama2 Colossal-LLaMA-2 script filename ( #4800 )
...
change filename:
pretraining.py -> trainin.py
there is no file named pretraining.py; the old name was written wrong
2023-09-26 11:44:27 +08:00
Desperado-Jia
62b6af1025
Merge pull request #4805 from TongLi3701/docs/fix
...
[doc] Update TODO in README of Colossal-LLaMA-2
2023-09-26 11:39:35 +08:00
Tong Li
8cbce6184d
update
2023-09-26 11:36:53 +08:00
Hongxin Liu
4965c0dabd
[lazy] support from_pretrained ( #4801 )
...
* [lazy] patch from pretrained
* [lazy] fix from pretrained and add tests
* [devops] update ci
2023-09-26 11:04:11 +08:00
Tong Li
bd014673b0
update readme
2023-09-26 10:58:05 +08:00
Baizhou Zhang
64a08b2dc3
[checkpointio] support unsharded checkpointIO for hybrid parallel ( #4774 )
...
* support unsharded saving/loading for model
* support optimizer unsharded saving
* update doc
* support unsharded loading for optimizer
* small fix
2023-09-26 10:58:03 +08:00
Baizhou Zhang
a2db75546d
[doc] polish shardformer doc ( #4779 )
...
* fix example format in docstring
* polish shardformer doc
2023-09-26 10:57:47 +08:00
flybird11111
26cd6d850c
[fix] fix weekly running example ( #4787 )
...
* [fix] fix weekly running example
* [fix] fix weekly running example
2023-09-25 16:19:33 +08:00
binmakeswell
d512a4d38d
[doc] add llama2 domain-specific solution news ( #4789 )
...
* [doc] add llama2 domain-specific solution news
2023-09-25 10:44:15 +08:00
Yuanchen
ce777853ae
[feature] ColossalEval: Evaluation Pipeline for LLMs ( #4786 )
...
* Add ColossalEval
* Delete evaluate in Chat
---------
Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
Co-authored-by: Tong Li <tong.li352711588@gmail.com>
2023-09-24 23:14:11 +08:00
Tong Li
74aa7d964a
initial commit: add colossal llama 2 ( #4784 )
2023-09-24 23:12:26 +08:00
Hongxin Liu
4146f1c0ce
[release] update version ( #4775 )
...
* [release] update version
* [doc] revert versions
2023-09-22 18:29:17 +08:00
Jianghai
ce7ade3882
[inference] chatglm2 infer demo ( #4724 )
...
* add chatglm2
* add
* gather needed kernels
* fix some bugs
* finish context forward
* finish context stage
* fix
* add
* pause
* add
* fix bugs
* finish chatglm
* fix bug
* change some logic
* fix bugs
* change some logics
* add
* add
* add
* fix
* fix tests
* fix
2023-09-22 11:12:50 +08:00
Xu Kai
946ab56c48
[feature] add gptq for inference ( #4754 )
...
* [gptq] add gptq kernel (#4416 )
* add gptq
* refactor code
* fix tests
* replace auto-gptq
* rename inference/quant
* refactor test
* add auto-gptq as an option
* reset requirements
* change assert and check auto-gptq
* add import warnings
* change test flash attn version
* remove example
* change requirements of flash_attn
* modify tests
* [skip ci] change requirements-test
* [gptq] faster gptq cuda kernel (#4494 )
* [skip ci] add cuda kernels
* add license
* [skip ci] fix max_input_len
* format files & change test size
* [skip ci]
* [gptq] add gptq tensor parallel (#4538 )
* add gptq tensor parallel
* add gptq tp
* delete print
* add test gptq check
* add test auto gptq check
* [gptq] combine gptq and kv cache manager (#4706 )
* combine gptq and kv cache manager
* add init bits
* delete useless code
* add model path
* delete useless print and update test
* delete useless import
* move option gptq to shard config
* change replace linear to shardformer
* update bloom policy
* delete useless code
* fix import bug and delete useless code
* change colossalai/gptq to colossalai/quant/gptq
* update import linear for tests
* delete useless code and mv gptq_kernel to kernel directory
* fix triton kernel
* add triton import
2023-09-22 11:02:50 +08:00
littsk
1e0e080837
[bug] Fix the version check bug in colossalai run when generating the cmd. ( #4713 )
...
* Fix the version check bug in colossalai run when generating the cmd.
* polish code
2023-09-22 10:50:47 +08:00
Hongxin Liu
3e05c07bb8
[lazy] support torch 2.0 ( #4763 )
...
* [lazy] support _like methods and clamp
* [lazy] pass transformers models
* [lazy] fix device move and requires grad
* [lazy] fix requires grad and refactor api
* [lazy] fix requires grad
2023-09-21 16:30:23 +08:00