324 Commits (7a60161035e937db04a4521adfc8306d091b857e)

Author SHA1 Message Date
Hongxin Liu eb4f2d90f9
[llama] polish training script and fix optim ckpt (#5368)
10 months ago
Camille Zhong a5756a8720
[eval] update llama npu eval (#5366)
10 months ago
Camille Zhong 44ca61a22b
[llama] fix neftune & pbar with start_step (#5364)
10 months ago
Hongxin Liu a4cec1715b
[llama] add flash attn patch for npu (#5362)
10 months ago
Hongxin Liu 73f9f23fc6
[llama] update training script (#5360)
10 months ago
Hongxin Liu 6c0fa7b9a8
[llama] fix dataloader for hybrid parallel (#5358)
10 months ago
YeAnbang c5239840e6
[Chat] fix sft loss nan (#5345)
10 months ago
Frank Lee 8823cc4831
Merge pull request #5310 from hpcaitech/feature/npu
10 months ago
李文军 ec912b1ba9
[NFC] polish applications/Colossal-LLaMA-2/colossal_llama2/tokenizer/init_tokenizer.py code style (#5228)
10 months ago
Desperado-Jia ddf879e2db
fix bug for mefture (#5299)
10 months ago
Michelle 32cb74493a
fix auto loading gpt2 tokenizer (#5279)
10 months ago
ver217 148469348a
Merge branch 'main' into sync/npu
10 months ago
digger yu 756c400ad2
fix typo in applications/ColossalEval/README.md (#5250)
11 months ago
digger yu 41e52c1c6e
[doc] fix typo in Colossal-LLaMA-2/README.md (#5247)
11 months ago
Hongxin Liu d202cc28c0
[npu] change device to accelerator api (#5239)
11 months ago
binmakeswell 7bc6969ce6
[doc] SwiftInfer release (#5236)
11 months ago
github-actions[bot] 4fb4a22a72
[format] applied code formatting on changed files in pull request 5234 (#5235)
11 months ago
binmakeswell b9b32b15e6
[doc] add Colossal-LLaMA-2-13B (#5234)
11 months ago
Camille Zhong 915b4652f3
[doc] Update README.md of Colossal-LLAMA2 (#5233)
11 months ago
Tong Li d992b55968
[Colossal-LLaMA-2] Release Colossal-LLaMA-2-13b-base model (#5224)
11 months ago
Yuanchen eae01b6740
Improve logic for selecting metrics (#5196)
11 months ago
BlueRum af952673f7
polish readme in application/chat (#5194)
11 months ago
Yuanchen 3ff60d13b0
Fix ColossalEval (#5186)
12 months ago
Yuanchen cefdc32615
[ColossalEval] Support GSM, Data Leakage Evaluation and Tensor Parallel (#5169)
12 months ago
Michelle b07a6f4e27
[colossalqa] fix pangu api (#5170)
12 months ago
Yuanchen b397104438
[Colossal-Llama-2] Add finetuning Colossal-Llama-2 example (#4878)
12 months ago
Michelle 368b5e3d64
[doc] fix colossalqa document (#5146)
12 months ago
Michelle c7fd9a5213
[ColossalQA] refactor server and webui & add new feature (#5138)
1 year ago
github-actions[bot] f6731db67c
[format] applied code formatting on changed files in pull request 5115 (#5118)
1 year ago
digger yu 9110406a47
fix typo change JOSNL TO JSONL etc. (#5116)
1 year ago
Zian(Andy) Zheng 7b789f4dd2
[FEATURE] Add Safety Eval Datasets to ColossalEval (#5095)
1 year ago
digger yu d5661f0f25
[nfc] fix typo change directoty to directory (#5111)
1 year ago
YeAnbang e53e729d8e
[Feature] Add document retrieval QA (#5020)
1 year ago
Orion-Zheng 43ad0d9ef0
fix wrong EOS token in ColossalChat
1 year ago
Yuanchen 239cd92eff
Support mtbench (#5025)
1 year ago
Yuanchen abe071b663
fix ColossalEval (#4992)
1 year ago
github-actions[bot] a41cf88e9b
[format] applied code formatting on changed files in pull request 4908 (#4918)
1 year ago
Zian(Andy) Zheng 7768afbad0
Update flash_attention_patch.py
1 year ago
Camille Zhong 652adc2215
Update README.md
1 year ago
Camille Zhong afe10a85fd
Update README.md
1 year ago
Camille Zhong 3043d5d676
Update modelscope link in README.md
1 year ago
Tong Li ed06731e00
update Colossal (#4832)
1 year ago
binmakeswell 822051d888
[doc] update slack link (#4823)
1 year ago
Yuanchen 1fa8c5e09f
Update Qwen-7B results (#4821)
1 year ago
flybird11111 be400a0936
[chat] fix gemini strategy (#4698)
1 year ago
Chandler-Bing b6cf0aca55
[hotfix] change llama2 Colossal-LLaMA-2 script filename (#4800)
1 year ago
Tong Li 8cbce6184d
update
1 year ago
Tong Li bd014673b0
update readme
1 year ago
binmakeswell d512a4d38d
[doc] add llama2 domain-specific solution news (#4789)
1 year ago
Yuanchen ce777853ae
[feature] ColossalEval: Evaluation Pipeline for LLMs (#4786)
1 year ago
Tong Li 74aa7d964a
initial commit: add colossal llama 2 (#4784)
1 year ago
Wenhao Chen 901ab1eedd
[chat]: add lora merge weights config (#4766)
1 year ago
Wenhao Chen 7b9b86441f
[chat]: update rm, add wandb and fix bugs (#4471)
1 year ago
Hongxin Liu 079bf3cb26
[misc] update pre-commit and run all files (#4752)
1 year ago
digger yu e4fc57c3de
Optimized some syntax errors in the documentation and code under applications/ (#4127)
1 year ago
Hongxin Liu a39a5c66fe
Merge branch 'main' into feature/shardformer
1 year ago
Ying Liu c648dc093f
fix colossalai version in coati examples
1 year ago
yingliu-hpc 1467e3b41b
[coati] add chatglm model (#4539)
1 year ago
Michelle 285fe7ba71
[chat] update config and prompt (#4139)
1 year ago
Hongxin Liu 26e29d58f0
[devops] add large-scale distributed test marker (#4452)
1 year ago
Wenhao Chen 6d41c3f2aa
[doc] update Coati README (#4405)
1 year ago
Wenhao Chen da4f7b855f
[chat] fix bugs and add unit tests (#4213)
1 year ago
Wenhao Chen 75c5389037
[chat] fix compute_approx_kl (#4338)
1 year ago
Yuanchen 5187c96b7c
support session-based training (#4313)
1 year ago
yuxuan-lou 0991405361
[NFC] polish applications/Chat/coati/models/utils.py codestyle (#4277)
1 year ago
Zirui Zhu 9e512938f6
[NFC] polish applications/Chat/coati/trainer/strategies/base.py code style (#4278)
1 year ago
Ziheng Qin c972d65311
applications/Chat/.gitignore (#4279)
1 year ago
RichardoLuo 709e121cd5
[NFC] polish applications/Chat/coati/models/generation.py code style (#4275)
1 year ago
Yuanchen dc1b6127f9
[NFC] polish applications/Chat/inference/server.py code style (#4274)
1 year ago
アマデウス caa4433072
[NFC] fix format of application/Chat/coati/trainer/utils.py (#4273)
1 year ago
Xu Kai 1ce997daaf
[NFC] polish applications/Chat/examples/train_reward_model.py code style (#4271)
1 year ago
shenggan 798cb72907
[NFC] polish applications/Chat/coati/trainer/base.py code style (#4260)
1 year ago
Zheng Zangwei (Alex Zheng) b2debdc09b
[NFC] polish applications/Chat/coati/dataset/sft_dataset.py code style (#4259)
1 year ago
CZYCW dee1c96344
[NFC] policy applications/Chat/examples/ray/mmmt_prompt.py code style (#4250)
1 year ago
Junming Wu 77c469e1ba
[NFC] polish applications/Chat/coati/models/base/actor.py code style (#4248)
1 year ago
Camille Zhong 915ed8bed1
[NFC] polish applications/Chat/inference/requirements.txt code style (#4265)
1 year ago
Frank Lee f447ca1811
[chat] removed cache file (#4155)
1 year ago
wukong1992 c1c672d0f0
[shardformer] shardformer support t5 model (#3994)
1 year ago
Wenhao Chen 3d8d5d0d58
[chat] use official transformers and fix some issues (#4117)
1 year ago
Wenhao Chen edd75a59ea
[chat] remove naive strategy and split colossalai strategy (#4094)
1 year ago
Wenhao Chen b03d64d010
[chat] refactor trainer class (#4080)
1 year ago
Baizhou Zhang 4da324cd60
[hotfix]fix argument naming in docs and examples (#4083)
1 year ago
Michelle e89b127d8e
[chat]: fix chat evaluation possible bug (#4064)
1 year ago
Wenhao Chen 153b957a1b
[chat] refactor strategy class with booster api (#3987)
1 year ago
digger yu 727c4598a9
[nfc] fix dim not defined and fix typo (#3991)
1 year ago
digger yu d4fb7bfda7
fix typo applications/Chat/coati/ (#3947)
1 year ago
Yuanchen 2925f47399
[evaluate] support gpt evaluation with reference (#3972)
1 year ago
Wenhao Chen 9d02590c9a
[chat] refactor actor class (#3968)
1 year ago
Yuanchen 21c4c0b1a0
support UniEval and add CHRF metric (#3924)
1 year ago
Hongxin Liu b5f0566363
[chat] add distributed PPO trainer (#3740)
1 year ago
Yuanchen 57a6d7685c
support evaluation for english (#3880)
1 year ago
Yuanchen 2506e275b8
[evaluation] improvement on evaluation (#3862)
2 years ago
digger yu e2d81eba0d
[nfc] fix typo colossalai/ applications/ (#3831)
2 years ago
Yuanchen 34966378e8
[evaluation] add automatic evaluation pipeline (#3821)
2 years ago
digger yu 9265f2d4d7
[NFC]fix typo colossalai/auto_parallel nn utils etc. (#3779)
2 years ago
github-actions[bot] 62c7e67f9f
[format] applied code formatting on changed files in pull request 3786 (#3787)
2 years ago
binmakeswell ad2cf58f50
[chat] add performance and tutorial (#3786)
2 years ago
Yuanchen 05759839bd
[chat] fix bugs in stage 3 training (#3759)
2 years ago
digger-yu ad6460cf2c
[NFC] fix typo applications/ and colossalai/ (#3735)
2 years ago
digger-yu b7141c36dd
[CI] fix some spelling errors (#3707)
2 years ago
MisterLin1995 f7361ee1bd
[chat] fix community example ray (#3719)
2 years ago
zhang-yi-chi 2da5d81dec
[chat] fix train_prompts.py gemini strategy bug (#3666)
2 years ago
digger-yu 65bdc3159f
fix some spelling error with applications/Chat/examples/ (#3692)
2 years ago
Tong Li b36e67cb2b
Merge pull request #3680 from digger-yu/digger-yu-patch-2
2 years ago
Camille Zhong 0f785cb1f3
[chat] PPO stage3 doc enhancement (#3679)
2 years ago
digger-yu 6650daeb0a
[doc] fix chat spelling error (#3671)
2 years ago
Hongxin Liu 7bd0bee8ea
[chat] add opt attn kernel (#3655)
2 years ago
digger-yu 8ba7858753
Update generate_gpt35_answers.py
2 years ago
digger-yu bfbf650588
fix spelling error
2 years ago
tanitna 1a60dc07a8
[chat] typo accimulation_steps -> accumulation_steps (#3662)
2 years ago
Tong Li 816add7e7f
Merge pull request #3656 from TongLi3701/chat/update_eval
2 years ago
binmakeswell 268b3cd80d
[chat] set default zero2 strategy (#3667)
2 years ago
Tong Li c1a355940e
update readme
2 years ago
Tong Li ed3eaa6922
update documentation
2 years ago
Tong Li c419117329
update questions and readme
2 years ago
Tong Li aa77ddae33
remove unnecessary step and update readme
2 years ago
Hongxin Liu 842768a174
[chat] refactor model save/load logic (#3654)
2 years ago
Hongxin Liu 6ef7011462
[chat] remove lm model class (#3653)
2 years ago
Camille Zhong 8bccb72c8d
[Doc] enhancement on README.md for chat examples (#3646)
2 years ago
Hongxin Liu 2a951955ad
[chat] refactor trainer (#3648)
2 years ago
Hongxin Liu f8288315d9
[chat] polish performance evaluator (#3647)
2 years ago
Hongxin Liu 50793b35f4
[gemini] accelerate inference (#3641)
2 years ago
Tong Li e1b0a78afa
Merge pull request #3621 from zhang-yi-chi/fix/chat-train-prompts-single-gpu
2 years ago
ddobokki df309fc6ab
[Chat] Remove duplicate functions (#3625)
2 years ago
zhang-yi-chi 739cfe3360
[chat] fix enable single gpu training bug
2 years ago
digger-yu d7bf284706
[chat] polish code note typo (#3612)
2 years ago
Yuanchen c4709d34cf
Chat evaluate (#3608)
2 years ago
binmakeswell 5a79cffdfd
[coati] fix install cmd (#3592)
2 years ago
Yuanchen 1ec0d386a9
reconstruct chat trainer and fix training script (#3588)
2 years ago
Camille Zhong 36a519b49f
Update test_ci.sh
2 years ago
tingfeng cao 7788e0b0a5
fix: fix sft (#3568)
2 years ago
Fazzie-Maqianli 6b1a39b17b
[coati] add costom model suppor tguide (#3579)
2 years ago
binmakeswell cc1eec2f53
[chat] update reward model sh (#3578)
2 years ago
csric e355144375
[chatgpt] Detached PPO Training (#3195)
2 years ago
MisterLin1995 1a809eddaa
[chat] ChatGPT train prompts on ray example (#3309)
2 years ago
binmakeswell 535b896435
[chat] polish tutorial doc (#3551)
2 years ago
Yuanchen 7182ac2a04
[chat]add examples of training with limited resources in chat readme (#3536)
2 years ago
zhang-yi-chi e6a132a449
[chat]: add vf_coef argument for PPOTrainer (#3318)
2 years ago
ver217 89fd10a1c9
[chat] add zero2 cpu strategy for sft training (#3520)
2 years ago
binmakeswell 990d4c3e4e
[doc] hide diffusion in application path (#3519)
2 years ago
binmakeswell 0c0455700f
[doc] add requirement and highlight application (#3516)
2 years ago
NatalieC323 635d0a1baf
[Chat Community] Update README.md (fixed#3487) (#3506)
2 years ago
gongenlei a7ca297281
[coati] Fix LlamaCritic (#3475)
2 years ago
binmakeswell 891b8e7fac
[chat] fix stage3 PPO sample sh command (#3477)
2 years ago
Fazzie-Maqianli 6afeb1202a
add community example dictionary (#3465)
2 years ago
Frank Lee 80eba05b0a
[test] refactor tests with spawn (#3452)
2 years ago
YY Lin 62f4e2eb07
[Chat]Add Peft support & fix the ptx bug (#3433)
2 years ago
Dr-Corgi 73afb63594
[chat]fix save_model(#3377)
2 years ago
kingkingofall 57a3c4db6d
[chat]fix readme (#3429)
2 years ago
Camille Zhong 72cb4dd433
[Chat] fix the tokenizer "int too big to convert" error in SFT training (#3453)
2 years ago