188 Commits (b8e770c832276d212673fe3d7f41a6ce2ee40858)

Author SHA1 Message Date
Yuanchen 2925f47399
[evaluate] support gpt evaluation with reference (#3972)
Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>
2023-06-13 15:12:29 +08:00
Wenhao Chen 9d02590c9a
[chat] refactor actor class (#3968)
* refactor: separate log_probs fn from Actor forward fn

* refactor: separate generate fn from Actor class

* feat: update unwrap_model and get_base_model
* unwrap_model returns model not wrapped by Strategy
* get_base_model returns HF model for Actor, Critic and RewardModel

* feat: simplify Strategy.prepare

* style: remove get_base_model method of Actor

* perf: tokenize text in batches

* refactor: move calc_action_log_probs to utils of model

* test: update test with new forward fn

* style: rename forward fn args

* fix: do not unwrap model in save_model fn of naive strategy

* test: add gemini test for train_prompts

* fix: fix _set_default_generate_kwargs
2023-06-13 13:31:56 +08:00
Yuanchen 21c4c0b1a0
support UniEval and add CHRF metric (#3924)
Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>
2023-06-08 17:38:47 +08:00
Hongxin Liu b5f0566363
[chat] add distributed PPO trainer (#3740)
* Detached ppo (#9)

* run the base

* working on dist ppo

* sync

* detached trainer

* update detached trainer. no maker update function

* facing init problem

* 1 maker 1 trainer detached run. but no model update

* facing cuda problem

* fix save functions

* verified maker update

* nothing

* add ignore

* analyze loss issue

* remove some debug codes

* facing 2m1t stuck issue

* 2m1t verified

* do not use torchrun

* working on 2m2t

* working on 2m2t

* initialize strategy in ray actor env

* facing actor's init order issue

* facing ddp model update issue (need unwrap ddp)

* unwrap ddp actor

* checking 1m2t stuck problem

* nothing

* set timeout for trainer choosing. It solves the stuck problem!

* delete some debug output

* rename to sync with upstream

* rename to sync with upstream

* coati rename

* nothing

* I am going to detach the replay buffer from the trainer and make it a Ray Actor. Two benefits: 1. support TP trainer. 2. asynchronous buffer operations

* experience_maker_holder performs target-revolving _send_experience() instead of length comparison.

* move code to ray subfolder

* working on pipeline inference

* apply comments

* working on pipeline strategy. in progress.

* remove pipeline code. clean this branch

* update remote parameters by state_dict. no test

* nothing

* state_dict sharding transfer

* merge debug branch

* gemini _unwrap_model fix

* simplify code

* simplify code & fix LoRALinear AttributeError

* critic unwrapped state_dict

---------

Co-authored-by: csric <richcsr256@gmail.com>

* [chat] add performance evaluator and fix bugs (#10)

* [chat] add performance evaluator for ray

* [chat] refactor debug arg

* [chat] support hf config

* [chat] fix generation

* [chat] add 1mmt dummy example

* [chat] fix gemini ckpt

* split experience to send (#11)

Co-authored-by: csric <richcsr256@gmail.com>

* [chat] refactor trainer and maker (#12)

* [chat] refactor experience maker holder

* [chat] refactor model init

* [chat] refactor trainer args

* [chat] refactor model init

* [chat] refactor trainer

* [chat] refactor experience sending logic and training loop args (#13)

* [chat] refactor experience send logic

* [chat] refactor trainer

* [chat] refactor trainer

* [chat] refactor experience maker

* [chat] refactor pbar

* [chat] refactor example folder (#14)

* [chat] support quant (#15)

* [chat] add quant

* [chat] add quant example

* prompt example (#16)

* prompt example

* prompt load csv data

* remove legacy try

---------

Co-authored-by: csric <richcsr256@gmail.com>

* [chat] add mmmt dummy example and refactor experience sending (#17)

* [chat] add mmmt dummy example

* [chat] refactor naive strategy

* [chat] fix stuck problem

* [chat] fix naive strategy

* [chat] optimize experience maker sending logic

* [chat] refactor sending assignment

* [chat] refactor performance evaluator (#18)

* Prompt Example & requires_grad state_dict & sharding state_dict (#19)

* prompt example

* prompt load csv data

* remove legacy try

* maker models require_grad set to False

* working on zero redundancy update

* mmmt_prompt example; naive strategy requires_grad state_dict & sharding; maker model requires_no_grad.

* remove legacy examples

* remove legacy examples

* remove replay buffer tp state. bad design

---------

Co-authored-by: csric <richcsr256@gmail.com>

* state_dict sending adapts to new unwrap function (#20)

* prompt example

* prompt load csv data

* remove legacy try

* maker models require_grad set to False

* working on zero redundancy update

* mmmt_prompt example; naive strategy requires_grad state_dict & sharding; maker model requires_no_grad.

* remove legacy examples

* remove legacy examples

* remove replay buffer tp state. bad design

* opt benchmark

* better script

* nothing

* [chat] strategy refactor unwrap model

* [chat] strategy refactor save model

* [chat] add docstr

* [chat] refactor trainer save model

* [chat] fix strategy typing

* [chat] refactor trainer save model

* [chat] update readme

* [chat] fix unit test

* working on lora reconstruction

* state_dict sending adapts to new unwrap function

* remove comments

---------

Co-authored-by: csric <richcsr256@gmail.com>
Co-authored-by: ver217 <lhx0217@gmail.com>

* [chat-ray] add readme (#21)

* add readme

* transparent graph

* add note background

---------

Co-authored-by: csric <richcsr256@gmail.com>

* [chat] get images from url (#22)

* Refactor/chat ray (#23)

* [chat] lora add todo

* [chat] remove unused pipeline strategy

* [chat] refactor example structure

* [chat] setup ci for ray

* [chat-ray] Support LoRA trainer. LoRA weights reconstruction. (#24)

* lora support prototype

* lora support

* 1mmt lora & remove useless code

---------

Co-authored-by: csric <richcsr256@gmail.com>

* [chat] fix test ci for ray

* [chat] fix test ci requirements for ray

* [chat] fix ray runtime env

* [chat] fix ray runtime env

* [chat] fix example ci docker args

* [chat] add debug info in trainer

* [chat] add nccl debug info

* [chat] skip ray test

* [doc] fix typo

---------

Co-authored-by: csric <59389055+CsRic@users.noreply.github.com>
Co-authored-by: csric <richcsr256@gmail.com>
2023-06-07 10:41:16 +08:00
Yuanchen 57a6d7685c
support evaluation for english (#3880)
Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>
2023-06-05 21:24:21 +08:00
Yuanchen 2506e275b8
[evaluation] improvement on evaluation (#3862)
* fix a bug when the config file contains a category that the answer file doesn't contain

* fix Chinese prompt file

* support gpt-3.5-turbo and gpt-4 evaluation

* polish and update README

* resolve pr comments

---------

Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>
2023-05-30 11:48:41 +08:00
digger yu e2d81eba0d
[nfc] fix typo colossalai/ applications/ (#3831)
* fix typo colossalai/autochunk auto_parallel amp

* fix typo colossalai/auto_parallel nn utils etc.

* fix typo colossalai/auto_parallel autochunk fx/passes  etc.

* fix typo docs/

* change placememt_policy to placement_policy in docs/ and examples/

* fix typo colossalai/ applications/
2023-05-25 16:19:41 +08:00
Yuanchen 34966378e8
[evaluation] add automatic evaluation pipeline (#3821)
* add functions for gpt evaluation

* add automatic eval

Update eval.py

* using jload and modify the type of answers1 and answers2

* Update eval.py

Update eval.py

* Update evaluator.py

* support gpt evaluation

* update readme.md

update README.md

update README.md

modify readme.md

* add Chinese example for config, battle prompt and evaluation prompt file

* remove GPT-4 config

* remove sample folder

---------

Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>
Co-authored-by: Camille Zhong <44392324+Camille7777@users.noreply.github.com>
2023-05-24 11:18:23 +08:00
digger yu 9265f2d4d7
[NFC]fix typo colossalai/auto_parallel nn utils etc. (#3779)
* fix typo colossalai/autochunk auto_parallel amp

* fix typo colossalai/auto_parallel nn utils etc.
2023-05-23 15:28:20 +08:00
github-actions[bot] 62c7e67f9f
[format] applied code formatting on changed files in pull request 3786 (#3787)
Co-authored-by: github-actions <github-actions@github.com>
2023-05-22 14:42:09 +08:00
binmakeswell ad2cf58f50
[chat] add performance and tutorial (#3786) 2023-05-19 18:03:56 +08:00
Yuanchen 05759839bd
[chat] fix bugs in stage 3 training (#3759)
Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>
2023-05-17 17:44:05 +08:00
digger-yu ad6460cf2c
[NFC] fix typo applications/ and colossalai/ (#3735) 2023-05-15 11:46:25 +08:00
digger-yu b7141c36dd
[CI] fix some spelling errors (#3707)
* fix spelling error with examples/comminity/

* fix spelling error with tests/

* fix some spelling error with tests/ colossalai/ etc.
2023-05-10 17:12:03 +08:00
MisterLin1995 f7361ee1bd
[chat] fix community example ray (#3719)
Co-authored-by: jiangwen <zxl265370@antgroup.com>
2023-05-10 13:36:09 +08:00
zhang-yi-chi 2da5d81dec
[chat] fix train_prompts.py gemini strategy bug (#3666)
* fix gemini strategy bug

* add comment

* add comment

* better solution
2023-05-06 16:46:38 +08:00
digger-yu 65bdc3159f
fix some spelling error with applications/Chat/examples/ (#3692)
* fix spelling error with examples/comminity/

* fix spelling error with example/
2023-05-06 11:27:23 +08:00
Tong Li b36e67cb2b
Merge pull request #3680 from digger-yu/digger-yu-patch-2
fix spelling error with applications/Chat/evaluate/
2023-05-05 16:26:04 +08:00
Camille Zhong 0f785cb1f3
[chat] PPO stage3 doc enhancement (#3679)
* Add RoBERTa for RLHF Stage 2 & 3 (test)

RoBERTa for RLHF Stage 2 & 3 (still in testing)

Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"

This reverts commit 06741d894d.

Add RoBERTa for RLHF stage 2 & 3

1. add roberta folder under model folder
2. add  roberta option in train_reward_model.py
3. add some test in testci

Update test_ci.sh

Revert "Update test_ci.sh"

This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.

Add RoBERTa for RLHF Stage 2 & 3 (test)

RoBERTa for RLHF Stage 2 & 3 (still in testing)

Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"

This reverts commit 06741d894d.

Add RoBERTa for RLHF stage 2 & 3

1. add roberta folder under model folder
2. add  roberta option in train_reward_model.py
3. add some test in testci

Update test_ci.sh

Revert "Update test_ci.sh"

This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.

update roberta with coati

chat ci update

Revert "chat ci update"

This reverts commit 17ae7ae01fa752bd3289fc39069868fde99cf846.

* Update README.md

Update README.md

* update readme

* Update test_ci.sh

* update readme and add a script

update readme and add a script

modify readme

Update README.md
2023-05-05 13:36:56 +08:00
digger-yu 6650daeb0a
[doc] fix chat spelling error (#3671)
* Update README.md

change "huggingaface" to "huggingface"

* Update README.md

change "Colossa-AI" to "Colossal-AI"
2023-05-05 11:37:35 +08:00
Hongxin Liu 7bd0bee8ea
[chat] add opt attn kernel (#3655)
* [chat] add opt attn kernel

* [chat] disable xformer during fwd
2023-05-04 16:03:33 +08:00
digger-yu 8ba7858753
Update generate_gpt35_answers.py
fix spelling error with generate_gpt35_answers.py
2023-05-04 15:34:16 +08:00
digger-yu bfbf650588
fix spelling error
fix spelling error with evaluate.py
2023-05-04 15:31:09 +08:00
tanitna 1a60dc07a8
[chat] typo accimulation_steps -> accumulation_steps (#3662) 2023-04-28 15:42:57 +08:00
Tong Li 816add7e7f
Merge pull request #3656 from TongLi3701/chat/update_eval
[Chat]: Remove unnecessary step and update documentation
2023-04-28 14:07:44 +08:00
binmakeswell 268b3cd80d
[chat] set default zero2 strategy (#3667)
* [chat] set default gemini strategy

* [chat] set default zero2 strategy

* [chat] set default zero2 strategy
2023-04-28 13:56:50 +08:00
Tong Li c1a355940e update readme 2023-04-28 11:56:35 +08:00
Tong Li ed3eaa6922 update documentation 2023-04-28 11:49:21 +08:00
Tong Li c419117329 update questions and readme 2023-04-27 19:04:26 +08:00
Tong Li aa77ddae33 remove unnecessary step and update readme 2023-04-27 18:51:58 +08:00
Hongxin Liu 842768a174
[chat] refactor model save/load logic (#3654)
* [chat] strategy refactor unwrap model

* [chat] strategy refactor save model

* [chat] add docstr

* [chat] refactor trainer save model

* [chat] fix strategy typing

* [chat] refactor trainer save model

* [chat] update readme

* [chat] fix unit test
2023-04-27 18:41:49 +08:00
Hongxin Liu 6ef7011462
[chat] remove lm model class (#3653)
* [chat] refactor lora

* [chat] remove lm class

* [chat] refactor save model

* [chat] refactor train sft

* [chat] fix ci

* [chat] fix ci
2023-04-27 15:37:38 +08:00
Camille Zhong 8bccb72c8d
[Doc] enhancement on README.md for chat examples (#3646)
* Add RoBERTa for RLHF Stage 2 & 3 (test)

RoBERTa for RLHF Stage 2 & 3 (still in testing)

Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"

This reverts commit 06741d894d.

Add RoBERTa for RLHF stage 2 & 3

1. add roberta folder under model folder
2. add  roberta option in train_reward_model.py
3. add some test in testci

Update test_ci.sh

Revert "Update test_ci.sh"

This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.

Add RoBERTa for RLHF Stage 2 & 3 (test)

RoBERTa for RLHF Stage 2 & 3 (still in testing)

Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"

This reverts commit 06741d894d.

Add RoBERTa for RLHF stage 2 & 3

1. add roberta folder under model folder
2. add  roberta option in train_reward_model.py
3. add some test in testci

Update test_ci.sh

Revert "Update test_ci.sh"

This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.

update roberta with coati

chat ci update

Revert "chat ci update"

This reverts commit 17ae7ae01fa752bd3289fc39069868fde99cf846.

* Update README.md

Update README.md

* update readme

* Update test_ci.sh
2023-04-27 14:26:19 +08:00
Hongxin Liu 2a951955ad
[chat] refactor trainer (#3648)
* [chat] ppo trainer remove useless args

* [chat] update examples

* [chat] update benchmark

* [chat] update examples

* [chat] fix sft training with wandb

* [chat] polish docstr
2023-04-26 18:11:49 +08:00
Hongxin Liu f8288315d9
[chat] polish performance evaluator (#3647) 2023-04-26 17:34:59 +08:00
Hongxin Liu 50793b35f4
[gemini] accelerate inference (#3641)
* [gemini] support don't scatter after inference

* [chat] update colossalai strategy

* [chat] fix opt benchmark

* [chat] update opt benchmark

* [gemini] optimize inference

* [test] add gemini inference test

* [chat] fix unit test ci

* [chat] fix ci

* [chat] fix ci

* [chat] skip checkpoint test
2023-04-26 16:32:40 +08:00
Tong Li e1b0a78afa
Merge pull request #3621 from zhang-yi-chi/fix/chat-train-prompts-single-gpu
[chat] fix single gpu training bug in examples/train_prompts.py
2023-04-24 22:13:54 +08:00
ddobokki df309fc6ab
[Chat] Remove duplicate functions (#3625) 2023-04-24 12:23:15 +08:00
zhang-yi-chi 739cfe3360 [chat] fix enable single gpu training bug 2023-04-22 14:16:08 +08:00
digger-yu d7bf284706
[chat] polish code note typo (#3612) 2023-04-20 17:22:15 +08:00
Yuanchen c4709d34cf
Chat evaluate (#3608)
Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>
2023-04-20 11:12:24 +08:00
binmakeswell 5a79cffdfd
[coati] fix install cmd (#3592) 2023-04-18 18:19:48 +08:00
Yuanchen 1ec0d386a9
reconstruct chat trainer and fix training script (#3588)
Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>
2023-04-18 16:44:03 +08:00
Camille Zhong 36a519b49f Update test_ci.sh
update

Update test_ci.sh

Update test_ci.sh

Update test_ci.sh

Update test_ci.sh

Update test_ci.sh

Update test_ci.sh

Update run_chatgpt_examples.yml

Update run_chatgpt_examples.yml

Update run_chatgpt_examples.yml

Update run_chatgpt_examples.yml

Update run_chatgpt_examples.yml

Update run_chatgpt_examples.yml

Update test_ci.sh

Update test_ci.sh

update

Update run_chatgpt_examples.yml

Update run_chatgpt_examples.yml

update ci

Update test_ci.sh

Update run_chatgpt_examples.yml

Update run_chatgpt_examples.yml

Update run_chatgpt_examples.yml

Update run_chatgpt_examples.yml

Update run_chatgpt_examples.yml

Update run_chatgpt_examples.yml

Update run_chatgpt_examples.yml

Update test_ci.sh

Update test_ci.sh

Update run_chatgpt_examples.yml

Update test_ci.sh

Update test_ci.sh

Update test_ci.sh

update test ci

RoBERTa for RLHF Stage 2 & 3 (still in testing)

Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"

This reverts commit 06741d894d.

Add RoBERTa for RLHF stage 2 & 3

1. add roberta folder under model folder
2. add  roberta option in train_reward_model.py
3. add some test in testci

Update test_ci.sh

Revert "Update test_ci.sh"

This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.

Add RoBERTa for RLHF Stage 2 & 3 (test)

RoBERTa for RLHF Stage 2 & 3 (still in testing)

Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"

This reverts commit 06741d894d.

Add RoBERTa for RLHF stage 2 & 3

1. add roberta folder under model folder
2. add  roberta option in train_reward_model.py
3. add some test in testci

Update test_ci.sh

Revert "Update test_ci.sh"

This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.

update roberta with coati

chat ci update

Revert "chat ci update"

This reverts commit 17ae7ae01fa752bd3289fc39069868fde99cf846.

[test]chat_update_ci

Update test_ci.sh

Update test_ci.sh

test

Update gpt_critic.py

Update gpt_critic.py

Update run_chatgpt_unit_tests.yml

update test ci

update

update

update

update

Update test_ci.sh

update

Update test_ci.sh

Update test_ci.sh

Update run_chatgpt_examples.yml

Update run_chatgpt_examples.yml
2023-04-18 14:33:12 +08:00
tingfeng cao 7788e0b0a5
fix: fix sft (#3568) 2023-04-17 16:47:44 +08:00
Fazzie-Maqianli 6b1a39b17b
[coati] add custom model support guide (#3579) 2023-04-17 15:40:41 +08:00
binmakeswell cc1eec2f53
[chat] update reward model sh (#3578) 2023-04-17 15:02:55 +08:00
csric e355144375
[chatgpt] Detached PPO Training (#3195)
* run the base

* working on dist ppo

* sync

* detached trainer

* update detached trainer. no maker update function

* facing init problem

* 1 maker 1 trainer detached run. but no model update

* facing cuda problem

* fix save functions

* verified maker update

* nothing

* add ignore

* analyze loss issue

* remove some debug codes

* facing 2m1t stuck issue

* 2m1t verified

* do not use torchrun

* working on 2m2t

* working on 2m2t

* initialize strategy in ray actor env

* facing actor's init order issue

* facing ddp model update issue (need unwrap ddp)

* unwrap ddp actor

* checking 1m2t stuck problem

* nothing

* set timeout for trainer choosing. It solves the stuck problem!

* delete some debug output

* rename to sync with upstream

* rename to sync with upstream

* coati rename

* nothing

* I am going to detach the replay buffer from the trainer and make it a Ray Actor. Two benefits: 1. support TP trainer. 2. asynchronous buffer operations

* experience_maker_holder performs target-revolving _send_experience() instead of length comparison.

* move code to ray subfolder

* working on pipeline inference

* apply comments

---------

Co-authored-by: csric <richcsr256@gmail.com>
2023-04-17 14:46:50 +08:00
MisterLin1995 1a809eddaa
[chat] ChatGPT train prompts on ray example (#3309)
* [feat][chatgpt]train prompts on ray example

* [fix]simplify code

* [fix]remove deprecated parameter

* [fix]add dependencies

* [fix]method calling

* [fix]experience maker

* [fix]missing loss function

* [fix]init optimizer

* [feat]add usage comment

* [fix]rename files

* [fix]add readme

* [fix]file path

* [fix]move directory

---------

Co-authored-by: jiangwen <zxl265370@antgroup.com>
2023-04-13 18:18:36 +08:00
binmakeswell 535b896435
[chat] polish tutorial doc (#3551)
* [chat] clean up duplicate tutorial

* [chat] clean up duplicate tutorial

* [chat] clean up duplicate tutorial

* [chat] clean up duplicate tutorial
2023-04-13 18:11:48 +08:00
Yuanchen 7182ac2a04
[chat]add examples of training with limited resources in chat readme (#3536)
Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>
2023-04-12 15:47:09 +08:00
zhang-yi-chi e6a132a449
[chat]: add vf_coef argument for PPOTrainer (#3318) 2023-04-11 09:54:59 +08:00
ver217 89fd10a1c9
[chat] add zero2 cpu strategy for sft training (#3520) 2023-04-10 19:00:13 +08:00
binmakeswell 990d4c3e4e
[doc] hide diffusion in application path (#3519)
- [ ] Stable Diffusion
- [ ] Dreambooth
It's easy for users to think that we don't support them yet. Add them after migrating them from example to application
https://github.com/hpcaitech/ColossalAI/tree/main/examples/images
2023-04-10 17:52:24 +08:00
binmakeswell 0c0455700f
[doc] add requirement and highlight application (#3516)
* [doc] add requirement and highlight application

* [doc] link example and application
2023-04-10 17:37:16 +08:00
NatalieC323 635d0a1baf
[Chat Community] Update README.md (fixed#3487) (#3506)
* Update README.md

* Update README.md

* Update README.md

* Update README.md

---------

Co-authored-by: Fazzie-Maqianli <55798671+Fazziekey@users.noreply.github.com>
2023-04-10 14:36:39 +08:00
gongenlei a7ca297281
[coati] Fix LlamaCritic (#3475)
* mv LlamaForCausalLM to LlamaModel

* rm unused imports

---------

Co-authored-by: gongenlei <gongenlei@baidu.com>
2023-04-07 11:39:09 +08:00
binmakeswell 891b8e7fac
[chat] fix stage3 PPO sample sh command (#3477) 2023-04-06 18:08:16 +08:00
Fazzie-Maqianli 6afeb1202a
add community example dictionary (#3465) 2023-04-06 15:04:48 +08:00
Frank Lee 80eba05b0a
[test] refactor tests with spawn (#3452)
* [test] added spawn decorator

* polish code

* polish code

* polish code

* polish code

* polish code

* polish code
2023-04-06 14:51:35 +08:00
YY Lin 62f4e2eb07
[Chat]Add Peft support & fix the ptx bug (#3433)
* Update ppo.py

Fix the bug of fetching wrong batch data

* Add peft model support in SFT and Prompts training

Peft model support is added in stage 1 and stage 3, so the trained artifacts are only small LoRA additions rather than the full set of model files.

* Delete test_prompts.txt

* Delete test_pretrained.txt

* Move the peft stuffs to a community folder.

* Move the demo sft to community

* delete dirty files

* Add instructions to install peft using source

* Remove Chinese comments

* remove the Chinese comments
2023-04-06 11:54:52 +08:00
Dr-Corgi 73afb63594
[chat]fix save_model(#3377)
The function save_model should be a part of PPOTrainer.
2023-04-06 11:19:14 +08:00
kingkingofall 57a3c4db6d
[chat]fix readme (#3429)
* fix stage 2

fix stage 2

* add torch
2023-04-06 10:58:53 +08:00
Camille Zhong 72cb4dd433
[Chat] fix the tokenizer "int too big to convert" error in SFT training (#3453)
* Add RoBERTa for RLHF Stage 2 & 3 (test)

RoBERTa for RLHF Stage 2 & 3 (still in testing)

* Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"

This reverts commit 06741d894d.

* Add RoBERTa for RLHF stage 2 & 3

1. add roberta folder under model folder
2. add  roberta option in train_reward_model.py
3. add some test in testci

* Update test_ci.sh

* Revert "Update test_ci.sh"

This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.

* Add RoBERTa for RLHF Stage 2 & 3 (test)

RoBERTa for RLHF Stage 2 & 3 (still in testing)

* Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"

This reverts commit 06741d894d.

* Add RoBERTa for RLHF stage 2 & 3

1. add roberta folder under model folder
2. add  roberta option in train_reward_model.py
3. add some test in testci

* Update test_ci.sh

* Revert "Update test_ci.sh"

This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.

* update roberta with coati

* chat ci update

* Revert "chat ci update"

This reverts commit 17ae7ae01fa752bd3289fc39069868fde99cf846.

* [Chat] fix the tokenizer "int too big to convert" error in SFT training

fix the tokenizer error during SFT training using Bloom and OPT
2023-04-06 09:30:28 +08:00
Yuanchen b92313903f
fix save_model indent error in ppo trainer (#3450)
Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>
2023-04-05 09:45:42 +08:00
Yuanchen 773955abfa
fix save_model in naive and ddp strategy (#3436)
Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>
2023-04-04 15:30:01 +08:00
ver217 26b7aac0be
[zero] reorganize zero/gemini folder structure (#3424)
* [zero] refactor low-level zero folder structure

* [zero] fix legacy zero import path

* [zero] fix legacy zero import path

* [zero] remove useless import

* [zero] refactor gemini folder structure

* [zero] refactor gemini folder structure

* [zero] refactor legacy zero import path

* [zero] refactor gemini folder structure

* [zero] refactor gemini folder structure

* [zero] refactor gemini folder structure

* [zero] refactor legacy zero import path

* [zero] fix test import path

* [zero] fix test

* [zero] fix circular import

* [zero] update import
2023-04-04 13:48:16 +08:00
Yuanchen b09adff724
[chat]fix sft training for bloom, gpt and opt (#3418)
fix sft training for bloom, gpt and opt
2023-04-04 09:46:23 +08:00
Camille Zhong 30412866e0
[chatgpt] add pre-trained model RoBERTa for RLHF stage 2 & 3 (#3223)
* Add RoBERTa for RLHF Stage 2 & 3 (test)

RoBERTa for RLHF Stage 2 & 3 (still in testing)

* Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"

This reverts commit 06741d894d.

* Add RoBERTa for RLHF stage 2 & 3

1. add roberta folder under model folder
2. add  roberta option in train_reward_model.py
3. add some test in testci

* add test for reward model training

* Update test_ci.sh

* Revert "Update test_ci.sh"

This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.

* Add RoBERTa for RLHF Stage 2 & 3 (test)

RoBERTa for RLHF Stage 2 & 3 (still in testing)

* Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"

This reverts commit 06741d894d.

* Add RoBERTa for RLHF stage 2 & 3

1. add roberta folder under model folder
2. add  roberta option in train_reward_model.py
3. add some test in testci

* Update test_ci.sh

* Revert "Update test_ci.sh"

This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.

* update roberta with coati
2023-04-03 10:11:03 +08:00
Andrew 82132f4e3d
[chat] correcting a few obvious typos and grammar errors (#3338) 2023-03-30 14:18:37 +08:00
Fazzie-Maqianli 0fbadce79c
[doc] added authors to the chat application (#3307) 2023-03-29 11:04:30 +08:00
BlueRum b512893637
Polish readme link (#3306) 2023-03-29 10:25:50 +08:00
github-actions[bot] cb413ccf28
[format] applied code formatting on changed files in pull request 3300 (#3302)
Co-authored-by: github-actions <github-actions@github.com>
2023-03-29 09:28:24 +08:00
binmakeswell 31c78f2be3
[doc] add ColossalChat news (#3304)
* [doc] add ColossalChat news

* [doc] add ColossalChat news
2023-03-29 09:27:55 +08:00
Frank Lee e235a24673
[application] updated the README (#3301)
* [application] updated the README

* polish code
2023-03-29 08:47:00 +08:00
BlueRum 8257e1055d
[chat]polish prompts training (#3300)
* polish train_prompts

* polish readme
2023-03-29 08:44:16 +08:00
ver217 62f7156131
[coati] fix inference profanity check (#3299) 2023-03-29 04:26:35 +08:00
github-actions[bot] 5134ad5d1a
[format] applied code formatting on changed files in pull request 3296 (#3298)
Co-authored-by: github-actions <github-actions@github.com>
2023-03-29 02:35:40 +08:00
BlueRum c8b723d6c2
[chat]Update Readme (#3296)
* Update README.md

* Update README.md

* Update README.md

* update example readme
2023-03-29 02:32:17 +08:00
ver217 73b542a124
[coati] inference supports profanity check (#3295) 2023-03-29 02:14:35 +08:00
ver217 ce2cafae76
[coati] add repetition_penalty for inference (#3294) 2023-03-29 01:18:45 +08:00
Fazzie-Maqianli a88ed0f83a
add limit (#3293) 2023-03-29 00:53:23 +08:00
Fazzie-Maqianli c5484281aa
[ColossalChat]add cite for datasets (#3292) 2023-03-29 00:38:36 +08:00
Fazzie-Maqianli ec7af22a43
fix image (#3288) 2023-03-28 23:34:21 +08:00
Fazzie-Maqianli 1f7d9afbf8
add example (#3286) 2023-03-28 23:07:15 +08:00
ver217 4905b21b94
[coati] fix inference output (#3285)
* [coati] fix inference requirements

* [coati] add output postprocess

* [coati] update inference readme

* [coati] fix inference requirements
2023-03-28 21:20:28 +08:00
Fazzie-Maqianli bb6196e71a
remove chatgpt (#3284) 2023-03-28 20:29:09 +08:00
Fazzie-Maqianli b0ce5a1032
[Coati] first commit (#3283) 2023-03-28 20:25:36 +08:00
binmakeswell d32ef94ad9
[doc] fix typo (#3222)
* [doc] fix typo

* [doc] fix typo
2023-03-24 13:33:35 +08:00
ver217 78fd31f9c1
[chatgpt] add precision option for colossalai (#3233) 2023-03-24 12:15:06 +08:00
Fazzie-Maqianli bd39877da4
support instruct training (#3230) 2023-03-24 11:45:01 +08:00
Camille Zhong 9bc702ab48
[doc] update chatgpt doc paper link (#3229)
#issue 3189
2023-03-24 11:21:39 +08:00
Fazzie-Maqianli bbac6760e5
fix torch version (#3225) 2023-03-23 20:56:35 +08:00
Fazzie-Maqianli fa97a9cab4
[chatgpt] unify datasets (#3218) 2023-03-23 17:38:30 +08:00
Fazzie-Maqianli 4fd4bd9d9a
[chatgpt] support instruct training (#3216) 2023-03-23 16:46:20 +08:00
Yuanchen 9998d5ef64
[chatgpt]add reward model code for deberta (#3199)
Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>
2023-03-22 19:09:39 +08:00
Fazzie-Maqianli 1e1b9d2fea
[chatgpt]support llama (#3070) 2023-03-22 15:44:31 +08:00
pgzhang b429529365
[chatgpt] add supervised learning fine-tune code (#3183)
* [chatgpt] add supervised fine-tune code

* [chatgpt] delete unused code and modified comment code

* [chatgpt] use pytorch distributed sampler instead

---------

Co-authored-by: zhangpengpeng <zhangpengpeng@joyy.com>
2023-03-22 09:59:42 +08:00
BlueRum 7548ca5a54
[chatgpt]Reward Model Training Process update (#3133)
* add normalize function to value_head in bloom rm

* add normalization to value_function in gpt_rm

* add normalization to value_head of opt_rm

* add Anthropic/hh-rlhf dataset

* Update __init__.py

* Add LogExpLoss in RM training

* Update __init__.py

* update rm trainer to use acc as target

* update example/train_rm

* Update train_rm.sh

* code style

* Update README.md

* Update README.md

* add rm test to ci

* fix tokenizer

* fix typo

* change batchsize to avoid oom in ci

* Update test_ci.sh
2023-03-20 09:59:06 +08:00
ver217 1e58d31bb7
[chatgpt] fix trainer generate kwargs (#3166) 2023-03-17 17:31:22 +08:00