Commit Graph

11 Commits (d329c294ec3e0139b603a490d344e993aeb6bfb9)

Author SHA1 Message Date
zhang-yi-chi e6a132a449
[chat]: add vf_coef argument for PPOTrainer (#3318) 2023-04-11 09:54:59 +08:00
gongenlei a7ca297281
[coati] Fix LlamaCritic (#3475)
* mv LlamaForCausalLM to LlamaModel

* rm unused imports

---------

Co-authored-by: gongenlei <gongenlei@baidu.com>
2023-04-07 11:39:09 +08:00
YY Lin 62f4e2eb07
[Chat]Add Peft support & fix the ptx bug (#3433)
* Update ppo.py

Fix the bug of fetching wrong batch data

* Add peft model support in SFT and Prompts training

In stage-1 and stage-3, the peft model supports are added. So the trained artifacts will be only a small lora additions instead of the whole bunch of files.

* Delete test_prompts.txt

* Delete test_pretrained.txt

* Move the peft stuffs to a community folder.

* Move the demo sft to community

* delete dirty files

* Add instructions to install peft using source

* Remove Chinese comments

* remove the Chinese comments
2023-04-06 11:54:52 +08:00
Dr-Corgi 73afb63594
[chat]fix save_model(#3377)
The function save_model should be a part of PPOTrainer.
2023-04-06 11:19:14 +08:00
Camille Zhong 72cb4dd433
[Chat] fix the tokenizer "int too big to convert" error in SFT training (#3453)
* Add RoBERTa for RLHF Stage 2 & 3 (test)

RoBERTa for RLHF Stage 2 & 3 (still in testing)

* Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"

This reverts commit 06741d894d.

* Add RoBERTa for RLHF stage 2 & 3

1. add roberta folder under model folder
2. add  roberta option in train_reward_model.py
3. add some test in testci

* Update test_ci.sh

* Revert "Update test_ci.sh"

This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.

* Add RoBERTa for RLHF Stage 2 & 3 (test)

RoBERTa for RLHF Stage 2 & 3 (still in testing)

* Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"

This reverts commit 06741d894d.

* Add RoBERTa for RLHF stage 2 & 3

1. add roberta folder under model folder
2. add  roberta option in train_reward_model.py
3. add some test in testci

* Update test_ci.sh

* Revert "Update test_ci.sh"

This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.

* update roberta with coati

* chat ci update

* Revert "chat ci update"

This reverts commit 17ae7ae01fa752bd3289fc39069868fde99cf846.

* [Chat] fix the tokenizer "int too big to convert" error in SFT training

fix the tokenizer error during SFT training using Bloom and OPT
2023-04-06 09:30:28 +08:00
Yuanchen b92313903f
fix save_model indent error in ppo trainer (#3450)
Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>
2023-04-05 09:45:42 +08:00
Yuanchen 773955abfa
fix save_model inin naive and ddp strategy (#3436)
Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>
2023-04-04 15:30:01 +08:00
ver217 26b7aac0be
[zero] reorganize zero/gemini folder structure (#3424)
* [zero] refactor low-level zero folder structure

* [zero] fix legacy zero import path

* [zero] fix legacy zero import path

* [zero] remove useless import

* [zero] refactor gemini folder structure

* [zero] refactor gemini folder structure

* [zero] refactor legacy zero import path

* [zero] refactor gemini folder structure

* [zero] refactor gemini folder structure

* [zero] refactor gemini folder structure

* [zero] refactor legacy zero import path

* [zero] fix test import path

* [zero] fix test

* [zero] fix circular import

* [zero] update import
2023-04-04 13:48:16 +08:00
Yuanchen b09adff724
[chat]fix sft training for bloom, gpt and opt (#3418)
fix sft training for bloom, gpt and opt
2023-04-04 09:46:23 +08:00
Camille Zhong 30412866e0
[chatgpt] add pre-trained model RoBERTa for RLHF stage 2 & 3 (#3223)
* Add RoBERTa for RLHF Stage 2 & 3 (test)

RoBERTa for RLHF Stage 2 & 3 (still in testing)

* Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"

This reverts commit 06741d894d.

* Add RoBERTa for RLHF stage 2 & 3

1. add roberta folder under model folder
2. add  roberta option in train_reward_model.py
3. add some test in testci

* add test for reward model training

* Update test_ci.sh

* Revert "Update test_ci.sh"

This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.

* Add RoBERTa for RLHF Stage 2 & 3 (test)

RoBERTa for RLHF Stage 2 & 3 (still in testing)

* Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"

This reverts commit 06741d894d.

* Add RoBERTa for RLHF stage 2 & 3

1. add roberta folder under model folder
2. add  roberta option in train_reward_model.py
3. add some test in testci

* Update test_ci.sh

* Revert "Update test_ci.sh"

This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.

* update roberta with coati
2023-04-03 10:11:03 +08:00
Fazzie-Maqianli b0ce5a1032
[Coati] first commit (#3283) 2023-03-28 20:25:36 +08:00