5 Commits (4fd4bd9d9a88bde184d347a4b283b117e5025630)

Author SHA1 Message Date
Fazzie-Maqianli 4fd4bd9d9a
[chatgpt] support instruct training (#3216) 2023-03-23 16:46:20 +08:00
pgzhang b429529365
[chatgpt] add supervised learning fine-tune code (#3183)
* [chatgpt] add supervised fine-tune code

* [chatgpt] delete unused code and modified comment code

* [chatgpt] use PyTorch distributed sampler instead

---------

Co-authored-by: zhangpengpeng <zhangpengpeng@joyy.com>
2023-03-22 09:59:42 +08:00
BlueRum 7548ca5a54
[chatgpt] Reward Model Training Process update (#3133)
* add normalize function to value_head in bloom rm

* add normalization to value_function in gpt_rm

* add normalization to value_head of opt_rm

* add Anthropic/hh-rlhf dataset

* Update __init__.py

* Add LogExpLoss in RM training

* Update __init__.py

* update rm trainer to use acc as target

* update example/train_rm

* Update train_rm.sh

* code style

* Update README.md

* Update README.md

* add rm test to ci

* fix tokenizer

* fix typo

* change batch size to avoid OOM in CI

* Update test_ci.sh
2023-03-20 09:59:06 +08:00
BlueRum 3eebc4dff7
[chatgpt] fix rm eval (#2829)
* [chatgpt] fix train_rm bug with lora

* [chatgpt] support colossalai strategy to train rm

* fix pre-commit

* fix pre-commit 2

* [chatgpt] fix rm eval typo

* fix rm eval

* fix pre commit
2023-02-21 11:35:45 +08:00
ver217 1b34701027
[app] add chatgpt application (#2698) 2023-02-14 22:17:25 +08:00