ColossalAI

Author	SHA1	Message	Date
Yuanchen	9998d5ef64	[chatgpt]add reward model code for deberta (#3199) Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>	2023-03-22 19:09:39 +08:00
BlueRum	7548ca5a54	[chatgpt]Reward Model Training Process update (#3133) * add normalize function to value_head in bloom rm * add normalization to value_function in gpt_rm * add normalization to value_head of opt_rm * add Anthropic/hh-rlhf dataset * Update __init__.py * Add LogExpLoss in RM training * Update __init__.py * update rm trainer to use acc as target * update example/train_rm * Update train_rm.sh * code style * Update README.md * Update README.md * add rm test to ci * fix tokenier * fix typo * change batchsize to avoid oom in ci * Update test_ci.sh	2023-03-20 09:59:06 +08:00
BlueRum	2e16f842a9	[chatgpt]support opt & gpt for rm training (#2876)	2023-02-22 16:58:11 +08:00
BlueRum	613efebc5c	[chatgpt] support colossalai strategy to train rm (#2742) * [chatgpt]fix train_rm bug with lora * [chatgpt]support colossalai strategy to train rm * fix pre-commit * fix pre-commit 2	2023-02-16 11:24:07 +08:00
ver217	1b34701027	[app] add chatgpt application (#2698)	2023-02-14 22:17:25 +08:00