* Add RoBERTa for RLHF Stage 2 & 3 (test)
RoBERTa for RLHF Stage 2 & 3 (still in testing)
Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"
This reverts commit 06741d894d.
Add RoBERTa for RLHF stage 2 & 3
1. add a roberta folder under the model folder
2. add a roberta option to train_reward_model.py
3. add some tests to the test CI
Update test_ci.sh
Revert "Update test_ci.sh"
This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.
update roberta with coati
chat ci update
Revert "chat ci update"
This reverts commit 17ae7ae01fa752bd3289fc39069868fde99cf846.
* Update README.md
* update readme
* Update test_ci.sh
* update readme and add a script
modify readme
Update README.md
* [chat] strategy refactor unwrap model
* [chat] strategy refactor save model
* [chat] add docstrings
* [chat] refactor trainer save model
* [chat] fix strategy typing
* [chat] refactor trainer save model
* [chat] update readme
* [chat] fix unit test