* feat: remove on_learn_epoch fn as not used
* revert: add _on_learn_epoch fn
* feat: remove NaiveStrategy
* test: update train_prompts tests
* fix: remove prepare_llama_tokenizer_and_embedding
* test: add lora arg
* feat: remove roberta support in train_prompts due to runtime errs
* feat: remove deberta & roberta in rm as not used
* test: remove deberta and roberta tests
* feat: remove deberta and roberta models as not used
* fix: remove calls to roberta
* fix: remove prepare_llama_tokenizer_and_embedding
* chore: update transformers version
* docs: update transformers version
* fix: fix actor inference
* fix: fix ci
* feat: change llama pad token to unk
* revert: revert ddp setup_distributed
* fix: change llama pad token to unk
* revert: undo unnecessary changes
* fix: use pip to install transformers
* refactor: adapt boost API in base and naive strategies
* fix: initialize plugin after setup_distributed
* fix: fix save_pretrained fn
* refactor: adapt boost API in DDPStrategy
* to: add _post_init check
* to: fix ddp backward, modify ddp dataloader and unwrap
* feat: adapt boost API in ColossalAIStrategy
* fix: call setup_distributed before using get_current_device
* fix: fix save_model and save_optimizer
* test: remove save_sharded_optimizer test
* style: apply formatter
* fix: fix stage check and add comments
* feat: allow dict type arg in strategy.prepare
* to: temporarily remove lr_scheduler for testing
* style: simplify init of ColossalAIStrategy
* fix: fix lr_scheduler in sft and rm
* style: modify comments
* test: add train_prompts tests
* fix: fix inference-only case and use it in train_prompts
* test: skip failed tests in ci
* style: fix CodeFactor check
* fix: do not use model.to('cpu') with GeminiPlugin
* test: enable colossalai_gemini tests
* test: set CUDA_VISIBLE_DEVICES in ci
* docs: add note
* refactor: separate log_probs fn from Actor forward fn
* refactor: separate generate fn from Actor class
* feat: update unwrap_model and get_base_model
  * unwrap_model returns model not wrapped by Strategy
  * get_base_model returns HF model for Actor, Critic and RewardModel
* feat: simplify Strategy.prepare
* style: remove get_base_model method of Actor
* perf: tokenize text in batches
* refactor: move calc_action_log_probs to utils of model
* test: update test with new forward fn
* style: rename forward fn args
* fix: do not unwrap model in save_model fn of naive strategy
* test: add gemini test for train_prompts
* fix: fix _set_default_generate_kwargs
* fix a bug when the config file contains a category but the answer file doesn't contain that category
* fix Chinese prompt file
* support gpt-3.5-turbo and gpt-4 evaluation
* polish and update README
* resolve pr comments
---------
Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>
* Add RoBERTa for RLHF Stage 2 & 3 (test)
RoBERTa for RLHF Stage 2 & 3 (still in testing)
Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"
This reverts commit 06741d894d.
Add RoBERTa for RLHF stage 2 & 3
1. add roberta folder under model folder
2. add roberta option in train_reward_model.py
3. add some tests in testci
Update test_ci.sh
Revert "Update test_ci.sh"
This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.
update roberta with coati
chat ci update
Revert "chat ci update"
This reverts commit 17ae7ae01fa752bd3289fc39069868fde99cf846.
* Update README.md
* update readme
* Update test_ci.sh
* update readme and add a script
modify readme
Update README.md
* [chat] strategy refactor unwrap model
* [chat] strategy refactor save model
* [chat] add docstr
* [chat] refactor trainer save model
* [chat] fix strategy typing
* [chat] refactor trainer save model
* [chat] update readme
* [chat] fix unit test