You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
Wenhao Chen
7b9b86441f
[chat]: update rm, add wandb and fix bugs (#4471)
* feat: modify forward fn of critic and reward model
* feat: modify calc_action_log_probs
* to: add wandb in sft and rm trainer
* feat: update train_sft
* feat: update train_rm
* style: modify type annotation and add warning
* feat: pass tokenizer to ppo trainer
* to: modify trainer base and maker base
* feat: add wandb in ppo trainer
* feat: pass tokenizer to generate
* test: update generate fn tests
* test: update train tests
* fix: remove action_mask
* feat: remove unused code
* fix: fix wrong ignore_index
* fix: fix mock tokenizer
* chore: update requirements
* revert: modify make_experience
* fix: fix inference
* fix: add padding side
* style: modify _on_learn_batch_end
* test: use mock tokenizer
* fix: use bf16 to avoid overflow
* fix: fix workflow
* [chat] fix gemini strategy
* [chat] fix
* sync: update colossalai strategy
* fix: fix args and model dtype
* fix: fix checkpoint test
* fix: fix requirements
* fix: fix missing import and wrong arg
* fix: temporarily skip gemini test in stage 3
* style: apply pre-commit
* fix: temporarily skip gemini test in stage 1&2
---------
Co-authored-by: Mingyan Jiang <1829166702@qq.com>
|
1 year ago |
.. |
__init__.py
|
[misc] update pre-commit and run all files (#4752)
|
1 year ago |
base.py
|
[chat]: update rm, add wandb and fix bugs (#4471)
|
1 year ago |
naive.py
|
[chat]: update rm, add wandb and fix bugs (#4471)
|
1 year ago |