ColossalAI/applications/ColossalChat/coati/trainer
YeAnbang d888c3787c add benchmark for sft, dpo, simpo, orpo. Add benchmarking result. Support lora with gradient checkpoint 2024-07-10 10:17:08 +00:00
Name | Last commit | Date
callbacks | [ColossalChat] Update RLHF V2 (#5286) | 2024-03-29 14:12:29 +08:00
__init__.py | add orpo | 2024-06-27 07:20:28 +00:00
base.py | [ColossalChat] Update RLHF V2 (#5286) | 2024-03-29 14:12:29 +08:00
dpo.py | add benchmark for sft, dpo, simpo, orpo. Add benchmarking result. Support lora with gradient checkpoint | 2024-07-10 10:17:08 +00:00
orpo.py | add benchmark for sft, dpo, simpo, orpo. Add benchmarking result. Support lora with gradient checkpoint | 2024-07-10 10:17:08 +00:00
ppo.py | [ColossalChat] Update RLHF V2 (#5286) | 2024-03-29 14:12:29 +08:00
rm.py | [ColossalChat] Update RLHF V2 (#5286) | 2024-03-29 14:12:29 +08:00
sft.py | add SimPO | 2024-06-24 02:12:20 +00:00
utils.py | [pre-commit.ci] pre-commit autoupdate (#5572) | 2024-07-01 17:16:41 +08:00