ColossalAI/applications/ColossalChat/examples/training_scripts
YeAnbang 09d5ffca1a add kto 2024-07-18 07:54:11 +00:00
hostfile add SimPO 2024-06-24 02:12:20 +00:00
train_dpo.py fix orpo cross entropy loss 2024-07-15 02:12:05 +00:00
train_dpo.sh add orpo 2024-06-27 07:20:28 +00:00
train_kto.py add kto 2024-07-18 07:54:11 +00:00
train_kto.sh add kto 2024-07-18 07:54:11 +00:00
train_orpo.py fix orpo cross entropy loss 2024-07-15 02:12:05 +00:00
train_orpo.sh add orpo 2024-06-27 07:20:28 +00:00
train_ppo.py replace the customized dataloader setup with the build-in one 2024-06-07 09:43:42 +00:00
train_ppo.sh [ColossalChat] Update RLHF V2 (#5286) 2024-03-29 14:12:29 +08:00
train_rm.py fix orpo cross entropy loss 2024-07-15 02:12:05 +00:00
train_rm.sh add kto 2024-07-18 07:54:11 +00:00
train_sft.py fix orpo cross entropy loss 2024-07-15 02:12:05 +00:00
train_sft.sh add kto 2024-07-18 07:54:11 +00:00