Commit Graph

3 Commits (ffd3878a1ecaba925de4e64eb41e89bf0dfafd70)

Author SHA1 Message Date
Tong Li ffd3878a1e add simple grpo 2025-02-23 22:54:26 +08:00
Hongxin Liu de282dd694 [feature] fit RL style generation (#6213) 2025-02-21 17:28:19 +08:00
  * [feature] fit rl style generation
  * [doc] add docstr
  * [doc] add docstr
Hongxin Liu 43c9b5fb44 [chat] add distributed impl (#6210) 2025-02-21 15:24:23 +08:00