diff --git a/applications/ChatGPT/README.md b/applications/ChatGPT/README.md
index dce59ad4b..43085f3ab 100644
--- a/applications/ChatGPT/README.md
+++ b/applications/ChatGPT/README.md
@@ -1,6 +1,6 @@
-# RLHF - ColossalAI
+# RLHF - Colossal-AI
-Implementation of RLHF (Reinforcement Learning with Human Feedback) powered by ColossalAI. It supports distributed training and offloading, which can fit extremly large models.
+Implementation of RLHF (Reinforcement Learning with Human Feedback) powered by Colossal-AI. It supports distributed training and offloading, which can fit extremly large models.
 More details can be found in the [blog](https://www.hpc-ai.tech/blog/colossal-ai-chatgpt).
@@ -60,6 +60,27 @@ We also support training reward model with true-world data. See `examples/train_
 - [ ] integrate with Ray
 - [ ] support more RL paradigms, like Implicit Language Q-Learning (ILQL)
+## Quick Preview
+
+
+
+
+
+
+