add example (#3286)
|
@ -18,6 +18,8 @@
|
||||||
- [Stage2 - Training reward model](#stage2---training-reward-model)
|
- [Stage2 - Training reward model](#stage2---training-reward-model)
|
||||||
- [Stage3 - Training model with reinforcement learning by human feedback](#stage3---training-model-with-reinforcement-learning-by-human-feedback)
|
- [Stage3 - Training model with reinforcement learning by human feedback](#stage3---training-model-with-reinforcement-learning-by-human-feedback)
|
||||||
- [Coati7B examples](#coati7b-examples)
|
- [Coati7B examples](#coati7b-examples)
|
||||||
|
- [Generation](#generation)
|
||||||
|
- [Open QA](#open-qa)
|
||||||
- [FAQ](#faq)
|
- [FAQ](#faq)
|
||||||
- [How to save/load checkpoint](#how-to-saveload-checkpoint)
|
- [How to save/load checkpoint](#how-to-saveload-checkpoint)
|
||||||
- [The Plan](#the-plan)
|
- [The Plan](#the-plan)
|
||||||
|
@ -77,6 +79,7 @@ pip install .
|
||||||
### Supervised datasets collection
|
### Supervised datasets collection
|
||||||
|
|
||||||
We collected a 104K bilingual dataset of Chinese and English, and you can find the datasets in this repo
|
We collected a 104K bilingual dataset of Chinese and English, and you can find the datasets in this repo
|
||||||
|
[InstructionWild](https://github.com/XueFuzhao/InstructionWild)
|
||||||
|
|
||||||
Here is how we collected the data
|
Here is how we collected the data
|
||||||
<p align="center">
|
<p align="center">
|
||||||
|
@ -143,6 +146,73 @@ We also support training reward model with true-world data. See `examples/train_
|
||||||
|
|
||||||
## Coati7B examples
|
## Coati7B examples
|
||||||
|
|
||||||
|
### Generation
|
||||||
|
|
||||||
|
<details><summary><b>E-mail</b></summary>
|
||||||
|
|
||||||
|
![phd](assets/Phd.png)
|
||||||
|
</details>
|
||||||
|
|
||||||
|
<details><summary><b>coding</b></summary>
|
||||||
|
|
||||||
|
![sort](assets/quick_sort.png)
|
||||||
|
|
||||||
|
</details>
|
||||||
|
|
||||||
|
<details><summary><b>regex</b></summary>
|
||||||
|
|
||||||
|
![regex](assets/regex.png)
|
||||||
|
|
||||||
|
</details>
|
||||||
|
|
||||||
|
<details><summary><b>Tex</b></summary>
|
||||||
|
|
||||||
|
![tex](assets/tex.png)
|
||||||
|
|
||||||
|
</details>
|
||||||
|
|
||||||
|
<details><summary><b>writing</b></summary>
|
||||||
|
|
||||||
|
![writing](assets/writing.png)
|
||||||
|
|
||||||
|
</details>
|
||||||
|
|
||||||
|
<details><summary><b>Table</b></summary>
|
||||||
|
|
||||||
|
![Table](assets/table.png)
|
||||||
|
|
||||||
|
</details>
|
||||||
|
|
||||||
|
### Open QA
|
||||||
|
<details><summary><b>Game</b></summary>
|
||||||
|
|
||||||
|
![Game](assets/game.png)
|
||||||
|
|
||||||
|
</details>
|
||||||
|
|
||||||
|
<details><summary><b>Travel</b></summary>
|
||||||
|
|
||||||
|
![Travel](assets/travel.png)
|
||||||
|
|
||||||
|
</details>
|
||||||
|
|
||||||
|
<details><summary><b>Physical</b></summary>
|
||||||
|
|
||||||
|
![Physical](assets/Physical.png)
|
||||||
|
|
||||||
|
</details>
|
||||||
|
|
||||||
|
<details><summary><b>Chemical</b></summary>
|
||||||
|
|
||||||
|
![Chemical](assets/chemical.png)
|
||||||
|
|
||||||
|
</details>
|
||||||
|
|
||||||
|
<details><summary><b>Economy</b></summary>
|
||||||
|
|
||||||
|
![Economy](assets/economy.png)
|
||||||
|
|
||||||
|
</details>
|
||||||
|
|
||||||
## FAQ
|
## FAQ
|
||||||
|
|
||||||
|
|