[doc] fix typo (#3222)

* [doc] fix typo

* [doc] fix typo
pull/3194/head
binmakeswell 2023-03-24 13:33:35 +08:00 committed by GitHub
parent 045afa3ea2
commit d32ef94ad9
GPG Key ID: 4AEE18F83AFDEB23
2 changed files with 3 additions and 3 deletions

@ -80,7 +80,7 @@
</li>
<li><a href="#Use-Docker">Use Docker</a></li>
<li><a href="#Community">Community</a></li>
-<li><a href="#contributing">Contributing</a></li>
+<li><a href="#Contributing">Contributing</a></li>
<li><a href="#Cite-Us">Cite Us</a></li>
</ul>
@ -375,7 +375,7 @@ Join the Colossal-AI community on [Forum](https://github.com/hpcaitech/ColossalA
[Slack](https://join.slack.com/t/colossalaiworkspace/shared_invite/zt-z7b26eeb-CBp7jouvu~r0~lcFzX832w),
and [WeChat(微信)](https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/WeChat.png "qrcode") to share your suggestions, feedback, and questions with our engineering team.
-## Invitation to open-source contribution
+## Contributing
Referring to the successful attempts of [BLOOM](https://bigscience.huggingface.co/) and [Stable Diffusion](https://en.wikipedia.org/wiki/Stable_Diffusion), any and all developers and partners with computing powers, datasets, models are welcome to join and build the Colossal-AI community, making efforts towards the era of big AI models!
You may contact us or participate in the following ways:

@ -16,7 +16,7 @@ torchrun --standalone --nproc_per_node=2 train_reward_model.py --pretrain "faceb
```
### Features and tricks in RM training
-- We support [Anthropic/hh-rlhf](https://huggingface.co/datasets/Anthropic/hh-rlhf)and[rm-static](https://huggingface.co/datasets/Dahoas/rm-static) datasets.
+- We support [Anthropic/hh-rlhf](https://huggingface.co/datasets/Anthropic/hh-rlhf) and [rm-static](https://huggingface.co/datasets/Dahoas/rm-static) datasets.
- We support 2 kinds of loss_function named 'log_sig'(used by OpenAI) and 'log_exp'(used by Anthropic).
- We change the loss to valid_acc and pair_dist to monitor progress during training.
- We add special token to the end of the sequence to get better result.
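
Reviewer note on the context lines above: the two loss names and the two monitoring metrics can be illustrated with a small sketch. It is written in plain Python and assumes the common pairwise ranking formulations, where `log_sig` is the mean of `-log(sigmoid(r_chosen - r_rejected))` and `log_exp` is the mean of `log(1 + exp(r_rejected - r_chosen))`. It also assumes that `valid_acc` is the fraction of pairs in which the chosen response scores higher and that `pair_dist` is the mean score gap. The function names are hypothetical and are not the repository's API.

```python
import math


def log_sig_loss(chosen, rejected):
    """Mean of -log(sigmoid(c - r)) over (chosen, rejected) reward pairs."""
    losses = [
        -math.log(1.0 / (1.0 + math.exp(-(c - r))))
        for c, r in zip(chosen, rejected)
    ]
    return sum(losses) / len(losses)


def log_exp_loss(chosen, rejected):
    """Mean of log(1 + exp(r - c)) over (chosen, rejected) reward pairs."""
    losses = [math.log(1.0 + math.exp(r - c)) for c, r in zip(chosen, rejected)]
    return sum(losses) / len(losses)


def pair_metrics(chosen, rejected):
    """Return (accuracy, mean gap): assumed meanings of valid_acc and pair_dist."""
    diffs = [c - r for c, r in zip(chosen, rejected)]
    acc = sum(d > 0 for d in diffs) / len(diffs)
    dist = sum(diffs) / len(diffs)
    return acc, dist


if __name__ == "__main__":
    # Made-up reward scores for three preference pairs.
    chosen = [1.0, 0.5, 2.0]
    rejected = [0.0, 1.0, 1.0]
    print(log_sig_loss(chosen, rejected), log_exp_loss(chosen, rejected))
    print(pair_metrics(chosen, rejected))
```

Under these assumptions the two losses are algebraically the same quantity, since `-log(sigmoid(x)) = log(1 + exp(-x))`, so any difference in practice would come from numerical behaviour rather than from the objective.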