
# Pretraining
1. Pretrain RoBERTa by running the script below. Detailed parameter descriptions can be found in `arguments.py`. `data_path_prefix` is the absolute path to the output of preprocessing. **You have to modify the *hostfile* according to your cluster.**

```bash
bash run_pretrain.sh
```
* `--hostfile`: hostfile listing the servers' host names from /etc/hosts
* `--include`: servers that will be used
* `--nproc_per_node`: number of processes (GPUs) on each server
* `--data_path_prefix`: absolute location of the training data, e.g., /h5/0.h5
* `--eval_data_path_prefix`: absolute location of the evaluation data
* `--tokenizer_path`: path to the Hugging Face tokenizer.json, e.g., /tokenizer/tokenizer.json
* `--bert_config`: config.json that describes the model
* `--mlm`: model type of the backbone, `bert` or `deberta_v2`
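
As a rough illustration of how these options fit together, the sketch below writes a small hostfile and assembles a launch command without running it. It assumes the script launches through `colossalai run` and that the hostfile holds one host name per line. The host names `gpu-node-01`/`gpu-node-02`, the entry point `run_pretraining.py`, the process count, and the evaluation and config paths are hypothetical placeholders, not values taken from `run_pretrain.sh`.

```shell
#!/usr/bin/env bash
# Sketch only: host names, paths, and the training script name are placeholders.

# Hypothetical hostfile: one host name per line, each resolvable via /etc/hosts.
HOSTFILE="$(mktemp)"
printf '%s\n' "gpu-node-01" "gpu-node-02" > "$HOSTFILE"

# Assemble the launch command as an array so each argument stays one word.
LAUNCH_CMD=(
  colossalai run
  --hostfile "$HOSTFILE"
  --include "gpu-node-01,gpu-node-02"   # servers from the hostfile to use
  --nproc_per_node 8                    # processes (GPUs) per server
  run_pretraining.py                    # assumed entry-point name
  --data_path_prefix /h5/0.h5
  --eval_data_path_prefix /eval_h5/0.h5
  --tokenizer_path /tokenizer/tokenizer.json
  --bert_config /config/config.json
  --mlm bert
)

# Print the command for inspection instead of executing it.
printf '%q ' "${LAUNCH_CMD[@]}"
echo
```

Printing the command first makes it easy to confirm that every host in `--include` also appears in the hostfile before starting a multi-node job.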

2. To resume training from an earlier checkpoint, run the script below.

```shell
bash run_pretrain_resume.sh
```
* `--resume_train`: whether to resume training
* `--load_pretrain_model`: absolute path that contains the model checkpoint
* `--load_optimizer_lr`: absolute path that contains the optimizer checkpoint
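
Because both checkpoint options expect absolute paths, a small pre-flight check can catch a relative path before a job is launched. The sketch below is one possible way to do this. The helper names `is_abs_path` and `resume_args` and the checkpoint paths are hypothetical, and it assumes `--resume_train` is a bare switch that takes no value.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and checkpoint paths are placeholders.

# Return 0 when the path is absolute (starts with "/"), 1 otherwise.
is_abs_path() {
  case "$1" in
    /*) return 0 ;;
    *)  return 1 ;;
  esac
}

# Print the extra resume arguments, refusing relative checkpoint paths.
resume_args() {
  local model_ckpt="$1" optim_ckpt="$2"
  is_abs_path "$model_ckpt" || { echo "model checkpoint path must be absolute" >&2; return 1; }
  is_abs_path "$optim_ckpt" || { echo "optimizer checkpoint path must be absolute" >&2; return 1; }
  echo "--resume_train --load_pretrain_model $model_ckpt --load_optimizer_lr $optim_ckpt"
}

resume_args /ckpt/model.pt /ckpt/optim_lr.pt
```

The printed arguments can then be appended to the launch command inside a resume script.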