# Pretraining
1. Pretrain RoBERTa by running the script below. Detailed parameter descriptions can be found in `arguments.py`. `data_path_prefix` is the absolute path to the output of preprocessing. **You have to modify the *hostfile* according to your cluster** (a sketch of the hostfile format follows the command).
```bash
bash run_pretrain.sh
```
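
   The hostfile lists the machines the launcher may use. A minimal sketch, assuming a one-hostname-per-line format (the hostnames here are placeholders and must resolve via `/etc/hosts`):

   ```text
   GPU001
   GPU002
   ```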
* `--hostfile`: server hostnames, as listed in `/etc/hosts`
* `--include`: servers to be used for training
* `--nproc_per_node`: number of processes (GPUs) per server
* `--data_path_prefix`: absolute location of the training data, e.g., `/h5/0.h5`
* `--eval_data_path_prefix`: absolute location of the evaluation data
* `--tokenizer_path`: path to the directory containing the Hugging Face `tokenizer.json`, e.g., `/tokenizer/tokenizer.json`
* `--bert_config`: path to the `config.json` that defines the model
* `--mlm`: model type of the backbone, `bert` or `deberta_v2`
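
For orientation, here is a hedged sketch of the kind of launch command `run_pretrain.sh` wraps, assuming the `colossalai run` launcher. The entry-script name (`run_pretraining.py`), hostnames, and paths are placeholder assumptions; check the actual script before copying anything:

```bash
# Hedged sketch of a launch command like the one in run_pretrain.sh.
# Entry-script name, hostnames, and paths are placeholder assumptions.
colossalai run --nproc_per_node 8 \
    --hostfile ./hostfile \
    --include GPU001,GPU002 \
    run_pretraining.py \
    --data_path_prefix /h5 \
    --eval_data_path_prefix /eval_h5 \
    --tokenizer_path /tokenizer \
    --bert_config /config/config.json \
    --mlm bert
```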
2. To resume training from an earlier checkpoint, run the script below.
```shell
bash run_pretrain_resume.sh
```
* `--resume_train`: whether to resume training
* `--load_pretrain_model`: absolute path containing the model checkpoint
* `--load_optimizer_lr`: absolute path containing the optimizer checkpoint
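
As a hedged sketch, resuming amounts to re-running the same kind of launch command with the resume flags appended; the checkpoint paths below are placeholders, and the exact flag semantics (e.g., whether `--resume_train` takes a value) should be checked in `arguments.py`:

```bash
# Hedged sketch: the same launch as above, with resume flags added.
# Checkpoint paths are placeholders; see arguments.py for flag details.
colossalai run --nproc_per_node 8 \
    --hostfile ./hostfile \
    run_pretraining.py \
    --data_path_prefix /h5 \
    --resume_train \
    --load_pretrain_model /ckpt/epoch_1_model.pth \
    --load_optimizer_lr /ckpt/epoch_1_optim.pth
```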