mirror of https://github.com/InternLM/InternLM
branch: fix/fix_submodule_err
commit 3652f9783c (parent 982a3d6813)
@@ -192,7 +192,7 @@ $ srun -p internllm -N 2 -n 16 --ntasks-per-node=8 --gpus-per-task=1 python trai
 If you want to start distributed training on torch with 8 GPUs on a single node, use the following command:

 ```bash
-$ torchrun --nnodes=1 --nproc_per_node=8 train.py --config ./configs/7B_sft.py
+$ torchrun --nnodes=1 --nproc_per_node=8 train.py --config ./configs/7B_sft.py --launcher "torch"
 ```

 ### Training Results
@@ -175,7 +175,7 @@ $ srun -p internllm -N 2 -n 16 --ntasks-per-node=8 --gpus-per-task=1 python trai
 To launch the distributed environment on torch, run the following command on a single node with 8 GPUs:

 ```bash
-$ torchrun --nnodes=1 --nproc_per_node=8 train.py --config ./configs/7B_sft.py
+$ torchrun --nnodes=1 --nproc_per_node=8 train.py --config ./configs/7B_sft.py --launcher "torch"
 ```

 ### Training Results
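The change in both READMEs adds an explicit `--launcher "torch"` argument, so that `train.py` knows which launcher spawned it: torchrun and Slurm's srun expose rank, world size, and local rank under different environment variable names. Below is a minimal sketch of that dispatch, assuming a hypothetical `init_distributed` helper; the argument name and choices match the commands above, but the body is illustrative and is not the repository's actual `train.py`.

```python
import argparse
import os

import torch
import torch.distributed as dist


def init_distributed(launcher: str) -> None:
    """Hypothetical helper: read rank info from whichever launcher spawned us."""
    if launcher == "torch":
        # torchrun exports these for every worker it spawns, along with
        # MASTER_ADDR / MASTER_PORT for the env:// rendezvous.
        rank = int(os.environ["RANK"])
        world_size = int(os.environ["WORLD_SIZE"])
        local_rank = int(os.environ["LOCAL_RANK"])
    elif launcher == "slurm":
        # srun exposes the same information under Slurm's names; the
        # rendezvous address would still need to be derived separately
        # (e.g. from SLURM_NODELIST), which is omitted here for brevity.
        rank = int(os.environ["SLURM_PROCID"])
        world_size = int(os.environ["SLURM_NTASKS"])
        local_rank = int(os.environ["SLURM_LOCALID"])
    else:
        raise ValueError(f"unsupported launcher: {launcher!r}")

    # Bind this worker to its GPU, then join the NCCL process group.
    torch.cuda.set_device(local_rank)
    dist.init_process_group(backend="nccl", rank=rank, world_size=world_size)


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--config", type=str, required=True)
    parser.add_argument("--launcher", type=str, default="slurm",
                        choices=["slurm", "torch"])
    args = parser.parse_args()
    init_distributed(args.launcher)
    print(f"rank {dist.get_rank()} of {dist.get_world_size()} initialized")
```

Launched as in the diff above (`torchrun --nnodes=1 --nproc_per_node=8 train.py --config ./configs/7B_sft.py --launcher "torch"`), each of the 8 workers would take the `torch` branch, read its rank from torchrun's environment, and join the NCCL process group.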