## Overview
This example shows how to use Colossal-AI to train a Hugging Face GPT model in a distributed manner.
## GPT
We use the GPT2 model from Hugging Face Transformers. The input data is randomly generated.
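As a rough, self-contained illustration of this setup, the sketch below builds a GPT2 model and a random batch of token ids; the tiny config sizes and batch shape are illustrative only and are not the defaults used by `train_gpt_demo.py`.

```python
# Minimal sketch: a GPT2 model from Hugging Face Transformers trained on random token ids.
# The config sizes and batch shape here are illustrative, not the script's defaults.
import torch
from transformers import GPT2Config, GPT2LMHeadModel

config = GPT2Config(n_layer=2, n_head=4, n_embd=128)  # deliberately tiny for illustration
model = GPT2LMHeadModel(config)

# Randomly generated token ids stand in for a real dataset.
batch_size, seq_len = 4, 128
input_ids = torch.randint(0, config.vocab_size, (batch_size, seq_len))
attention_mask = torch.ones_like(input_ids)

# With labels=input_ids the model returns the causal language-modeling loss.
outputs = model(input_ids=input_ids, attention_mask=attention_mask, labels=input_ids)
outputs.loss.backward()
```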
The `train_gpt_demo.py` script provides three distributed plans: ColossalAI, PyTorch DDP, and ZeRO.
The ColossalAI plan leverages Tensor Parallelism and Gemini.
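For orientation, here is a minimal sketch of what the two PyTorch baseline plans look like, assuming the ZeRO plan refers to PyTorch's `ZeroRedundancyOptimizer`; the `PLAN` switch, the tiny model config, and the file name are illustrative and not taken from `train_gpt_demo.py`. The ColossalAI plan instead manages parameters with Gemini through Colossal-AI's own APIs, which are not shown here.

```python
# Illustrative sketch of the PyTorch DDP and ZeRO baseline plans (not the script's code).
# Launch with torchrun, e.g.: torchrun --nproc_per_node=2 plan_sketch.py
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.distributed.optim import ZeroRedundancyOptimizer
from transformers import GPT2Config, GPT2LMHeadModel

dist.init_process_group(backend="nccl" if torch.cuda.is_available() else "gloo")
local_rank = int(os.environ.get("LOCAL_RANK", 0))
device = torch.device("cuda", local_rank) if torch.cuda.is_available() else torch.device("cpu")

model = GPT2LMHeadModel(GPT2Config(n_layer=2, n_head=4, n_embd=128)).to(device)
model = DDP(model, device_ids=[local_rank] if device.type == "cuda" else None)

plan = os.environ.get("PLAN", "torch_ddp")  # hypothetical switch, for this sketch only
if plan == "torch_zero":
    # ZeRO baseline: optimizer states are sharded across ranks.
    optimizer = ZeroRedundancyOptimizer(
        model.parameters(), optimizer_class=torch.optim.AdamW, lr=1e-4
    )
else:
    # Plain DDP baseline: every rank keeps full optimizer states.
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
```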
## Quick Start
You can launch training using the following commands.
```bash
# Install the dependencies for this example.
pip install -r requirements.txt
# Launch distributed training via the provided script.
bash run.sh
```
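`run.sh` wraps the actual launch of `train_gpt_demo.py`; settings such as the number of GPUs and the chosen distributed plan are typically configured inside that script, so edit it there if you want to change them.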