mirror of https://github.com/hpcaitech/ColossalAI
update experimental visualization (#253)
parent 753035edd3
commit 3312d716a0
@@ -34,21 +34,27 @@ Colossal-AI provides you with a collection of parallel training components. Our goal is to let you
## Examples
### ViT

<img src="./docs/images/ViT_TP.png" width="400" />
<img src="./docs/images/update/vit.png" width="450" />

- 14x larger batch size
- 5x faster training

### GPT-3 & GPT-2
### GPT-3
<img src="./docs/images/allinone/GPT3_allin1.png" width=700/>

[image: GPT_2_3]

- GPT-3: free 50% of GPU resources, or 10.7% acceleration

### GPT-2
<img src="./docs/images/allinone/GPT2_allin1.png" width=800/>

- GPT-2: 11x lower GPU memory usage, or superlinear scaling

### BERT
<img src="./docs/images/allinone/BERT_allin1.png" width=800/>

[image: BERT_seq]

- 2x faster training
- 1.5x sequence length

README.md (16 changes)

@@ -37,21 +37,27 @@ distributed training in a few lines.
## Examples
### ViT

<img src="./docs/images/ViT_TP.png" width="400" />
<img src="./docs/images/update/vit.png" width="450" />

- 14x larger batch size
- 5x faster training

### GPT-3 & GPT-2
### GPT-3

[image: GPT_2_3]
<img src="./docs/images/allinone/GPT3_allin1.png" width=700/>

- Free 50% GPU resources, or 10.7% acceleration for GPT-3

### GPT-2
<img src="./docs/images/allinone/GPT2_allin1.png" width=800/>

- 11x lower GPU RAM, or superlinear scaling for GPT-2

### BERT

[image: BERT_seq]

### BERT
<img src="./docs/images/allinone/BERT_allin1.png" width=800/>

- 2x faster training
- 50% longer sequence length

Binary file not shown. After: size 264 KiB
Binary file not shown. After: size 1.1 MiB
Binary file not shown. After: size 603 KiB
Binary file not shown. After: size 1.3 MiB
Binary file not shown. After: size 545 KiB