mirror of https://github.com/hpcaitech/ColossalAI
Large Batch Training Optimization
📚 Overview
This example lets you quickly try out the large batch training optimization provided by Colossal-AI. It runs on a synthetic dataset, so you don't need to prepare any data. You can import the Lamb and Lars optimizers from Colossal-AI with the following code.
from colossalai.nn.optimizer import Lamb, Lars
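For intuition about why these optimizers suit large batches, here is a minimal pure-Python sketch of the layer-wise trust ratio idea behind LARS. It is an illustration only, not Colossal-AI's implementation: the function name `lars_trust_ratio` and the default values for `trust_coefficient`, `weight_decay`, and `eps` are assumptions made for this example.

```python
import math


def lars_trust_ratio(weights, grads, trust_coefficient=0.001, weight_decay=0.0, eps=1e-8):
    """Layer-wise scaling factor in the style of LARS (hypothetical helper).

    The factor grows with the weight norm and shrinks with the gradient norm,
    so each layer's update stays proportional to the size of its weights.
    """
    w_norm = math.sqrt(sum(w * w for w in weights))
    g_norm = math.sqrt(sum(g * g for g in grads))
    if w_norm == 0.0 or g_norm == 0.0:
        # Fall back to an unscaled update when a norm is zero.
        return 1.0
    return trust_coefficient * w_norm / (g_norm + weight_decay * w_norm + eps)


# Two "layers" with identical weights but very different gradient magnitudes.
layer_weights = [0.5, -0.5, 0.5, -0.5]  # L2 norm is 1.0
small_grads = [0.01, 0.0, 0.0, 0.0]     # L2 norm is 0.01
large_grads = [10.0, 0.0, 0.0, 0.0]     # L2 norm is 10.0

ratio_small = lars_trust_ratio(layer_weights, small_grads)
ratio_large = lars_trust_ratio(layer_weights, large_grads)
```

The layer with large gradients gets a much smaller scaling factor, so both layers end up with updates of similar magnitude relative to their weights.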
🚀 Quick Start
- Install PyTorch
- Install the dependencies.
pip install -r requirements.txt
- Run the training scripts with synthetic data.
# run on 4 GPUs
# run with lars
colossalai run --nproc_per_node 4 train.py --config config.py --optimizer lars
# run with lamb
colossalai run --nproc_per_node 4 train.py --config config.py --optimizer lamb
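The commands above pass `--config` and `--optimizer` to `train.py`. The script itself is not shown here, so the following is only a sketch of how such flags could be parsed with `argparse`. The helper name `parse_args`, the help strings, and the default of `lars` are assumptions for illustration, not the actual contents of `train.py`.

```python
import argparse


def parse_args(argv=None):
    """Parse the two flags used in the commands above (hypothetical sketch)."""
    parser = argparse.ArgumentParser(description="Large batch training example (sketch)")
    parser.add_argument("--config", type=str, required=True,
                        help="path to the config file, e.g. config.py")
    parser.add_argument("--optimizer", type=str, choices=["lars", "lamb"], default="lars",
                        help="which large batch optimizer to use (default is an assumption)")
    return parser.parse_args(argv)


# Equivalent to the flags passed in the "run with lamb" command.
args = parse_args(["--config", "config.py", "--optimizer", "lamb"])
```

Restricting `--optimizer` with `choices` makes the parser reject any other value with a usage error before training starts.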