Colossal-AI Tutorial Hands-on

Introduction

Welcome to the Colossal-AI tutorial, which has been accepted as an official tutorial by top conferences such as SC, AAAI, and PPoPP.

Colossal-AI, a unified deep learning system for the big model era, integrates many advanced technologies, including multi-dimensional tensor parallelism, sequence parallelism, heterogeneous memory management, large-scale optimization, and adaptive task scheduling. Colossal-AI helps users deploy large AI model training and inference efficiently and quickly, reducing training budgets and lowering the labor cost of learning and deployment.

🚀 Quick Links

Colossal-AI | Paper | Documentation | Forum | Slack

Table of Contents

Multi-dimensional Parallelism
- Learn the components and overall design of Colossal-AI
- Step-by-step from PyTorch to Colossal-AI
- Try data/pipeline parallelism and 1D/2D/2.5D/3D tensor parallelism using a unified model
Sequence Parallelism
- Try sequence parallelism with BERT
- Combination of data/pipeline/sequence parallelism
- Faster training and longer sequence length
Large Batch Training Optimization
- Comparison of small/large batch size with SGD/LARS optimizer
- Acceleration from a larger batch size
Auto-Parallelism
- Parallelism with normal non-distributed training code
- Model tracing, solution solving, and runtime communication insertion, all in one auto-parallelism system
- Try single program, multiple data (SPMD) parallelism with the auto-parallelism SPMD solver on ResNet50
Fine-tuning and Serving for OPT
- Try OPT model imported from Hugging Face with Colossal-AI
- Fine-tuning OPT with limited hardware using ZeRO, Gemini and parallelism
- Deploy the fine-tuned model to an inference service
Acceleration of Stable Diffusion
- Stable Diffusion with Lightning
- Try Lightning Colossal-AI strategy to optimize memory and accelerate speed
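
To give a feel for the first topic before opening the tutorials, here is a minimal sketch of the idea behind 1D tensor parallelism: a weight matrix is split by columns, each shard is multiplied independently, and the partial outputs are concatenated. It is illustrative only and does not use the Colossal-AI API. Plain Python lists stand in for tensors, and the helper names (`matmul`, `split_columns`, `column_parallel_matmul`) are hypothetical.

```python
# Illustrative only: plain Python lists stand in for tensors, and each
# shard stands in for one device. This is not the Colossal-AI API.

def matmul(x, w):
    """Multiply matrix x (n x k) by matrix w (k x m)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]

def split_columns(w, parts):
    """Split w into `parts` column shards of equal width."""
    width = len(w[0]) // parts
    return [[row[i * width:(i + 1) * width] for row in w] for i in range(parts)]

def column_parallel_matmul(x, w, parts):
    """Each hypothetical device multiplies x by its own column shard;
    the partial outputs are then concatenated along the column axis."""
    partials = [matmul(x, shard) for shard in split_columns(w, parts)]
    return [sum((p[r] for p in partials), []) for r in range(len(x))]

x = [[1, 2], [3, 4]]
w = [[1, 0, 2, 1], [0, 1, 1, 3]]

# The sharded computation matches the unsharded one.
assert column_parallel_matmul(x, w, parts=2) == matmul(x, w)
```

In a real distributed run the shards live on different devices and the concatenation becomes a communication step; the tutorials listed above cover how Colossal-AI handles that.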

Discussion

Discussion about the Colossal-AI project is always welcome! We would love to exchange ideas with the community to help this project grow. If there is anything you would like to discuss, join us on Slack.

If you encounter any problem while running these tutorials, you may want to raise an issue in this repository.