## Introduction

Welcome to the large-scale deep learning optimization techniques of [Colossal-AI](https://github.com/hpcaitech/ColossalAI),
which have been accepted as official tutorials at top conferences such as [SC](https://sc22.supercomputing.org/), [AAAI](https://aaai.org/Conferences/AAAI-23/), [PPoPP](https://ppopp23.sigplan.org/), [CVPR](https://cvpr2023.thecvf.com/), and [ISC](https://www.isc-hpc.com/).

[Colossal-AI](https://github.com/hpcaitech/ColossalAI), a unified deep learning system for the big model era, integrates
many advanced technologies such as multi-dimensional tensor parallelism, sequence parallelism, heterogeneous memory management,
large-scale optimization, and adaptive task scheduling. With Colossal-AI, users can efficiently and
quickly deploy large AI model training and inference, reducing training budgets and the labor cost of learning and deployment.

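As a rough illustration, the sketch below shows how such features were typically switched on through Colossal-AI's config-based launch API of this period. The exact config keys and entry points vary across versions, so treat the names here as assumptions and consult the [documentation](https://www.colossalai.org/) for your release.

```python
# config.py -- a hypothetical configuration; the keys below follow the legacy
# config schema and may differ in newer Colossal-AI releases.
from colossalai.amp import AMP_TYPE

fp16 = dict(mode=AMP_TYPE.TORCH)        # mixed-precision training (wraps torch.cuda.amp)
parallel = dict(
    tensor=dict(size=4, mode='2d'),     # multi-dimensional (2D) tensor parallelism
)

# train.py -- reads the config above when launched with torchrun
# import colossalai
# colossalai.launch_from_torch(config='./config.py')
```
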
### 🚀 Quick Links
[**Colossal-AI**](https://github.com/hpcaitech/ColossalAI) |
[**Paper**](https://arxiv.org/abs/2110.14883) |
[**Documentation**](https://www.colossalai.org/) |
[**Forum**](https://github.com/hpcaitech/ColossalAI/discussions) |
[**Slack**](https://join.slack.com/t/colossalaiworkspace/shared_invite/zt-z7b26eeb-CBp7jouvu~r0~lcFzX832w)

## Table of Contents

Large transformer models display promising performance on a wide spectrum of AI applications.
Both academia and industry are scaling DL training on larger clusters. However, degrading generalization performance, non-negligible communication overhead, and increasing model size prevent DL researchers and engineers from exploring large-scale AI models.

We aim to provide a clear sketch of the optimizations for large-scale deep learning with regard to model accuracy and model efficiency.
One way to maintain or improve model accuracy at large scale while preserving compute efficiency is to design algorithms that
are less communication- and memory-hungry. Notably, they are not mutually exclusive but can
be optimized jointly to further speed up training.

- Memory Efficiency
  - Mixed-Precision Training (see the first sketch after this list)
  - Memory-Efficient Methods, e.g. ZeRO, Gemini, etc. (see the second sketch after this list)
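
To make the first item concrete, here is a minimal sketch of mixed-precision training using plain PyTorch's `torch.cuda.amp`, the general technique that fp16 modes in libraries like Colossal-AI build on; the model and training loop are placeholders.

```python
import torch

model = torch.nn.Linear(512, 512).cuda()      # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
scaler = torch.cuda.amp.GradScaler()          # loss scaling avoids fp16 gradient underflow

for _ in range(10):                           # placeholder training loop
    x = torch.randn(32, 512, device='cuda')
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():           # forward pass runs in mixed precision
        loss = model(x).square().mean()
    scaler.scale(loss).backward()             # backward on the scaled loss
    scaler.step(optimizer)                    # unscale gradients, then step
    scaler.update()                           # adapt the scale factor
```
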
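For the second item, ZeRO's core idea is to partition optimizer states (and, in later stages, gradients and parameters) across data-parallel ranks instead of replicating them. Colossal-AI ships its own ZeRO and Gemini implementations; as a library-agnostic sketch of stage-1 ZeRO, PyTorch's `ZeroRedundancyOptimizer` shards optimizer states the same way. The snippet assumes launch via `torchrun` with a NCCL backend.

```python
import os

import torch
import torch.distributed as dist
from torch.distributed.optim import ZeroRedundancyOptimizer
from torch.nn.parallel import DistributedDataParallel as DDP

dist.init_process_group('nccl')                 # assumes launch via torchrun
torch.cuda.set_device(int(os.environ['LOCAL_RANK']))
model = DDP(torch.nn.Linear(512, 512).cuda())   # placeholder model
# Each rank stores only a 1/world_size shard of the Adam moment buffers.
optimizer = ZeroRedundancyOptimizer(
    model.parameters(),
    optimizer_class=torch.optim.Adam,
    lr=1e-3,
)

x = torch.randn(32, 512, device='cuda')         # placeholder batch
model(x).square().mean().backward()
optimizer.step()                                # each rank updates its local shard
```

Such memory savings compose with the mixed-precision sketch above, which is one instance of the joint optimization mentioned earlier.
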
Some of the above are still under development. **If you wish to make a contribution to this repository, please read the `Contributing` section below.**
## Discussion

If you encounter any problem while running these optimizers, you may want to raise an issue in this repository.

## Contributing
This project welcomes constructive ideas and implementations from the community.
### Update an Optimizer