# Colossal-AI
<div id="top" align="center">

   [![logo](https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/Colossal-AI_logo.png)](https://www.colossalai.org/)

   Colossal-AI: A Unified Deep Learning System for the Big Model Era

   <h3> <a href="https://arxiv.org/abs/2110.14883"> Paper </a> | 
   <a href="https://www.colossalai.org/"> Documentation </a> | 
   <a href="https://github.com/hpcaitech/ColossalAI-Examples"> Examples </a> |   
   <a href="https://github.com/hpcaitech/ColossalAI/discussions"> Forum </a> | 
   <a href="https://medium.com/@hpcaitech"> Blog </a></h3>

   [![Build](https://github.com/hpcaitech/ColossalAI/actions/workflows/build.yml/badge.svg)](https://github.com/hpcaitech/ColossalAI/actions/workflows/build.yml)
   [![Documentation](https://readthedocs.org/projects/colossalai/badge/?version=latest)](https://colossalai.readthedocs.io/en/latest/?badge=latest)
   [![CodeFactor](https://www.codefactor.io/repository/github/hpcaitech/colossalai/badge)](https://www.codefactor.io/repository/github/hpcaitech/colossalai)
   [![HuggingFace badge](https://img.shields.io/badge/%F0%9F%A4%97HuggingFace-Join-yellow)](https://huggingface.co/hpcai-tech)
   [![slack badge](https://img.shields.io/badge/Slack-join-blueviolet?logo=slack&amp)](https://join.slack.com/t/colossalaiworkspace/shared_invite/zt-z7b26eeb-CBp7jouvu~r0~lcFzX832w)
   [![WeChat badge](https://img.shields.io/badge/微信-加入-green?logo=wechat&amp)](https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/WeChat.png)
   

   | [English](README.md) | [中文](README-zh-Hans.md) |

</div>

## Table of Contents
<ul>
 <li><a href="#Why-Colossal-AI">Why Colossal-AI</a> </li>
 <li><a href="#Features">Features</a> </li>
 <li>
   <a href="#Parallel-Training-Demo">Parallel Training Demo</a> 
   <ul>
     <li><a href="#ViT">ViT</a></li>
     <li><a href="#GPT-3">GPT-3</a></li>
     <li><a href="#GPT-2">GPT-2</a></li>
     <li><a href="#BERT">BERT</a></li>
     <li><a href="#PaLM">PaLM</a></li>
   </ul>
 </li>
 <li>
   <a href="#Single-GPU-Training-Demo">Single GPU Training Demo</a> 
   <ul>
     <li><a href="#GPT-2-Single">GPT-2</a></li>
     <li><a href="#PaLM-Single">PaLM</a></li>
   </ul>
 </li>
 <li>
   <a href="#Inference-Energon-AI-Demo">Inference (Energon-AI) Demo</a> 
   <ul>
     <li><a href="#GPT-3-Inference">GPT-3</a></li>
   </ul>
 </li>
 <li>
   <a href="#Installation">Installation</a>
   <ul>
     <li><a href="#Download-From-Official-Releases">Download From Official Releases</a></li>
     <li><a href="#Download-From-Source">Download From Source</a></li>
   </ul>
 </li>
 <li><a href="#Use-Docker">Use Docker</a></li>
 <li><a href="#Community">Community</a></li>
 <li><a href="#contributing">Contributing</a></li>
 <li>
   <a href="#Quick-View">Quick View</a>
   <ul>
     <li><a href="#Start-Distributed-Training-in-Lines">Start Distributed Training in Lines</a></li>
     <li><a href="#Start-Heterogeneous-Training-in-Lines">Start Heterogeneous Training in Lines</a></li>
   </ul>
 </li>
 <li><a href="#Cite-Us">Cite Us</a></li>
</ul>

## Why Colossal-AI
<div align="center">
   <a href="https://youtu.be/KnXSfjqkKN0">
   <img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/JamesDemmel_Colossal-AI.png" width="600" />
   </a>

   Prof. James Demmel (UC Berkeley): Colossal-AI makes distributed training efficient, easy and scalable.
</div>

<p align="right">(<a href="#top">back to top</a>)</p>

## Features

Colossal-AI provides a collection of parallel components. We aim to let you write
distributed deep learning models the same way you write a model on your laptop, and we provide user-friendly tools to kickstart
distributed training and inference in a few lines.

- Parallelism strategies
  - Data Parallelism
  - Pipeline Parallelism
  - 1D, [2D](https://arxiv.org/abs/2104.05343), [2.5D](https://arxiv.org/abs/2105.14500), [3D](https://arxiv.org/abs/2105.14450) Tensor Parallelism
  - [Sequence Parallelism](https://arxiv.org/abs/2105.13120)
  - [Zero Redundancy Optimizer (ZeRO)](https://arxiv.org/abs/1910.02054)

- Heterogeneous Memory Management
  - [PatrickStar](https://arxiv.org/abs/2108.05818)

- Friendly Usage
  - Parallelism based on configuration file

- Inference
  - [Energon-AI](https://github.com/hpcaitech/EnergonAI)

<p align="right">(<a href="#top">back to top</a>)</p>

## Parallel Training Demo
### ViT
<p align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/ViT.png" width="450" />
</p>

- 14x larger batch size, and 5x faster training for Tensor Parallelism = 64

### GPT-3
<p align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/GPT3.png" width=700/>
</p>

- 50% fewer GPU resources, and 10.7% acceleration

### GPT-2
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/GPT2.png" width=800/>

- 11x lower GPU memory consumption, and superlinear scaling efficiency with Tensor Parallelism

<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/(updated)GPT-2.png" width=800>

- 24x larger model size on the same hardware
- Over 3x acceleration

### BERT
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/BERT.png" width=800/>

- 2x faster training, or 50% longer sequence length

### PaLM
- [PaLM-colossalai](https://github.com/hpcaitech/PaLM-colossalai): Scalable implementation of Google's Pathways Language Model ([PaLM](https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html)).

Please visit our [documentation and tutorials](https://www.colossalai.org/) for more details.

<p align="right">(<a href="#top">back to top</a>)</p>

## Single GPU Training Demo

### GPT-2
<p id="GPT-2-Single" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/GPT2-GPU1.png" width=450/>
</p>

- 20x larger model size on the same hardware

### PaLM
<p id="PaLM-Single" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/PaLM-GPU1.png" width=450/>
</p>

- 34x larger model size on the same hardware

<p align="right">(<a href="#top">back to top</a>)</p>


## Inference (Energon-AI) Demo

### GPT-3
<p id="GPT-3-Inference" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/inference_GPT-3.jpg" width=800/>
</p>

- [Energon-AI](https://github.com/hpcaitech/EnergonAI): 50% inference acceleration on the same hardware

<p align="right">(<a href="#top">back to top</a>)</p>

## Installation

### Download From Official Releases

You can visit the [Download](https://www.colossalai.org/download) page to download Colossal-AI with pre-built CUDA extensions.


### Download From Source

> A source install tracks the main branch of the repository. Feel free to raise an issue if you encounter any problem. :)

```shell
git clone https://github.com/hpcaitech/ColossalAI.git
cd ColossalAI

# install dependency
pip install -r requirements/requirements.txt

# install colossalai
pip install .
```

To skip building the CUDA extensions for kernel fusion, set `NO_CUDA_EXT=1`. Note that these extensions are required if you use a fused optimizer:

```shell
NO_CUDA_EXT=1 pip install .
```
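After either install path, a quick check from Python can confirm that the package is visible to the interpreter. The snippet below is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of Colossal-AI.

```python
import importlib.util
from importlib import metadata


def installed_version(package: str = "colossalai"):
    """Return the installed version string, or None if the package is not importable.

    Hypothetical helper for illustration; it only inspects the current environment.
    """
    if importlib.util.find_spec(package) is None:
        return None
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # Importable, but no distribution metadata is registered under this name
        return "unknown"


version = installed_version()
print("colossalai is not installed" if version is None else f"colossalai {version}")
```

If the check prints that the package is missing, make sure `pip install .` ran in the same environment as the interpreter you are using.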

<p align="right">(<a href="#top">back to top</a>)</p>

## Use Docker

Run the following command to build a Docker image from the provided Dockerfile.

> Building Colossal-AI from scratch requires GPU support, so you need to set the Nvidia Docker Runtime as the default runtime when running `docker build`. More details can be found [here](https://stackoverflow.com/questions/59691207/docker-build-with-nvidia-runtime).
> We recommend installing Colossal-AI directly from our [project page](https://www.colossalai.org).

```bash
cd ColossalAI
docker build -t colossalai ./docker
```

Run the following command to start the docker container in interactive mode.

```bash
docker run -ti --gpus all --rm --ipc=host colossalai bash
```

<p align="right">(<a href="#top">back to top</a>)</p>

## Community

Join the Colossal-AI community on [Forum](https://github.com/hpcaitech/ColossalAI/discussions),
[Slack](https://join.slack.com/t/colossalaiworkspace/shared_invite/zt-z7b26eeb-CBp7jouvu~r0~lcFzX832w),
and [WeChat](https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/WeChat.png "qrcode") to share your suggestions, feedback, and questions with our engineering team.

## Contributing

If you wish to contribute to this project, please follow the guidelines in [Contributing](./CONTRIBUTING.md).

Thanks so much to all of our amazing contributors!

<a href="https://github.com/hpcaitech/ColossalAI/graphs/contributors"><img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/contributor_avatar.png" width="800px"></a>

*The order of contributor avatars is randomly shuffled.*

<p align="right">(<a href="#top">back to top</a>)</p>

## Quick View

### Start Distributed Training in Lines

```python
parallel = dict(
    pipeline=2,
    tensor=dict(mode='2.5d', depth=1, size=4)
)
```
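To see what such a configuration implies for hardware, the sketch below works out the process layout for a given number of GPUs. It assumes that the data-parallel size is whatever remains after dividing the world size by pipeline size times tensor size, and that a 2.5D tensor group of size `depth * q * q` forms a `depth x q x q` grid. The helper `describe_layout` is hypothetical and not part of the Colossal-AI API.

```python
import math


def describe_layout(parallel: dict, world_size: int) -> dict:
    """Derive pipeline/tensor/data parallel sizes from a `parallel` config.

    Hypothetical helper: assumes `pipeline` is an int and `tensor` is a dict.
    """
    pipeline = parallel.get("pipeline", 1)
    tensor = parallel.get("tensor", {})
    tensor_size = tensor.get("size", 1)

    model_parallel = pipeline * tensor_size
    if world_size % model_parallel != 0:
        raise ValueError(
            f"world_size={world_size} is not divisible by pipeline * tensor = {model_parallel}"
        )

    layout = {
        "pipeline": pipeline,
        "tensor": tensor_size,
        "data": world_size // model_parallel,
    }

    if tensor.get("mode") == "2.5d":
        depth = tensor.get("depth", 1)
        q = math.isqrt(tensor_size // depth)
        if depth * q * q != tensor_size:
            raise ValueError("2.5D tensor size must equal depth * q * q")
        layout["grid"] = (depth, q, q)

    return layout


parallel = dict(
    pipeline=2,
    tensor=dict(mode='2.5d', depth=1, size=4)
)

# With 16 GPUs: 2 pipeline stages x 4 tensor-parallel ranks leaves 2 data-parallel replicas
layout = describe_layout(parallel, world_size=16)
print(layout)
```

Under these assumptions, the configuration above needs at least 8 GPUs, and any multiple of 8 adds data-parallel replicas.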

### Start Heterogeneous Training in Lines

```python
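# Import path assumed from the Colossal-AI release this README describes
from colossalai.zero.shard_utils import TensorShardStrategy

zero = dict(
    model_config=dict(
        tensor_placement_policy='auto',
        shard_strategy=TensorShardStrategy(),
        reuse_fp16_shard=True
    ),
    optimizer_config=dict(initial_scale=2**5, gpu_margin_mem_ratio=0.2)
)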
```

<p align="right">(<a href="#top">back to top</a>)</p>

## Cite Us

```
@article{bian2021colossal,
  title={Colossal-AI: A Unified Deep Learning System For Large-Scale Parallel Training},
  author={Bian, Zhengda and Liu, Hongxin and Wang, Boxiang and Huang, Haichen and Li, Yongbin and Wang, Chuanrui and Cui, Fan and You, Yang},
  journal={arXiv preprint arXiv:2110.14883},
  year={2021}
}
```

<p align="right">(<a href="#top">back to top</a>)</p>