# Colossal-AI
<div id="top" align="center">

   [![logo](https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/Colossal-AI_logo.png)](https://www.colossalai.org/)

   Colossal-AI: A Unified Deep Learning System for the Big Model Era

   <h3> <a href="https://arxiv.org/abs/2110.14883"> Paper </a> | 
   <a href="https://www.colossalai.org/"> Documentation </a> | 
   <a href="https://github.com/hpcaitech/ColossalAI-Examples"> Examples </a> |   
   <a href="https://github.com/hpcaitech/ColossalAI/discussions"> Forum </a> | 
   <a href="https://medium.com/@hpcaitech"> Blog </a></h3>

   [![Build](https://github.com/hpcaitech/ColossalAI/actions/workflows/build.yml/badge.svg)](https://github.com/hpcaitech/ColossalAI/actions/workflows/build.yml)
   [![Documentation](https://readthedocs.org/projects/colossalai/badge/?version=latest)](https://colossalai.readthedocs.io/en/latest/?badge=latest)
   [![CodeFactor](https://www.codefactor.io/repository/github/hpcaitech/colossalai/badge)](https://www.codefactor.io/repository/github/hpcaitech/colossalai)
   [![HuggingFace badge](https://img.shields.io/badge/%F0%9F%A4%97HuggingFace-Join-yellow)](https://huggingface.co/hpcai-tech)
   [![slack badge](https://img.shields.io/badge/Slack-join-blueviolet?logo=slack&amp)](https://join.slack.com/t/colossalaiworkspace/shared_invite/zt-z7b26eeb-CBp7jouvu~r0~lcFzX832w)
   [![WeChat badge](https://img.shields.io/badge/微信-加入-green?logo=wechat&amp)](https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/WeChat.png)
   

   | [English](README.md) | [中文](README-zh-Hans.md) |

</div>

## Table of Contents
<ul>
 <li><a href="#Why-Colossal-AI">Why Colossal-AI</a> </li>
 <li><a href="#Features">Features</a> </li>
 <li>
   <a href="#Parallel-Training-Demo">Parallel Training Demo</a> 
   <ul>
     <li><a href="#ViT">ViT</a></li>
     <li><a href="#GPT-3">GPT-3</a></li>
     <li><a href="#GPT-2">GPT-2</a></li>
     <li><a href="#BERT">BERT</a></li>
     <li><a href="#PaLM">PaLM</a></li>
     <li><a href="#OPT">OPT</a></li>
   </ul>
 </li>
 <li>
   <a href="#Single-GPU-Training-Demo">Single GPU Training Demo</a> 
   <ul>
     <li><a href="#GPT-2-Single">GPT-2</a></li>
     <li><a href="#PaLM-Single">PaLM</a></li>
   </ul>
 </li>
 <li>
   <a href="#Inference-Energon-AI-Demo">Inference (Energon-AI) Demo</a> 
   <ul>
     <li><a href="#GPT-3-Inference">GPT-3</a></li>
   </ul>
 </li>
   <li>
   <a href="#Colossal-AI-in-the-Real-World">Colossal-AI in the Real World</a> 
   <ul>
     <li><a href="#xTrimoMultimer">xTrimoMultimer: Accelerating Protein Monomer and Multimer Structure Prediction</a></li>
   </ul>
 </li>
 <li>
   <a href="#Installation">Installation</a>
   <ul>
     <li><a href="#Download-From-Official-Releases">Download From Official Releases</a></li>
     <li><a href="#Download-From-Source">Download From Source</a></li>
   </ul>
 </li>
 <li><a href="#Use-Docker">Use Docker</a></li>
 <li><a href="#Community">Community</a></li>
 <li><a href="#contributing">Contributing</a></li>
 <li>
   <a href="#Quick-View">Quick View</a>
   <ul>
     <li><a href="#Start-Distributed-Training-in-Lines">Start Distributed Training in Lines</a></li>
     <li><a href="#Start-Heterogeneous-Training-in-Lines">Start Heterogeneous Training in Lines</a></li>
   </ul>
 </li>
 <li><a href="#Cite-Us">Cite Us</a></li>
</ul>

## Why Colossal-AI
<div align="center">
   <a href="https://youtu.be/KnXSfjqkKN0">
   <img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/JamesDemmel_Colossal-AI.png" width="600" />
   </a>

   Prof. James Demmel (UC Berkeley): Colossal-AI makes training AI models efficient, easy, and scalable.
</div>

<p align="right">(<a href="#top">back to top</a>)</p>

## Features

Colossal-AI provides a collection of parallel components. We aim to let you write
distributed deep learning models the same way you write a model on your laptop. We provide user-friendly tools to kickstart
distributed training and inference in a few lines.

- Parallelism strategies
  - Data Parallelism
  - Pipeline Parallelism
  - 1D, [2D](https://arxiv.org/abs/2104.05343), [2.5D](https://arxiv.org/abs/2105.14500), [3D](https://arxiv.org/abs/2105.14450) Tensor Parallelism
  - [Sequence Parallelism](https://arxiv.org/abs/2105.13120)
  - [Zero Redundancy Optimizer (ZeRO)](https://arxiv.org/abs/1910.02054)

- Heterogeneous Memory Management 
  - [PatrickStar](https://arxiv.org/abs/2108.05818)

- Friendly Usage
  - Parallelism based on configuration file

- Inference
  - [Energon-AI](https://github.com/hpcaitech/EnergonAI)

- Colossal-AI in the Real World 
  - [xTrimoMultimer](https://github.com/biomap-research/xTrimoMultimer): Accelerating Protein Monomer and Multimer Structure Prediction
<p align="right">(<a href="#top">back to top</a>)</p>

## Parallel Training Demo
### ViT
<p align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/ViT.png" width="450" />
</p>

- 14x larger batch size, and 5x faster training for Tensor Parallelism = 64

### GPT-3
<p align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/GPT3-v5.png" width=700/>
</p>

- Saves 50% of GPU resources, with 10.7% acceleration

### GPT-2
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/GPT2.png" width=800/>

- 11x lower GPU memory consumption, and superlinear scaling efficiency with Tensor Parallelism

<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/(updated)GPT-2.png" width=800>

- 24x larger model size on the same hardware
- over 3x acceleration
### BERT
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/BERT.png" width=800/>

- 2x faster training, or 50% longer sequence length

### PaLM
- [PaLM-colossalai](https://github.com/hpcaitech/PaLM-colossalai): Scalable implementation of Google's Pathways Language Model ([PaLM](https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html)).

### OPT
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/OPT.png" width=800/>

- [Open Pretrained Transformer (OPT)](https://github.com/facebookresearch/metaseq) is a 175-billion-parameter AI language model released by Meta. Because its pretrained model weights are public, it encourages AI programmers to build various downstream tasks and application deployments.
- 40% speedup when fine-tuning OPT at low cost, in a few lines of code. [[Example]](https://github.com/hpcaitech/ColossalAI-Examples/tree/main/language/opt)

Please visit our [documentation](https://www.colossalai.org/) and [examples](https://github.com/hpcaitech/ColossalAI-Examples) for more details.

<p align="right">(<a href="#top">back to top</a>)</p>

## Single GPU Training Demo

### GPT-2
<p id="GPT-2-Single" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/GPT2-GPU1.png" width=450/>
</p>

- 20x larger model size on the same hardware

<p id="GPT-2-NVME" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/GPT2-NVME.png" width=800/>
</p>

- 120x larger model size on the same hardware (RTX 3080)

### PaLM
<p id="PaLM-Single" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/PaLM-GPU1.png" width=450/>
</p>

- 34x larger model size on the same hardware

<p align="right">(<a href="#top">back to top</a>)</p>


## Inference (Energon-AI) Demo

### GPT-3
<p id="GPT-3-Inference" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/inference_GPT-3.jpg" width=800/>
</p>

- [Energon-AI](https://github.com/hpcaitech/EnergonAI): 50% inference acceleration on the same hardware

<p align="right">(<a href="#top">back to top</a>)</p>

## Colossal-AI in the Real World

### xTrimoMultimer: Accelerating Protein Monomer and Multimer Structure Prediction
<p id="xTrimoMultimer" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/xTM_Prediction.jpg" width=380/>
<p></p>
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/xTrimoMultimer_Table.jpg" width=800/>
</p>

- [xTrimoMultimer](https://github.com/biomap-research/xTrimoMultimer): accelerating structure prediction of protein monomers and multimers by 11x

<p align="right">(<a href="#top">back to top</a>)</p>

## Installation

### Download From Official Releases

You can visit the [Download](https://www.colossalai.org/download) page to download Colossal-AI with pre-built CUDA extensions.


### Download From Source

> The version of Colossal-AI will be in line with the main branch of the repository. Feel free to raise an issue if you encounter any problem. :)

```shell
git clone https://github.com/hpcaitech/ColossalAI.git
cd ColossalAI

# install dependencies
pip install -r requirements/requirements.txt

# install colossalai
pip install .
```

If you don't want to install and enable CUDA kernel fusion (installing it is compulsory when using the fused optimizer), run:

```shell
NO_CUDA_EXT=1 pip install .
```

<p align="right">(<a href="#top">back to top</a>)</p>

## Use Docker

### Pull from DockerHub

You can directly pull the docker image from our [DockerHub page](https://hub.docker.com/r/hpcaitech/colossalai). The image is automatically uploaded upon release.


### Build On Your Own

Run the following command to build a docker image from the provided Dockerfile.

> Building Colossal-AI from scratch requires GPU support, so you need to use the Nvidia Docker Runtime as the default runtime when running `docker build`. More details can be found [here](https://stackoverflow.com/questions/59691207/docker-build-with-nvidia-runtime).
> We recommend you install Colossal-AI from our [project page](https://www.colossalai.org) directly.


```bash
cd ColossalAI
docker build -t colossalai ./docker
```

Run the following command to start the docker container in interactive mode.

```bash
docker run -ti --gpus all --rm --ipc=host colossalai bash
```

<p align="right">(<a href="#top">back to top</a>)</p>

## Community

Join the Colossal-AI community on [Forum](https://github.com/hpcaitech/ColossalAI/discussions),
[Slack](https://join.slack.com/t/colossalaiworkspace/shared_invite/zt-z7b26eeb-CBp7jouvu~r0~lcFzX832w),
and [WeChat](https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/WeChat.png "qrcode") to share your suggestions, feedback, and questions with our engineering team.

## Contributing

If you wish to contribute to this project, please follow the guidelines in [Contributing](./CONTRIBUTING.md).

Thanks so much to all of our amazing contributors!

<a href="https://github.com/hpcaitech/ColossalAI/graphs/contributors"><img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/contributor_avatar.png" width="800px"></a>

*The order of contributor avatars is randomly shuffled.*

<p align="right">(<a href="#top">back to top</a>)</p>

## Quick View

### Start Distributed Training in Lines

```python
parallel = dict(
    pipeline=2,
    tensor=dict(mode='2.5d', depth=1, size=4)
)
```
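
As a rough illustration of how the sizes in such a `parallel` config relate to the number of GPUs, here is a minimal, framework-free sketch. It assumes that the data-parallel size is whatever remains after dividing the world size by `pipeline * tensor size`, and that a 2.5D tensor-parallel group of `size` devices is arranged as `depth` layers of a square `q x q` grid (so `size = depth * q * q`). The helper names `data_parallel_size` and `grid_side_2p5d` are hypothetical and are not part of the Colossal-AI API.

```python
import math

# Same values as the config shown above.
parallel = dict(
    pipeline=2,
    tensor=dict(mode='2.5d', depth=1, size=4)
)


def data_parallel_size(world_size, parallel):
    """Hypothetical helper: number of model replicas for a given GPU count.

    Assumes data parallelism uses whatever is left after pipeline and
    tensor parallelism have claimed their share of the devices.
    """
    pipeline = parallel.get('pipeline', 1)
    tensor_size = parallel.get('tensor', {}).get('size', 1)
    model_parallel = pipeline * tensor_size
    if world_size % model_parallel != 0:
        raise ValueError(
            f"world size {world_size} is not divisible by "
            f"pipeline * tensor size = {model_parallel}"
        )
    return world_size // model_parallel


def grid_side_2p5d(tensor):
    """Hypothetical helper: side length q of the assumed depth x q x q layout."""
    depth, size = tensor['depth'], tensor['size']
    if size % depth != 0:
        raise ValueError("tensor size must be a multiple of depth")
    q = math.isqrt(size // depth)
    if depth * q * q != size:
        raise ValueError("tensor size must equal depth * q * q for an integer q")
    return q
```

Under these assumptions, the config above occupies 2 x 4 = 8 GPUs per model replica, so launching on 16 GPUs would leave a data-parallel size of 2.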

### Start Heterogeneous Training in Lines

```python
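# Import path as used in the 0.1.x releases; adjust it for your installed version.
from colossalai.zero.shard_utils import TensorShardStrategy

zero = dict(
    model_config=dict(
        tensor_placement_policy='auto',
        shard_strategy=TensorShardStrategy(),
        reuse_fp16_shard=True
    ),
    optimizer_config=dict(initial_scale=2**5, gpu_margin_mem_ratio=0.2)
)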
```

<p align="right">(<a href="#top">back to top</a>)</p>

## Cite Us

```
@article{bian2021colossal,
  title={Colossal-AI: A Unified Deep Learning System For Large-Scale Parallel Training},
  author={Bian, Zhengda and Liu, Hongxin and Wang, Boxiang and Huang, Haichen and Li, Yongbin and Wang, Chuanrui and Cui, Fan and You, Yang},
  journal={arXiv preprint arXiv:2110.14883},
  year={2021}
}
```

<p align="right">(<a href="#top">back to top</a>)</p>