mirror of https://github.com/hpcaitech/ColossalAI
[example] add llama pretraining (#4257)
parent 9a4842c571
commit 7ff11b5537
@@ -25,6 +25,7 @@
</div>

## Latest News
* [2023/07] [65B Model Pretraining Accelerated by 38%, Best Practices for Building LLaMA-Like Base Models Open-Source](https://www.hpc-ai.tech/blog/large-model-pretraining)
* [2023/03] [ColossalChat: An Open-Source Solution for Cloning ChatGPT With a Complete RLHF Pipeline](https://medium.com/@yangyou_berkeley/colossalchat-an-open-source-solution-for-cloning-chatgpt-with-a-complete-rlhf-pipeline-5edf08fb538b)
* [2023/03] [Intel and Colossal-AI Partner to Deliver Cost-Efficient Open-Source Solution for Protein Folding Structure Prediction](https://www.hpc-ai.tech/blog/intel-habana)
* [2023/03] [AWS and Google Fund Colossal-AI with Startup Cloud Programs](https://www.hpc-ai.tech/blog/aws-and-google-fund-colossal-ai-with-startup-cloud-programs)
@@ -49,6 +50,7 @@
<li>
<a href="#Parallel-Training-Demo">Parallel Training Demo</a>
<ul>
<li><a href="#LLaMA">LLaMA</a></li>
<li><a href="#GPT-3">GPT-3</a></li>
<li><a href="#GPT-2">GPT-2</a></li>
<li><a href="#BERT">BERT</a></li>
@@ -216,6 +218,15 @@ Acceleration of [AlphaFold Protein Structure](https://alphafold.ebi.ac.uk/)

## Parallel Training Demo

### LLaMA
<p align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/examples/images/LLaMA_pretraining.png" width=600/>
</p>

- 65-billion-parameter large model pretraining accelerated by 38%
[[code]](https://github.com/hpcaitech/ColossalAI/tree/example/llama/examples/language/llama)
[[blog]](https://www.hpc-ai.tech/blog/large-model-pretraining)

### GPT-3
<p align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/GPT3-v5.png" width=700/>
@@ -24,6 +24,7 @@
</div>

## News
* [2023/07] [65B Model Pretraining Accelerated by 38%, Best Practices for Building LLaMA-Like Base Models Open-Source](https://www.hpc-ai.tech/blog/large-model-pretraining)
* [2023/03] [ColossalChat: An Open-Source Solution for Cloning ChatGPT With a Complete RLHF Pipeline](https://medium.com/@yangyou_berkeley/colossalchat-an-open-source-solution-for-cloning-chatgpt-with-a-complete-rlhf-pipeline-5edf08fb538b)
* [2023/03] [Intel and Colossal-AI Partner to Deliver Cost-Efficient Open-Source Solution for Protein Folding Structure Prediction](https://www.hpc-ai.tech/blog/intel-habana)
* [2023/03] [AWS and Google Fund Colossal-AI with Startup Cloud Programs](https://www.hpc-ai.tech/blog/aws-and-google-fund-colossal-ai-with-startup-cloud-programs)
@@ -49,6 +50,7 @@
<li>
<a href="#并行训练样例展示">Parallel Training Demo</a>
<ul>
<li><a href="#LLaMA">LLaMA</a></li>
<li><a href="#GPT-3">GPT-3</a></li>
<li><a href="#GPT-2">GPT-2</a></li>
<li><a href="#BERT">BERT</a></li>
@@ -209,6 +211,14 @@ Colossal-AI provides you with a collection of parallel components. Our goal is to make your
<p align="right">(<a href="#top">back to top</a>)</p>

## Parallel Training Demo
### LLaMA
<p align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/examples/images/LLaMA_pretraining.png" width=600/>
</p>

- 65-billion-parameter large model pretraining accelerated by 38%
[[code]](https://github.com/hpcaitech/ColossalAI/tree/example/llama/examples/language/llama)
[[blog]](https://www.hpc-ai.tech/blog/large-model-pretraining)
### GPT-3
<p align="center">
@@ -0,0 +1,11 @@
# Pretraining LLaMA: best practices for building LLaMA-like base models

<p id="ColossalChat-Speed" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/examples/images/LLaMA_pretraining.png" width=600/>
</p>

- 65-billion-parameter large model pretraining accelerated by 38%
[[code]](https://github.com/hpcaitech/ColossalAI/tree/example/llama/examples/language/llama)
[[blog]](https://www.hpc-ai.tech/blog/large-model-pretraining)

> Because the main branch is under active development, this example is temporarily maintained on an [independent branch](https://github.com/hpcaitech/ColossalAI/tree/example/llama/examples/language/llama) to keep the code stable.