Colossal-AI Examples

Overview

This folder provides several examples accelerated by Colossal-AI. Folders such as images and language cover a wide range of deep learning tasks and applications. The community folder aims to provide a collaborative platform where developers can contribute exotic features built on top of Colossal-AI. The tutorial folder lets everyone quickly try out the different features of Colossal-AI.

You can find applications such as Chatbot, AIGC and Biomedicine in the Applications directory.

Folder Structure

└─ examples
  └─ images
      └─ vit
        └─ test_ci.sh
        └─ train.py
        └─ README.md
      └─ ...
  └─ ...

Invitation to open-source contribution

Following the successful efforts of BLOOM and Stable Diffusion, we welcome all developers and partners with computing power, datasets, or models to join and build the Colossal-AI community and work towards the era of big AI models!

You may contact us or participate in the following ways:

Leave a Star ⭐ to show your support. Thanks!
Post an issue or submit a PR on GitHub, following the guidelines in Contributing.
Join the Colossal-AI community on Slack and WeChat (微信) to share your ideas.
Send your official proposal by email to contact@hpcaitech.com.

Thanks so much to all of our amazing contributors!

Integrate Your Example With Testing

Regular checks are important to ensure that all examples run without apparent bugs and stay compatible with the latest API. Colossal-AI runs workflows that check the examples on two triggers. When a pull request adds or changes an example, the workflow runs that example to verify that it works. In addition, the examples are tested every week.

Therefore, example contributors need to know how to integrate their examples with the testing workflow. You can do so by following the steps below.

Create a script called test_ci.sh in your example folder.
Configure your testing parameters, such as the number of steps and the batch size, in test_ci.sh. Keep these parameters small so that each example takes only a few minutes to run.
Export your dataset path with the prefix /data, and make sure a copy of the dataset exists in the /data/scratch/examples-data directory on the CI machine. Community contributors can contact us via Slack to request that a dataset be downloaded to the CI machine.
Implement the remaining logic, such as dependency setup and example execution.
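
The steps above could be sketched roughly as follows. This is a minimal illustration, not the script used by any existing example: the train.py flags (--data_path, --num_steps, --batch_size), the DATA variable name, the requirements.txt file, and the RUN_EXAMPLE switch are all assumptions made for the sketch, so replace them with whatever your example actually needs.

```shell
#!/usr/bin/env bash
# Sketch of a test_ci.sh. The train.py flags, the DATA variable name and the
# RUN_EXAMPLE switch are hypothetical; adapt them to your own example.
set -eu

# Small testing parameters so the CI run finishes within a few minutes.
NUM_STEPS=5
BATCH_SIZE=2

# Dataset path with the /data prefix; the CI machine keeps copies here.
export DATA=/data/scratch/examples-data

# Store the command as positional parameters so it can be printed or executed.
set -- python train.py --data_path "$DATA" --num_steps "$NUM_STEPS" --batch_size "$BATCH_SIZE"

if [ "${RUN_EXAMPLE:-0}" = "1" ]; then
  # Dependency setup (only if the example ships a requirements file).
  if [ -f requirements.txt ]; then
    pip install -r requirements.txt
  fi
  # Example execution.
  "$@"
else
  # Default: only print the command, so the sketch is safe to try anywhere.
  echo "Would run: $*"
fi
```

Running the sketch without RUN_EXAMPLE set only prints the assembled command; a real test_ci.sh would normally drop that guard and execute the example directly.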

Community Dependency

We are happy to introduce the following community dependency repos that are powered by Colossal-AI: