Colossal-AI Examples
Overview
This folder provides several examples accelerated by Colossal-AI.
Folders such as `images` and `language` cover a wide range of deep learning tasks and applications.
The `community` folder aims to provide a collaborative platform where developers can contribute exotic features built on top of Colossal-AI.
The `tutorial` folder lets everyone quickly try out the different features of Colossal-AI.
You can find applications such as Chatbot, AIGC and Biomedicine in the Applications directory.
Folder Structure
└─ examples
  └─ images
      └─ vit
        └─ test_ci.sh
        └─ train.py
        └─ README.md
      └─ ...
  └─ ...
Invitation to open-source contribution
Following the successful examples of BLOOM and Stable Diffusion, all developers and partners with computing power, datasets, or models are welcome to join and build the Colossal-AI community, working towards the era of big AI models!
You may contact us or participate in the following ways:
- Leave a Star ⭐ to show your support. Thanks!
- Post an issue or submit a PR on GitHub, following the guidelines in Contributing.
- Join the Colossal-AI community on Slack and WeChat(微信) to share your ideas.
- Send your official proposal by email to contact@hpcaitech.com.
Thanks so much to all of our amazing contributors!
Integrate Your Example With Testing
Regular checks are important to ensure that all examples run without apparent bugs and stay compatible with the latest API. Colossal-AI runs workflows that check the examples on every pull request and on a weekly basis. When an example is added or changed, the workflow runs it to verify that it still works.
Therefore, example contributors need to know how to integrate their examples with the testing workflow. To do so, follow the steps below.
- Create a script called `test_ci.sh` in your example folder.
- Configure your testing parameters, such as the number of steps and the batch size, in `test_ci.sh`. Keep these parameters small so that each example takes only a few minutes.
- Export your dataset path with the prefix `/data` and make sure you have a copy of the dataset in the `/data/scratch/examples-data` directory on the CI machine. Community contributors can contact us via Slack to request that the dataset be downloaded to the CI machine.
- Implement the logic, such as dependency setup and example execution.
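The steps above could be sketched as a `test_ci.sh` along the following lines. This is only an illustration, not an official template: the parameter values, the `my_dataset` sub-folder, the `train.py` flag names, the `requirements.txt` file and the `DRY_RUN` switch are all assumptions made for this example, so adapt them to what your example actually accepts.

```shell
#!/bin/sh
# test_ci.sh: illustrative sketch only; every name marked "hypothetical" is an assumption.
set -eu

# Keep the testing parameters small so the example finishes in a few minutes
# (hypothetical values).
NUM_STEPS=5
BATCH_SIZE=2

# Export the dataset path with the /data prefix
# (the "my_dataset" sub-folder is hypothetical).
export DATA=/data/scratch/examples-data/my_dataset

# Command line for a hypothetical train.py; the flag names are assumptions.
TRAIN_CMD="python train.py --num_steps ${NUM_STEPS} --batch_size ${BATCH_SIZE} --data_path ${DATA}"

# Dependency setup and example execution.
# DRY_RUN is a hypothetical switch that defaults to 1 in this sketch, so the
# script only prints its plan; run with DRY_RUN=0 to execute the commands.
if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "would run: pip install -r requirements.txt"
    echo "would run: ${TRAIN_CMD}"
else
    pip install -r requirements.txt
    # Left unquoted on purpose so the string splits into a command and its
    # arguments; this assumes the dataset path contains no spaces.
    ${TRAIN_CMD}
fi
```

In a real script you would likely drop the `DRY_RUN` switch (or default it to 0) so that the CI workflow runs the example directly.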
Community Dependency
We are happy to introduce the following nice community dependency repos that are powered by Colossal-AI: