OPT

Meta recently released Open Pre-trained Transformer (OPT), a 175-billion-parameter AI language model that lets AI programmers carry out a range of downstream tasks and application deployments.

The following Colossal-AI example demonstrates low-cost fine-tuning of OPT for causal language modelling.

Our Modifications

We use the pre-trained weights of the OPT model provided on the Hugging Face Hub, with the raw WikiText-2 dataset (no tokens were replaced before tokenization).

We adapt the OPT training code to ColossalAI using the Booster API with a chosen plugin; each plugin corresponds to a specific training strategy. This example supports TorchDDPPlugin, LowLevelZeroPlugin, HybridParallelPlugin and GeminiPlugin.
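As a rough sketch of how a plugin choice might be wired in, the snippet below maps a command-line value to one of the four plugin class names listed above. The `--plugin` flag, its string values, and the helper names are assumptions made for illustration and are not necessarily what `args.py` or `opt_train_demo.py` use. The `colossalai.booster.plugin` import path should be checked against your installed ColossalAI version.

```python
import argparse
import importlib

# Hypothetical CLI values mapped to the plugin class names listed in this README.
PLUGIN_CLASS_NAMES = {
    "torch_ddp": "TorchDDPPlugin",
    "low_level_zero": "LowLevelZeroPlugin",
    "hybrid_parallel": "HybridParallelPlugin",
    "gemini": "GeminiPlugin",
}


def parse_plugin_name(argv=None):
    """Parse a hypothetical --plugin flag and return the chosen key."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--plugin", choices=sorted(PLUGIN_CLASS_NAMES), default="torch_ddp")
    return parser.parse_args(argv).plugin


def load_plugin_class(name):
    """Import the plugin class lazily so this file loads without ColossalAI installed."""
    module = importlib.import_module("colossalai.booster.plugin")
    return getattr(module, PLUGIN_CLASS_NAMES[name])


# Typical use inside a distributed training script (not executed here):
#   plugin = load_plugin_class(parse_plugin_name())(...)  # constructor arguments depend on the plugin
#   booster = Booster(plugin=plugin)                       # from colossalai.booster import Booster
#   model, optimizer, *_ = booster.boost(model, optimizer)
```

Keeping the name-to-class mapping in one dictionary means adding or removing a strategy touches a single line, and the training loop itself stays the same whichever plugin is selected.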

Run Demo

Run the following script:

bash run_demo.sh

This fine-tunes a facebook/opt-350m model on this dataset, which contains more than 8000 comments on Netflix shows.

You can modify the script to try a different set of hyperparameters or to switch to an OPT model of a different size.

The demo code is adapted from this blog and the Hugging Face language modelling examples.

Run Benchmark

To benchmark the OPT model, run the following script:

bash run_benchmark.sh

The script measures performance (throughput and peak memory usage) for each combination of hyperparameters. You can also edit the script to test your own set of hyperparameters.
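A minimal sketch of such a sweep is shown below. It assumes a hypothetical `run_step` callable and made-up hyperparameter names (`batch_size`, `plugin`), and does not reproduce `run_benchmark.sh` or `opt_benchmark.py`. Throughput is computed as samples processed per second of wall-clock time. On a GPU, peak memory could be read with `torch.cuda.max_memory_allocated()` after the timed loop; that is left out here to keep the sketch dependency-free.

```python
import itertools
import time


def sweep(grid):
    """Yield one dict per combination of hyperparameter values in `grid`."""
    keys = list(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


def measure_throughput(run_step, batch_size, num_steps):
    """Return samples per second over `num_steps` calls to `run_step`."""
    start = time.perf_counter()
    for _ in range(num_steps):
        run_step(batch_size)
    elapsed = time.perf_counter() - start
    return batch_size * num_steps / max(elapsed, 1e-9)


def dummy_step(batch_size):
    """Placeholder standing in for one forward/backward/optimizer step."""
    sum(range(batch_size * 1000))


if __name__ == "__main__":
    # Hypothetical grid; replace the values with the settings you want to compare.
    grid = {"batch_size": [8, 16], "plugin": ["torch_ddp", "gemini"]}
    for config in sweep(grid):
        rate = measure_throughput(dummy_step, config["batch_size"], num_steps=5)
        print(config, f"{rate:.1f} samples/s")
```

In a real run, a few warm-up steps before the timed loop would help avoid counting one-off setup costs in the throughput figure.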