ColossalAI

Commit Graph

Author	SHA1	Message	Date
Hongxin Liu	e5ce4c8ea6	[npu] add npu support for gemini and zero (#5067) * [npu] setup device utils (#5047) * [npu] add npu device support * [npu] support low level zero * [test] update npu zero plugin test * [hotfix] fix import * [test] recover tests * [npu] gemini support npu (#5052) * [npu] refactor device utils * [gemini] support npu * [example] llama2+gemini support npu * [kernel] add arm cpu adam kernel (#5065) * [kernel] add arm cpu adam * [optim] update adam optimizer * [kernel] arm cpu adam remove bf16 support	1 year ago
アマデウス	4e4a10c97d	updated c++17 compiler flags (#4983)	1 year ago
Xu Kai	611a5a80ca	[inference] Add smmoothquant for llama (#4904 ) * [inference] add int8 rotary embedding kernel for smoothquant (#4843) * [inference] add smoothquant llama attention (#4850) * add smoothquant llama attention * remove uselss code * remove useless code * fix import error * rename file name * [inference] add silu linear fusion for smoothquant llama mlp (#4853) * add silu linear * update skip condition * catch smoothquant cuda lib exception * prcocess exception for tests * [inference] add llama mlp for smoothquant (#4854) * add llama mlp for smoothquant * fix down out scale * remove duplicate lines * add llama mlp check * delete useless code * [inference] add smoothquant llama (#4861) * add smoothquant llama * fix attention accuracy * fix accuracy * add kv cache and save pretrained * refactor example * delete smooth * refactor code * [inference] add smooth function and delete useless code for smoothquant (#4895) * add smooth function and delete useless code * update datasets * remove duplicate import * delete useless file * refactor codes (#4902) * rafactor code * add license * add torch-int and smoothquant license	1 year ago
Tong Li	bbbcac26e8	fix format (#4815)	1 year ago
Xu Kai	946ab56c48	[feature] add gptq for inference (#4754 ) * [gptq] add gptq kernel (#4416) * add gptq * refactor code * fix tests * replace auto-gptq * rname inferance/quant * refactor test * add auto-gptq as an option * reset requirements * change assert and check auto-gptq * add import warnings * change test flash attn version * remove example * change requirements of flash_attn * modify tests * [skip ci] change requirements-test * [gptq] faster gptq cuda kernel (#4494) * [skip ci] add cuda kernels * add license * [skip ci] fix max_input_len * format files & change test size * [skip ci] * [gptq] add gptq tensor parallel (#4538) * add gptq tensor parallel * add gptq tp * delete print * add test gptq check * add test auto gptq check * [gptq] combine gptq and kv cache manager (#4706) * combine gptq and kv cache manager * add init bits * delete useless code * add model path * delete usless print and update test * delete usless import * move option gptq to shard config * change replace linear to shardformer * update bloom policy * delete useless code * fix import bug and delete uselss code * change colossalai/gptq to colossalai/quant/gptq * update import linear for tests * delete useless code and mv gptq_kernel to kernel directory * fix triton kernel * add triton import	1 year ago
Hongxin Liu	079bf3cb26	[misc] update pre-commit and run all files (#4752) * [misc] update pre-commit * [misc] run pre-commit * [misc] remove useless configuration files * [misc] ignore cuda for clang-format	1 year ago
Mashiro	cfa607080f	[Fix] Fix compile error (#4357)	1 year ago
Yanming W	269150b6f4	[Docker] Fix a couple of build issues (#3691)	2 years ago
digger-yu	b7141c36dd	[CI] fix some spelling errors (#3707) * fix spelling error with examples/comminity/ * fix spelling error with tests/ * fix some spelling error with tests/ colossalai/ etc.	2 years ago
digger-yu	7570d9ae3d	[doc] fix op_builder/README.md (#3597) Optimization Code change "requries" to "requires"	2 years ago
digger-yu	d96567bb5d	[misc] op_builder/builder.py (#3593) Optimization Code The source code has not been modified, only a few spelling errors in the comments have been changed	2 years ago
Hongxin Liu	173dad0562	[misc] add verbose arg for zero and op builder (#3552) * [misc] add print verbose * [gemini] add print verbose * [zero] add print verbose for low level * [misc] add print verbose for op builder	2 years ago
digger-yu	77efdfe1dd	[doc] Update README.md (#3549) Format Optimization ,Add [] outside of DeepSpeed	2 years ago
digger-yu	3f760da9f0	Update README.md (#3548) Delete more ")"	2 years ago
ver217	823f3b9cf4	[doc] add deepspeed citation and copyright (#2996) * [doc] add deepspeed citation and copyright * [doc] add deepspeed citation and copyright * [doc] add deepspeed citation and copyright	2 years ago
Yasyf Mohamedali	19fa0e57f6	Remove extraneous comma (#2993) Prevents `TypeError: category must be a Warning subclass, not 'str'`.	2 years ago
Frank Lee	3a5d93bc2c	[kernel] cached the op kernel and fixed version check (#2886) * [kernel] cached the op kernel and fixed version check * polish code * polish code	2 years ago
Frank Lee	dd14783f75	[kernel] fixed repeated loading of kernels (#2549) * [kernel] fixed repeated loading of kernels * polish code * polish code	2 years ago
Frank Lee	40d376c566	[setup] support pre-build and jit-build of cuda kernels (#2374) * [setup] support pre-build and jit-build of cuda kernels * polish code * polish code * polish code * polish code * polish code * polish code	2 years ago
Frank Lee	8711310cda	[setup] remove torch dependency (#2333)	2 years ago
Jiarui Fang	db6eea3583	[builder] reconfig op_builder for pypi install (#2314)	2 years ago
Frank Lee	9b765e7a69	[setup] removed the build dependency on colossalai (#2307)	2 years ago

22 Commits (d10ee42f68d090db17a8b87cac46ab6d1c2c8ca2)