149 Commits (e57812c6727e325971cb0d8769c0789c088f62ae)

Author SHA1 Message Date
flybird11111 148506c828 [coloattention]modify coloattention (#5627) 7 months ago
Hongxin Liu 641b1ee71a [devops] remove post commit ci (#5566) 8 months ago
Hongxin Liu 19e1a5cf16 [shardformer] update colo attention to support custom mask (#5510) 8 months ago
Frank Lee 7cfed5f076 [feat] refactored extension module (#5298) 10 months ago
Hongxin Liu d202cc28c0 [npu] change device to accelerator api (#5239) 11 months ago
Xuanlei Zhao dd2c28a323 [npu] use extension for op builder (#5172) 11 months ago
Xuanlei Zhao d6df19bae7 [npu] support triangle attention for llama (#5130) 12 months ago
Jun Gao dce05da535 fix thrust-transform-reduce error (#5078) 1 year ago
Hongxin Liu e5ce4c8ea6 [npu] add npu support for gemini and zero (#5067) 1 year ago
Cuiqing Li (李崔卿) bce919708f [Kernels]added flash-decoidng of triton (#5063) 1 year ago
Cuiqing Li (李崔卿) 28052a71fb [Kernels]Update triton kernels into 2.1.0 (#5046) 1 year ago
Xuanlei Zhao dc003c304c [moe] merge moe into main (#4978) 1 year ago
Cuiqing Li 459a88c806 [Kernels]Updated Triton kernels into 2.1.0 and adding flash-decoding for llama token attention (#4965) 1 year ago
Jianghai cf579ff46d [Inference] Dynamic Batching Inference, online and offline (#4953) 1 year ago
Xu Kai 785802e809 [inference] add reference and fix some bugs (#4937) 1 year ago
Cuiqing Li 3a41e8304e [Refactor] Integrated some lightllm kernels into token-attention (#4946) 1 year ago
Hongxin Liu 4f68b3f10c [kernel] support pure fp16 for cpu adam and update gemini optim tests (#4921) 1 year ago
Xu Kai 611a5a80ca [inference] Add smmoothquant for llama (#4904) 1 year ago
Xu Kai 77a9328304 [inference] add llama2 support (#4898) 1 year ago
Camille Zhong cd6a962e66 [NFC] polish code style (#4799) 1 year ago
littsk eef96e0877 polish code for gptq (#4793) 1 year ago
Jianghai 013a4bedf0 [inference]fix import bug and delete down useless init (#4830) 1 year ago
Xu Kai c3bef20478 add autotune (#4822) 1 year ago
Jianghai ce7ade3882 [inference] chatglm2 infer demo (#4724) 1 year ago
Xu Kai 946ab56c48 [feature] add gptq for inference (#4754) 1 year ago
Hongxin Liu 079bf3cb26 [misc] update pre-commit and run all files (#4752) 1 year ago
Xuanlei Zhao 32e7f99416 [kernel] update triton init #4740 (#4740) 1 year ago
Yuanheng Zhao e2c0e7f92a [hotfix] Fix import error: colossal.kernel without triton installed (#4722) 1 year ago
Cuiqing Li bce0f16702 [Feature] The first PR to Add TP inference engine, kv-cache manager and related kernels for our inference system (#4577) 1 year ago
Hongxin Liu 554aa9592e [legacy] move communication and nn to legacy and refactor logger (#4671) 1 year ago
Hongxin Liu 0b00def881 [example] add llama2 example (#4527) 1 year ago
flybird1111 7a3dfd0c64 [shardformer] update shardformer to use flash attention 2 (#4392) 1 year ago
flybird1111 38b792aab2 [coloattention] fix import error (#4380) 1 year ago
flybird1111 25c57b9fb4 [fix] coloattention support flash attention 2 (#4347) 1 year ago
Cuiqing Li 4b977541a8 [Kernels] added triton-implemented of self attention for colossal-ai (#4241) 1 year ago
digger yu 8abc87798f fix Tensor is not defined (#4129) 1 year ago
Hongxin Liu ae02d4e4f7 [bf16] add bf16 support (#3882) 1 year ago
digger yu 70c8cdecf4 [nfc] fix typo colossalai/cli fx kernel (#3847) 1 year ago
digger-yu b9a8dff7e5 [doc] Fix typo under colossalai and doc(#3618) 2 years ago
zbian 7bc0afc901 updated flash attention usage 2 years ago
Frank Lee 95a36eae63 [kernel] added kernel loader to softmax autograd function (#3093) 2 years ago
ver217 823f3b9cf4 [doc] add deepspeed citation and copyright (#2996) 2 years ago
ver217 090f14fd6b [misc] add reference (#2930) 2 years ago
Frank Lee 918bc94b6b [triton] added copyright information for flash attention (#2835) 2 years ago
Frank Lee dd14783f75 [kernel] fixed repeated loading of kernels (#2549) 2 years ago
Frank Lee 8b7495dd54 [example] integrate seq-parallel tutorial with CI (#2463) 2 years ago
jiaruifang 69d9180c4b [hotfix] issue #2388 2 years ago
Frank Lee 40d376c566 [setup] support pre-build and jit-build of cuda kernels (#2374) 2 years ago
Jiarui Fang db6eea3583 [builder] reconfig op_builder for pypi install (#2314) 2 years ago
Jiarui Fang 16cc8e6aa7 [builder] MOE builder (#2277) 2 years ago