Commit Graph

130 Commits (c1fab951e786d8322cc3cd0d9e73c527361e13ca)

Author  SHA1  Message  Date
Camille Zhong  cd6a962e66  [NFC] polish code style (#4799)  1 year ago
littsk  eef96e0877  polish code for gptq (#4793)  1 year ago
Jianghai  013a4bedf0  [inference]fix import bug and delete down useless init (#4830)  1 year ago
Xu Kai  c3bef20478  add autotune (#4822)  1 year ago
Jianghai  ce7ade3882  [inference] chatglm2 infer demo (#4724)  1 year ago
Xu Kai  946ab56c48  [feature] add gptq for inference (#4754)  1 year ago
Hongxin Liu  079bf3cb26  [misc] update pre-commit and run all files (#4752)  1 year ago
Xuanlei Zhao  32e7f99416  [kernel] update triton init #4740 (#4740)  1 year ago
Yuanheng Zhao  e2c0e7f92a  [hotfix] Fix import error: colossal.kernel without triton installed (#4722)  1 year ago
Cuiqing Li  bce0f16702  [Feature] The first PR to Add TP inference engine, kv-cache manager and related kernels for our inference system (#4577)  1 year ago
Hongxin Liu  554aa9592e  [legacy] move communication and nn to legacy and refactor logger (#4671)  1 year ago
Hongxin Liu  0b00def881  [example] add llama2 example (#4527)  1 year ago
flybird1111  7a3dfd0c64  [shardformer] update shardformer to use flash attention 2 (#4392)  1 year ago
flybird1111  38b792aab2  [coloattention] fix import error (#4380)  1 year ago
flybird1111  25c57b9fb4  [fix] coloattention support flash attention 2 (#4347)  1 year ago
Cuiqing Li  4b977541a8  [Kernels] added triton-implemented of self attention for colossal-ai (#4241)  1 year ago
digger yu  8abc87798f  fix Tensor is not defined (#4129)  1 year ago
Hongxin Liu  ae02d4e4f7  [bf16] add bf16 support (#3882)  1 year ago
digger yu  70c8cdecf4  [nfc] fix typo colossalai/cli fx kernel (#3847)  2 years ago
digger-yu  b9a8dff7e5  [doc] Fix typo under colossalai and doc(#3618)  2 years ago
zbian  7bc0afc901  updated flash attention usage  2 years ago
Frank Lee  95a36eae63  [kernel] added kernel loader to softmax autograd function (#3093)  2 years ago
ver217  823f3b9cf4  [doc] add deepspeed citation and copyright (#2996)  2 years ago
ver217  090f14fd6b  [misc] add reference (#2930)  2 years ago
Frank Lee  918bc94b6b  [triton] added copyright information for flash attention (#2835)  2 years ago
Frank Lee  dd14783f75  [kernel] fixed repeated loading of kernels (#2549)  2 years ago
Frank Lee  8b7495dd54  [example] integrate seq-parallel tutorial with CI (#2463)  2 years ago
jiaruifang  69d9180c4b  [hotfix] issue #2388  2 years ago
Frank Lee  40d376c566  [setup] support pre-build and jit-build of cuda kernels (#2374)  2 years ago
Jiarui Fang  db6eea3583  [builder] reconfig op_builder for pypi install (#2314)  2 years ago
Jiarui Fang  16cc8e6aa7  [builder] MOE builder (#2277)  2 years ago
xcnick  85178a397a  [hotfix] fix error for torch 2.0 (#2243)  2 years ago
Jiarui Fang  db4cbdc7fb  [builder] builder for scaled_upper_triang_masked_softmax (#2234)  2 years ago
Jiarui Fang  54de05da5d  [builder] polish builder with better base class (#2216)  2 years ago
Jiarui Fang  7675792100  [builder] raise Error when CUDA_HOME is not set (#2213)  2 years ago
Jiarui Fang  1cb532ffec  [builder] multihead attn runtime building (#2203)  2 years ago
Jiarui Fang  5682e6d346  [hotfix] correcnt cpu_optim runtime compilation (#2197)  2 years ago
Jiarui Fang  355ffb386e  [builder] unified cpu_optim fused_optim inferface (#2190)  2 years ago
Jiarui Fang  bc0e271e71  [buider] use builder() for cpu adam and fused optim in setup.py (#2187)  2 years ago
Jiarui Fang  d42afd30f8  [builder] runtime adam and fused_optim builder (#2184)  2 years ago
アマデウス  077a66dd81  updated attention kernel (#2133)  2 years ago
HELSON  e7d3afc9cc  [optimizer] add div_scale for optimizers (#2117)  2 years ago
ver217  f8a7148dec  [kernel] move all symlinks of kernel to `colossalai._C` (#1971)  2 years ago
zbian  6877121377  updated flash attention api  2 years ago
アマデウス  4268ae017b  [kernel] added jit warmup (#1792)  2 years ago
xcnick  e0da01ea71  [hotfix] fix build error when torch version >= 1.13 (#1803)  2 years ago
oahzxl  9639ea88fc  [kernel] more flexible flashatt interface (#1804)  2 years ago
oahzxl  501a9e9cd2  [hotfix] polish flash attention (#1802)  2 years ago
Jiarui Fang  c248800359  [kernel] skip tests of flash_attn and triton when they are not available (#1798)  2 years ago
oahzxl  25952b67d7  [feat] add flash attention (#1762)  2 years ago