ColossalAI/colossalai/kernel/triton
Yuanheng Zhao 07b5283b6a [kernel] Add triton kernel for context attention (FAv2) without padding (#5192)
* add context attn unpadded triton kernel (see the reference sketch after the file listing)

* test compatibility

* kv cache copy (testing)

* fix kv cache copy

* fix kv cache copy and test (see the cache-copy sketch after the file listing)

* fix boundary of block ptrs

* add support for GQA/MQA and testing

* fix import statement

---------

Co-authored-by: Round Heng <yuanhengzhao@Rounds-MacBook-Pro.local>
2024-01-11 13:39:56 +00:00
File                         Last commit message                                                               Last commit date
__init__.py                  [kernel] Add triton kernel for context attention (FAv2) without padding (#5192)  2024-01-11 13:39:56 +00:00
context_attn_unpad.py        [kernel] Add triton kernel for context attention (FAv2) without padding (#5192)  2024-01-11 13:39:56 +00:00
custom_autotune.py           add autotune (#4822)                                                              2023-09-28 13:47:35 +08:00
fused_layernorm.py           [misc] update pre-commit and run all files (#4752)                                2023-09-19 14:20:26 +08:00
gptq_triton.py               [inference] add reference and fix some bugs (#4937)                               2023-10-20 13:39:34 +08:00
llama_act_combine_kernel.py  [moe] merge moe into main (#4978)                                                 2023-11-02 02:21:24 +00:00
qkv_matmul_kernel.py         [misc] update pre-commit and run all files (#4752)                                2023-09-19 14:20:26 +08:00
softmax.py                   [misc] update pre-commit and run all files (#4752)                                2023-09-19 14:20:26 +08:00
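
Note on the unpadded context attention kernel: context_attn_unpad.py implements the prefill-stage attention in Triton, FlashAttention-v2 style, over sequences packed back-to-back with no padding, with GQA/MQA support (several query heads sharing one K/V head). Below is a minimal pure-PyTorch reference of those semantics, the kind a test might compare the kernel against; the function name, the `cu_seqlens` convention, and the tensor layouts here are illustrative assumptions, not ColossalAI's actual API.

```python
import torch

def context_attention_ref(
    q: torch.Tensor,           # [total_tokens, num_q_heads, head_dim], sequences packed without padding
    k: torch.Tensor,           # [total_tokens, num_kv_heads, head_dim]
    v: torch.Tensor,           # [total_tokens, num_kv_heads, head_dim]
    cu_seqlens: torch.Tensor,  # [batch + 1] cumulative sequence lengths, e.g. [0, 5, 12]
) -> torch.Tensor:
    num_q_heads, num_kv_heads = q.shape[1], k.shape[1]
    assert num_q_heads % num_kv_heads == 0
    group = num_q_heads // num_kv_heads  # GQA: each kv head serves `group` query heads (MQA: num_kv_heads == 1)
    scale = q.shape[2] ** -0.5
    out = torch.empty_like(q)
    for i in range(cu_seqlens.numel() - 1):
        s, e = cu_seqlens[i].item(), cu_seqlens[i + 1].item()
        n = e - s
        qi = q[s:e].transpose(0, 1)                                  # [Hq, n, D]
        ki = k[s:e].repeat_interleave(group, dim=1).transpose(0, 1)  # broadcast kv heads up to [Hq, n, D]
        vi = v[s:e].repeat_interleave(group, dim=1).transpose(0, 1)
        scores = qi @ ki.transpose(-1, -2) * scale                   # [Hq, n, n]
        causal = torch.tril(torch.ones(n, n, dtype=torch.bool, device=q.device))
        scores.masked_fill_(~causal, float("-inf"))                  # causal mask within each sequence
        out[s:e] = (torch.softmax(scores, dim=-1) @ vi).transpose(0, 1)
    return out
```

The Triton kernel computes the same result without ever materializing the [n, n] score matrix, streaming over K/V blocks with an online softmax as in FlashAttention-v2.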
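
Note on the KV cache copy: the commit also copies the prefill K/V tensors into the blocked (paged) KV cache so the decoding stage can reuse them. A minimal sketch of that indexing follows, assuming a cache of shape [num_blocks, num_kv_heads, block_size, head_dim] and a per-sequence block table mapping logical to physical blocks; the layout and all names are assumptions for illustration. The "fix boundary of block ptrs" bullet likely concerns exactly this edge: the last block of a sequence is only partially filled, so block pointers must be masked at the sequence boundary.

```python
import torch

def copy_k_to_blocked_cache_ref(
    k: torch.Tensor,             # [total_tokens, num_kv_heads, head_dim], packed prefill keys
    k_cache: torch.Tensor,       # [num_blocks, num_kv_heads, block_size, head_dim] (assumed layout)
    cu_seqlens: torch.Tensor,    # [batch + 1] cumulative sequence lengths
    block_tables: torch.Tensor,  # [batch, max_blocks_per_seq], physical block id per logical block
) -> None:
    block_size = k_cache.shape[2]
    for i in range(cu_seqlens.numel() - 1):
        s, e = cu_seqlens[i].item(), cu_seqlens[i + 1].item()
        for tok in range(e - s):
            blk = block_tables[i, tok // block_size].item()  # logical block -> physical block
            k_cache[blk, :, tok % block_size] = k[s + tok]   # write one [num_kv_heads, head_dim] slot
```

Values are copied into v_cache identically. The Triton version parallelizes this loop, typically one program instance per token or per block, using the same index arithmetic.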