ColossalAI/colossalai/kernel/triton

Latest commit: 4f28cb43c0 by yuehuayingxueluo, "[inference]Optimize the usage of the mid tensors space in flash attn (#5304)", 10 months ago

Directory contents (file | last commit | age); a sketch of the kernel style in this directory follows the listing.
__init__.py | [inference]Optimize the usage of the mid tensors space in flash attn (#5304) | 10 months ago
context_attn_unpad.py | [inference]Optimize the usage of the mid tensors space in flash attn (#5304) | 10 months ago
custom_autotune.py | add autotune (#4822) | 1 year ago
flash_decoding.py | [inference]Optimize the usage of the mid tensors space in flash attn (#5304) | 10 months ago
fused_rotary_embedding.py | [Inference]Add fused rotary kernel and get cos cache kernel (#5302) | 10 months ago
gptq_triton.py | [inference] add reference and fix some bugs (#4937) | 1 year ago
kvcache_copy.py | [inference] Adapted to Rotary Embedding and RMS Norm (#5283) | 10 months ago
llama_act_combine_kernel.py | [moe] merge moe into main (#4978) | 1 year ago
no_pad_rotary_embedding.py | [Inference]Add fused rotary kernel and get cos cache kernel (#5302) | 10 months ago
qkv_matmul_kernel.py | [misc] update pre-commit and run all files (#4752) | 1 year ago
rms_layernorm.py | [kernel] Add RMSLayerNorm triton kernel (#5262) | 10 months ago
rotary_cache_copy.py | [Inference]Add fused rotary kernel and get cos cache kernel (#5302) | 10 months ago
softmax.py | [misc] update pre-commit and run all files (#4752) | 1 year ago
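For orientation, below is a minimal sketch of the kind of Triton kernel these files contain: a numerically stable row-wise softmax in the style of the standard Triton tutorials. This is an illustrative sketch, not the actual code in softmax.py; the kernel and wrapper names are hypothetical, and it assumes a contiguous, row-major 2D input whose row length fits in one block.

```python
import torch
import triton
import triton.language as tl


@triton.jit
def softmax_kernel(out_ptr, in_ptr, n_cols, BLOCK_SIZE: tl.constexpr):
    # One program instance handles one row of the input matrix.
    row = tl.program_id(0)
    offsets = tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_cols
    # Load the row; pad out-of-bounds lanes with -inf so they do not
    # affect the row max or the exponential sum.
    x = tl.load(in_ptr + row * n_cols + offsets, mask=mask, other=-float("inf"))
    # Numerically stable softmax: subtract the row max before exp.
    x = x - tl.max(x, axis=0)
    num = tl.exp(x)
    denom = tl.sum(num, axis=0)
    tl.store(out_ptr + row * n_cols + offsets, num / denom, mask=mask)


def softmax(x: torch.Tensor) -> torch.Tensor:
    # Hypothetical host-side wrapper: launch one program per row.
    n_rows, n_cols = x.shape
    out = torch.empty_like(x)
    # This simple layout needs a power-of-two block covering the whole row.
    BLOCK_SIZE = triton.next_power_of_2(n_cols)
    softmax_kernel[(n_rows,)](out, x, n_cols, BLOCK_SIZE=BLOCK_SIZE)
    return out
```

The one-program-per-row grid keeps each reduction (max and sum) inside a single block, which is the same launch pattern the heavier kernels here (e.g. RMS layernorm, flash decoding) build on with extra tiling and masking.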