ColossalAI/colossalai/kernel/triton

Latest commit: 6fb4bcbb24 by yuehuayingxueluo, "[Inference/opt] Fused KVCahce Memcopy (#5374)", 10 months ago
File                        | Last commit                                                                               | Age
__init__.py                 | [inference]Optimize the usage of the mid tensors space in flash attn (#5304)             | 10 months ago
context_attn_unpad.py       | [Infer] Optimize Blocked KVCache And Kernels Using It (#5325)                             | 10 months ago
custom_autotune.py          | add autotune (#4822)                                                                      | 1 year ago
flash_decoding.py           | [Inference]Fused the gate and up proj in mlp,and optimized the autograd process. (#5365)  | 10 months ago
fused_rotary_embedding.py   | [Inference]Fused the gate and up proj in mlp,and optimized the autograd process. (#5365)  | 10 months ago
gptq_triton.py              | [inference] add reference and fix some bugs (#4937)                                       | 1 year ago
kvcache_copy.py             | [Inference/opt] Fused KVCahce Memcopy (#5374)                                             | 10 months ago
llama_act_combine_kernel.py | [moe] merge moe into main (#4978)                                                         | 1 year ago
no_pad_rotary_embedding.py  | Revert "[Inference] Adapt to Fused rotary (#5348)" (#5373)                                | 10 months ago
qkv_matmul_kernel.py        |                                                                                           |
rms_layernorm.py            | [Inference]Fused the gate and up proj in mlp,and optimized the autograd process. (#5365)  | 10 months ago
rotary_cache_copy.py        | [Inference]Fused the gate and up proj in mlp,and optimized the autograd process. (#5365)  | 10 months ago
softmax.py                  |                                                                                           |
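Each file in this directory implements an inference kernel in OpenAI Triton. As a rough illustration of the kind of kernel these files contain (this is a minimal sketch, not the actual contents of softmax.py; the function names and launch parameters are assumptions), here is a numerically stable row-wise softmax where each Triton program instance processes one row:

    # Minimal illustrative Triton kernel; NOT the repo's softmax.py.
    import torch
    import triton
    import triton.language as tl


    @triton.jit
    def softmax_kernel(out_ptr, in_ptr, n_cols, BLOCK_SIZE: tl.constexpr):
        # One program instance handles one row of a contiguous 2D tensor.
        row = tl.program_id(0)
        offsets = tl.arange(0, BLOCK_SIZE)
        mask = offsets < n_cols
        # Pad out-of-bounds lanes with -inf so they do not affect
        # the row max or the normalizing sum.
        x = tl.load(in_ptr + row * n_cols + offsets, mask=mask, other=-float("inf"))
        # Subtract the row max for numerical stability, then normalize.
        x = x - tl.max(x, axis=0)
        num = tl.exp(x)
        y = num / tl.sum(num, axis=0)
        tl.store(out_ptr + row * n_cols + offsets, y, mask=mask)


    def softmax(x: torch.Tensor) -> torch.Tensor:
        n_rows, n_cols = x.shape
        out = torch.empty_like(x)
        # BLOCK_SIZE must be a power of two large enough to cover one row.
        BLOCK_SIZE = triton.next_power_of_2(n_cols)
        softmax_kernel[(n_rows,)](out, x, n_cols, BLOCK_SIZE=BLOCK_SIZE)
        return out


    if __name__ == "__main__":
        x = torch.randn(4, 128, device="cuda")
        assert torch.allclose(softmax(x), torch.softmax(x, dim=-1), atol=1e-6)

The same structure (a @triton.jit kernel plus a thin Python launcher that computes the grid and block sizes) recurs across the listed files; the attention, rotary-embedding, and KV-cache-copy kernels differ mainly in their indexing and in how many operations they fuse into a single launch.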