ColossalAI/tests/test_infer/test_ops/cuda

Latest commit: 12f10d5b0b by yuehuayingxueluo, "[Fix/Inference]Fix CUDA Rotary Rmbedding GQA (#5623)", 7 months ago
| File | Last commit | Updated |
| --- | --- | --- |
| __init__.py | [Inference]Add CUDA KVCache Kernel (#5406) | 9 months ago |
| test_flash_decoding_attention.py | [Fix/Inference] Fix GQA Triton and Support Llama3 (#5624) | 7 months ago |
| test_get_cos_and_sin.py | [Inference/Kernel]Add get_cos_and_sin Kernel (#5528) | 8 months ago |
| test_kv_cache_memcpy.py | [Inference]Support FP16/BF16 Flash Attention 2 And Add high_precision Flag To Rotary Embedding (#5461) | 8 months ago |
| test_rms_layernorm.py | feat baichuan2 rmsnorm whose hidden size equals to 5120 (#5611) | 7 months ago |
| test_rotary_embdding_unpad.py | [Fix/Inference]Fix CUDA Rotary Rmbedding GQA (#5623) | 7 months ago |
| test_silu_and_mul.py | add silu_and_mul for infer | 9 months ago |
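The test_*.py naming follows the standard pytest convention, so a minimal sketch of running this directory's CUDA kernel tests programmatically might look like the following (assumption: a CUDA-capable GPU is available and ColossalAI's inference CUDA extensions are built; the path below matches the directory listed above).

```python
# Minimal sketch: invoke the CUDA op tests in this directory via pytest.
# Assumes the working directory is the ColossalAI repository root.
import pytest

if __name__ == "__main__":
    # -v prints one line per test; pytest.main returns an exit code.
    raise SystemExit(pytest.main(["tests/test_infer/test_ops/cuda", "-v"]))
```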