mirror of https://github.com/hpcaitech/ColossalAI
Latest commit:

* prevent re-creating intermediate tensors
* add singleton class holding intermediate values
* fix triton kernel api
* add benchmark in pytest
* fix kernel api and add benchmark
* revise flash decoding triton kernel in/out shapes
* fix calling of triton kernel in modeling
* fix pytest: extract to util functions

File | ||
---|---|---|
kernel_utils.py | ||
test_context_attn_unpad.py | ||
test_decoding_attn.py | ||
test_kvcache_copy.py | ||
test_llama_act_combine.py | ||
test_rmsnorm_triton.py | ||
test_rotary_embdding_unpad.py | ||
test_softmax.py |