ColossalAI/colossalai/inference/modeling/layers
yuehuayingxueluo 86b63f720c
[Inference]Adapted to the triton attn kernels (#5264)
* adapted to the triton attn kernels

* fix pad input

* adapted to copy_kv_to_blocked_cache

* fix ci test

* update kv memcpy

* remove print
2024-01-17 16:03:10 +08:00
attention.py [Inference]Adapted to the triton attn kernels (#5264) 2024-01-17 16:03:10 +08:00