ColossalAI/colossalai/inference/modeling
yuehuayingxueluo 86b63f720c
[Inference]Adapted to the triton attn kernels (#5264)
* adapted to the triton attn kernels

* fix pad input

* adapted to copy_kv_to_blocked_cache

* fix ci test

* update kv memcpy

* remove print
2024-01-17 16:03:10 +08:00
layers [Inference]Adapted to the triton attn kernels (#5264) 2024-01-17 16:03:10 +08:00
models [Inference]Adapted to the triton attn kernels (#5264) 2024-01-17 16:03:10 +08:00
policy Fixed a bug in the inference frame 2024-01-11 13:39:56 +00:00