ColossalAI/colossalai/inference/core
yuehuayingxueluo 86b63f720c
[Inference]Adapted to the triton attn kernels (#5264)
* adapted to the triton attn kernels

* fix pad input

* adapted to copy_kv_to_blocked_cache

* fix ci test

* update kv memcpy

* remove print
2024-01-17 16:03:10 +08:00
engine.py [Inference]Adapted to the triton attn kernels (#5264) 2024-01-17 16:03:10 +08:00
request_handler.py [Inference]Adapted to the triton attn kernels (#5264) 2024-01-17 16:03:10 +08:00