ColossalAI/colossalai/inference/modeling/models

Latest commit f79963199c by yuehuayingxueluo (2024-04-30 19:35:05 +08:00):
[inference] Add alibi to flash attn function (#5678)
* add alibi to flash attn function
* rm redundant modifications
__init__.py            fix bugs in request_handler                                                                          2024-01-11 13:39:56 +00:00
glide_llama.py         [Inference/SpecDec] Support GLIDE Drafter Model (#5455)                                              2024-04-10 11:07:52 +08:00
nopadding_baichuan.py  [inference] Add alibi to flash attn function (#5678)                                                 2024-04-30 19:35:05 +08:00
nopadding_llama.py     [Inference/Kernel] refactor kvcache manager and rotary_embedding and kvcache_memcpy oper… (#5663)   2024-04-30 15:52:23 +08:00