ColossalAI/colossalai/inference/core

Latest commit: f79963199c by yuehuayingxueluo, 2024-04-30 19:35:05 +08:00
[inference]Add alibi to flash attn function (#5678)
* add alibi to flash attn function
* rm redundant modifications
__init__.py         [doc] updated inference readme (#5343)                      2024-02-02 14:31:10 +08:00
engine.py           [inference]Add alibi to flash attn function (#5678)         2024-04-30 19:35:05 +08:00
plugin.py           [Feat]Tensor Model Parallel Support For Inference (#5563)   2024-04-18 16:56:46 +08:00
request_handler.py  [Fix/Inference] Fix GQA Triton and Support Llama3 (#5624)   2024-04-23 13:09:55 +08:00