ColossalAI/colossalai/inference/modeling/layers
Yuanheng Zhao 3da9993b0d
[Kernel/Fix] Revise flash attention triton kernel API and add benchmark (#5301)
* fix decoding kernel pytest
* revise and add triton context attn benchmark
2024-01-23 17:16:02 +08:00
attention.py [Kernel/Fix] Revise flash attention triton kernel API and add benchmark (#5301) 2024-01-23 17:16:02 +08:00