mirror of https://github.com/hpcaitech/ColossalAI
* revise shape of kvcache (context attn kernel)
* revise shape of kvcache (flash decoding kernel)
* revise shape of kvcache (kvcache copy) and attn func
* init of kvcache in kvcache manager (see the layout sketch below)
* revise llama modeling
* revise block size retrieval
* use torch for rms_norm benchmarking (see the benchmark sketch at the end of this section)
* revise block size retrieval
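The commits above revise the KV cache shape across the context attention kernel, the flash decoding kernel, and the cache-copy path, and initialize the cache in the kvcache manager. The commit messages do not spell out the layout, so the following is a minimal sketch only: it assumes a blocked `[num_blocks, num_kv_heads, block_size, head_dim]` layout, and every dimension, name, and helper (`copy_kv_to_block`) here is hypothetical rather than taken from the ColossalAI code.

```python
import torch

# Hypothetical dimensions; in practice these come from the model config
# and the kvcache manager, not from this sketch.
num_blocks = 64      # total cache blocks available
num_kv_heads = 8     # number of key/value heads
block_size = 16      # tokens stored per cache block
head_dim = 128       # per-head hidden size

# One blocked cache tensor each for keys and values, assuming a
# [num_blocks, num_kv_heads, block_size, head_dim] layout.
k_cache = torch.zeros(num_blocks, num_kv_heads, block_size, head_dim)
v_cache = torch.zeros_like(k_cache)

def copy_kv_to_block(cache: torch.Tensor,
                     new_kv: torch.Tensor,
                     block_id: int,
                     slot: int) -> None:
    """Copy one token's key (or value) states into a given block/slot.

    new_kv has shape [num_kv_heads, head_dim]; block_id selects the
    cache block and slot the token position inside that block.
    """
    cache[block_id, :, slot, :] = new_kv

# Example: write a dummy key vector for one token into block 3, slot 5.
copy_kv_to_block(k_cache, torch.randn(num_kv_heads, head_dim), block_id=3, slot=5)
```

A blocked layout like this is typically what lets a cache manager hand out fixed-size blocks to sequences of varying length, which would explain why block size retrieval appears repeatedly in the commit list.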
Directory contents: `test_attention.py`
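One commit above switches to torch for rms_norm benchmarking. As a rough illustration of what a torch-based RMSNorm timing loop can look like (a sketch, not ColossalAI's harness: the sizes, iteration counts, and the `rms_norm_ref` helper are all assumptions, and timing on GPU would additionally need `torch.cuda.synchronize()` around the measured region):

```python
import time
import torch

def rms_norm_ref(x: torch.Tensor, weight: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # Reference RMSNorm: scale x by the reciprocal root-mean-square
    # of its last dimension, then apply the learned weight.
    variance = x.pow(2).mean(dim=-1, keepdim=True)
    return x * torch.rsqrt(variance + eps) * weight

# Hypothetical benchmark sizes; a real benchmark would sweep these.
batch, hidden = 32, 4096
x = torch.randn(batch, hidden)
w = torch.ones(hidden)

# Warm up, then time repeated calls on CPU.
for _ in range(10):
    rms_norm_ref(x, w)
start = time.perf_counter()
for _ in range(100):
    rms_norm_ref(x, w)
elapsed = time.perf_counter() - start
print(f"torch RMSNorm reference: {elapsed / 100 * 1e6:.1f} us/iter")
```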