ColossalAI

History

Yuanheng Zhao 5f98a9d68a [Infer] Optimize Blocked KVCache And Kernels Using It (#5325)	2024-01-30 16:06:09 +08:00
* revise shape of kvcache (context attn kernel)
* revise shape of kvcache (flash decoding kernel)
* revise shape of kvcache (kvcache copy) and attn func
* init of kvcache in kvcache manager
* revise llama modeling
* revise block size retrieval
* use torch for rms_norm benchmarking
* revise block size retrieval
__init__.py	[Inference] Add CacheBlock and KV-Cache Manager (#5156)	2024-01-11 13:39:29 +00:00
block_cache.py	[Inference] Add CacheBlock and KV-Cache Manager (#5156)	2024-01-11 13:39:29 +00:00
kvcache_manager.py	[Infer] Optimize Blocked KVCache And Kernels Using It (#5325)	2024-01-30 16:06:09 +08:00