ColossalAI

Yuanheng Zhao 3de2e62299 [Inference] Add CacheBlock and KV-Cache Manager (#5156)	2024-01-11 13:39:29 +00:00
* [Inference] Add KVCache Manager
* function refactored
* add test for KVCache Manager
* add attr beam width
* Revise alloc func in CacheManager
* Fix docs and pytests
* add tp slicing for head number
* optimize shapes of tensors used as physical cache
* Apply using InferenceConfig on KVCacheManager
* rm duplicate config file
* Optimize cache allocation: use contiguous cache
* Fix config in pytest (and config)
_utils.py	[Hotfix] Fix model policy matching strategy in ShardFormer (#5064)	2023-11-22 11:19:39 +08:00
test_config_and_struct.py	[Inference]Add BatchInferState, Sequence and InferConfig (#5149)	2024-01-11 13:39:29 +00:00
test_kvcache_manager.py	[Inference] Add CacheBlock and KV-Cache Manager (#5156)	2024-01-11 13:39:29 +00:00