ColossalAI/tests/test_infer
Yuanheng Zhao 3de2e62299 [Inference] Add CacheBlock and KV-Cache Manager (#5156)
* [Inference] Add KVCache Manager

* function refactored

* add test for KVCache Manager

* add attr beam width

* Revise alloc func in CacheManager

* Fix docs and pytests

* add tp slicing for head number

* optimize shapes of tensors used as physical cache

* Apply using InferenceConfig on KVCacheManager

* rm duplicate config file

* Optimize cache allocation: use contiguous cache

* Fix config in pytest (and config)
2024-01-11 13:39:29 +00:00
_utils.py [Hotfix] Fix model policy matching strategy in ShardFormer (#5064) 2023-11-22 11:19:39 +08:00
test_config_and_struct.py [Inference]Add BatchInferState, Sequence and InferConfig (#5149) 2024-01-11 13:39:29 +00:00
test_kvcache_manager.py [Inference] Add CacheBlock and KV-Cache Manager (#5156) 2024-01-11 13:39:29 +00:00