mirror of https://github.com/hpcaitech/ColossalAI
* [Inference] Add KVCache Manager
* function refactored
* add test for KVCache Manager
* add attr beam width
* Revise alloc func in CacheManager
* Fix docs and pytests
* add tp slicing for head number
* optimize shapes of tensors used as physical cache
* Apply using InferenceConfig on KVCacheManager
* rm duplicate config file
* Optimize cache allocation: use contiguous cache
* Fix config in pytest (and config)

||
---|---|---|
_utils.py | ||
test_config_and_struct.py | ||
test_kvcache_manager.py |