ColossalAI/colossalai/inference/kv_cache
Hongxin Liu 27e62ba0f7 [inference] decouple pp logic for llama (#5092) 1 year ago
__init__.py [inference] Refactor inference architecture (#5057) 1 year ago
batch_infer_state.py [inference] Refactor inference architecture (#5057) 1 year ago
kvcache_manager.py [inference] decouple pp logic for llama (#5092) 1 year ago