ColossalAI/colossalai/inference/kv_cache
yuehuayingxueluo b45000f839
[Inference] Add Streaming LLM (#5745)
* Add Streaming LLM

* add some parameters to llama_generation.py

* verify streamingllm config

* add test_streamingllm.py

* modified according to review comments

* add Citation

* change _block_tables to use tolist()
2024-06-05 10:51:19 +08:00
__init__.py         [Feat] Inference RPC Server Support (#5705)  2024-05-14 10:00:55 +08:00
block_cache.py      [doc] updated inference readme (#5343)       2024-02-02 14:31:10 +08:00
kvcache_manager.py  [Inference] Add Streaming LLM (#5745)        2024-06-05 10:51:19 +08:00
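The headline change in this directory is the StreamingLLM support wired into kvcache_manager.py. As background, the StreamingLLM idea (Xiao et al., 2023, cited by the commit) keeps the first few "attention sink" tokens resident in the KV cache forever plus a sliding window of the most recent tokens, evicting everything in between so the cache stays bounded during long generation. Below is a minimal, self-contained sketch of that eviction bookkeeping only; the class name `StreamingCache` and the parameters `n_sink_tokens` and `window_size` are hypothetical illustrations and do not mirror ColossalAI's actual KVCacheManager API.

```python
from collections import deque

class StreamingCache:
    """Minimal sketch of a StreamingLLM residency policy: keep a few
    initial "attention sink" positions forever, plus a FIFO sliding
    window of the most recent positions. Names are hypothetical, not
    ColossalAI's real API."""

    def __init__(self, n_sink_tokens: int = 4, window_size: int = 1020):
        self.n_sink_tokens = n_sink_tokens
        self.window_size = window_size
        self.sinks = []            # positions pinned in the cache forever
        self.window = deque()      # recent positions, oldest on the left

    def append(self, pos):
        """Register a newly generated token position. Returns the
        position evicted from the cache (so its KV block could be
        freed), or None if nothing was evicted."""
        if len(self.sinks) < self.n_sink_tokens:
            self.sinks.append(pos)
            return None
        self.window.append(pos)
        if len(self.window) > self.window_size:
            return self.window.popleft()   # evict oldest non-sink token
        return None

    def resident(self):
        """All positions currently held in the KV cache."""
        return self.sinks + list(self.window)


# Tiny usage example: 2 sink tokens, window of 3, feed positions 0..7.
cache = StreamingCache(n_sink_tokens=2, window_size=3)
evicted = [cache.append(i) for i in range(8)]
print(cache.resident())   # [0, 1, 5, 6, 7]
print(evicted)            # [None, None, None, None, None, 2, 3, 4]
```

The point of the sketch is that eviction is purely positional: no attention scores are inspected at decode time, which is what makes the policy cheap enough to run inside a paged KV-cache manager.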