ColossalAI/colossalai/inference/core
yuehuayingxueluo b45000f839
[Inference]Add Streaming LLM (#5745)
* Add Streaming LLM

* add some parameters to llama_generation.py

* verify streamingllm config (see the sketches below)

* add test_streamingllm.py

* modified according to review comments

* add Citation

* change _block_tables to use tolist
2024-06-05 10:51:19 +08:00
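The commit implements Streaming LLM (Xiao et al., "Efficient Streaming Language Models with Attention Sinks", which the "add Citation" bullet presumably references): the KV cache keeps the first few "attention sink" tokens forever, plus a sliding window of the most recent tokens, so generation can run indefinitely with a bounded cache. Below is a minimal, self-contained sketch of that eviction policy; the class and method names are illustrative, not ColossalAI's actual implementation.

```python
from collections import deque

class StreamingKVCache:
    """Sketch of the Streaming LLM eviction policy: retain the first
    `sink_size` tokens (attention sinks) plus a sliding window of the
    most recent `window_size` tokens; evict everything in between."""

    def __init__(self, sink_size: int = 4, window_size: int = 508):
        self.sink_size = sink_size
        self.sinks = []                            # never evicted
        self.recent = deque(maxlen=window_size)    # auto-evicts oldest

    def append(self, kv_entry):
        # Fill the sink slots first; after that, the bounded deque
        # drops the oldest non-sink entry automatically once full.
        if len(self.sinks) < self.sink_size:
            self.sinks.append(kv_entry)
        else:
            self.recent.append(kv_entry)

    def cached(self):
        # KV entries visible to attention at the current decode step.
        return self.sinks + list(self.recent)
```

The "_block_tables" bullet above suggests the production code applies this policy at the granularity of paged cache blocks rather than individual tokens, but the per-token version captures the same idea.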
__init__.py [doc] updated inference readme (#5343) 2024-02-02 14:31:10 +08:00
async_engine.py [Inference] Fix API server, test and example (#5712) 2024-05-15 15:47:31 +08:00
engine.py [Inference]Add Streaming LLM (#5745) 2024-06-05 10:51:19 +08:00
plugin.py [Feat]Tensor Model Parallel Support For Inference (#5563) 2024-04-18 16:56:46 +08:00
request_handler.py [Inference]Add Streaming LLM (#5745) 2024-06-05 10:51:19 +08:00
rpc_engine.py [Inference] Fix Inference Generation Config and Sampling (#5710) 2024-05-19 15:08:42 +08:00
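Judging from the commit log, the feature is toggled through the inference config that engine.py and request_handler.py consume. The snippet below is a hypothetical usage sketch: the parameter names enable_streamingllm, start_token_size, and generated_token_size are assumptions inferred from the commit message, not a verified API.

```python
# Hypothetical usage; parameter names are assumptions inferred from
# PR #5745's commit message, not confirmed against the code.
from colossalai.inference.config import InferenceConfig

config = InferenceConfig(
    enable_streamingllm=True,   # assumed toggle for Streaming LLM
    start_token_size=4,         # assumed: attention-sink token count
    generated_token_size=512,   # assumed: sliding-window budget
)
```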