ColossalAI/examples/inference
yuehuayingxueluo b45000f839
[Inference]Add Streaming LLM (#5745)
* Add Streaming LLM

* add some parameters to llama_generation.py

* verify streamingllm config

* add test_streamingllm.py

* modified according to review comments

* add Citation

* convert _block_tables to a list via tolist()
2024-06-05 10:51:19 +08:00
benchmark_ops add paged-attention v2: support seq length split across thread block (#5707) 2024-05-14 12:46:54 +08:00
client [Inference]Fix readme and example for API server (#5742) 2024-05-24 10:03:05 +08:00
llama [Inference]Add Streaming LLM (#5745) 2024-06-05 10:51:19 +08:00
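The StreamingLLM feature added by #5745 keeps a KV cache bounded by retaining a few initial "attention sink" tokens plus a sliding window of the most recent tokens, evicting everything in between. A minimal sketch of that eviction policy follows; the names `evict`, `sink_size`, and `window_size` are illustrative assumptions, not ColossalAI's actual API.

```python
def evict(cache, sink_size=4, window_size=8):
    """Return the KV-cache positions kept under a sink + sliding-window policy.

    Hypothetical illustration of StreamingLLM-style eviction: the first
    `sink_size` entries (attention sinks) and the last `window_size` entries
    (recent context) survive; middle entries are dropped once the cache
    exceeds sink_size + window_size.
    """
    limit = sink_size + window_size
    if len(cache) <= limit:
        # Cache still fits; nothing to evict.
        return list(cache)
    # Keep the attention sinks and the most recent window, drop the middle.
    return list(cache[:sink_size]) + list(cache[-window_size:])
```

For example, a 20-token cache with the defaults keeps positions 0-3 and 12-19, so the cache never grows beyond 12 entries regardless of sequence length.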