ColossalAI

Author	SHA1	Message	Date
Yuanheng Zhao	7b249c76e5	[Fix] Fix spec-dec Glide LlamaModel for compatibility with transformers (#5837) * fix glide llama model * revise	2024-06-19 15:37:53 +08:00
yuehuayingxueluo	b45000f839	[Inference]Add Streaming LLM (#5745) * Add Streaming LLM * add some parameters to llama_generation.py * verify streamingllm config * add test_streamingllm.py * modified according to the opinions of review * add Citation * change _block_tables tolist	2024-06-05 10:51:19 +08:00
Yuanheng Zhao	677cbfacf8	[Fix/Example] Fix Llama Inference Loading Data Type (#5763) * [fix/example] fix llama inference loading dtype * revise loading dtype of benchmark llama3	2024-05-30 13:48:46 +08:00
Yuanheng Zhao	8bcfe360fd	[example] Update Inference Example (#5725) * [example] update inference example	2024-05-17 11:28:53 +08:00
Yuanheng Zhao	55cc7f3df7	[Fix] Fix Inference Example, Tests, and Requirements (#5688) * clean requirements * modify example inference struct * add test ci scripts * mark test_infer as submodule * rm deprecated cls & deps * import of HAS_FLASH_ATTN * prune inference tests to be run * prune triton kernel tests * increment pytest timeout mins * revert import path in openmoe	2024-05-08 11:30:15 +08:00