ColossalAI

mirror of https://github.com/hpcaitech/ColossalAI

Yuanheng Zhao 677cbfacf8 [Fix/Example] Fix Llama Inference Loading Data Type (#5763) * [fix/example] fix llama inference loading dtype * revise loading dtype of benchmark llama3	2024-05-30 13:48:46 +08:00
benchmark_ops	add paged-attetionv2: support seq length split across thread block (#5707)	2024-05-14 12:46:54 +08:00
client	[Inference]Fix readme and example for API server (#5742)	2024-05-24 10:03:05 +08:00
llama	[Fix/Example] Fix Llama Inference Loading Data Type (#5763)	2024-05-30 13:48:46 +08:00