ColossalAI/examples/inference
Latest commit: f342a93871 by Yuanheng Zhao, [Fix] Remove obsolete files - inference (#5650), 2024-04-25 22:04:59 +08:00
Name                 Last commit message                                                              Last commit date
benchmark_ops        [Inference/Kernel] Optimize paged attention: Refactor key cache layout (#5643)  2024-04-25 14:24:02 +08:00
benchmark_llama.py   [Fix/Inference]Fix vllm benchmark (#5630)                                        2024-04-24 14:51:36 +08:00
benchmark_llama3.py  [Fix/Inference]Fix vllm benchmark (#5630)                                        2024-04-24 14:51:36 +08:00
llama_generation.py  [example] Update Llama Inference example (#5629)                                 2024-04-23 22:23:07 +08:00
run_benchmark.sh     [Fix/Inference]Fix vllm benchmark (#5630)                                        2024-04-24 14:51:36 +08:00