ColossalAI/examples/inference
Latest commit: f342a93871 by Yuanheng Zhao, [Fix] Remove obsolete files - inference (#5650), 2024-04-25 22:04:59 +08:00
Name                 Last commit message                                                              Last commit date
benchmark_ops        [Inference/Kernel] Optimize paged attention: Refactor key cache layout (#5643)  2024-04-25 14:24:02 +08:00
benchmark_llama.py   [Fix/Inference]Fix vllm benchmark (#5630)                                        2024-04-24 14:51:36 +08:00
benchmark_llama3.py  [Fix/Inference]Fix vllm benchmark (#5630)                                        2024-04-24 14:51:36 +08:00
llama_generation.py  [example] Update Llama Inference example (#5629)                                 2024-04-23 22:23:07 +08:00
run_benchmark.sh     [Fix/Inference]Fix vllm benchmark (#5630)                                        2024-04-24 14:51:36 +08:00