ColossalAI

yuehuayingxueluo 2a718c8be8 Optimized the execution interval time between cuda kernels caused by view and memcopy (#5390) * opt_view_and_memcopy * fix bugs in ci * fix ci bugs * update benchmark scripts * fix ci bugs	2024-02-21 13:23:57 +08:00
benchmark_llama.py	Optimized the execution interval time between cuda kernels caused by view and memcopy (#5390)	2024-02-21 13:23:57 +08:00
build_smoothquant_weight.py	[inference] refactor examples and fix schedule (#5077)	2023-11-21 10:46:03 +08:00
run_benchmark.sh	[Inference]Fused kv copy into rotary calculation (#5383)	2024-02-21 11:31:48 +08:00
run_llama_inference.py	[npu] change device to accelerator api (#5239)	2024-01-09 10:20:05 +08:00