ColossalAI/examples/inference
yuehuayingxueluo 0aa27f1961 [Inference]Move benchmark-related code to the example directory. (#5408) 9 months ago
benchmark_ops [Inference]Move benchmark-related code to the example directory. (#5408) 9 months ago
benchmark_llama.py Optimized the execution interval time between cuda kernels caused by view and memcopy (#5390) 9 months ago
build_smoothquant_weight.py [inference] refactor examples and fix schedule (#5077) 1 year ago
run_benchmark.sh [Fix/Inference] Fix format of input prompts and input model in inference engine (#5395) 9 months ago
run_llama_inference.py [npu] change device to accelerator api (#5239) 11 months ago