ColossalAI/examples/inference
yuehuayingxueluo 631862f339
[Inference]Optimize generation process of inference engine (#5356)
* opt inference engine

* fix run_benchmark.sh

* fix generate in engine.py

* rollback tesh_inference_engine.py
2024-02-02 15:38:21 +08:00
File                         Last commit                                                         Date
benchmark_llama.py           [Inference]Optimize generation process of inference engine (#5356)  2024-02-02 15:38:21 +08:00
build_smoothquant_weight.py  [inference] refactor examples and fix schedule (#5077)              2023-11-21 10:46:03 +08:00
run_benchmark.sh             [Inference/opt]Optimize the mid tensor of RMS Norm (#5350)          2024-02-02 15:06:01 +08:00
run_llama_inference.py       [npu] change device to accelerator api (#5239)                      2024-01-09 10:20:05 +08:00