ColossalAI/examples/inference
Latest commit: 04863a9b14 by Yuanheng Zhao, 2024-04-23 22:23:07 +08:00
[example] Update Llama Inference example (#5629)

* [example] add inference benchmark for llama3
* revise inference config arg
* remove unused args
* add llama generation demo script
* fix RoPE init in llama policy
* add benchmark_llama3 and clean up
| Name | Last commit message | Last commit date |
| --- | --- | --- |
| benchmark_ops | feat baichuan2 rmsnorm whose hidden size equals to 5120 (#5611) | 2024-04-19 15:34:53 +08:00 |
| benchmark_llama.py | [example] Update Llama Inference example (#5629) | 2024-04-23 22:23:07 +08:00 |
| benchmark_llama3.py | [example] Update Llama Inference example (#5629) | 2024-04-23 22:23:07 +08:00 |
| build_smoothquant_weight.py | [inference] refactor examples and fix schedule (#5077) | 2023-11-21 10:46:03 +08:00 |
| llama_generation.py | [example] Update Llama Inference example (#5629) | 2024-04-23 22:23:07 +08:00 |
| run_benchmark.sh | The writing style of tail processing and the logic related to macro definitions have been optimized. (#5519) | 2024-03-28 10:42:51 +08:00 |
| run_llama_inference.py | [npu] change device to accelerator api (#5239) | 2024-01-09 10:20:05 +08:00 |
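For reference, below is a minimal launch sketch for these example scripts. The `colossalai run --nproc_per_node` launcher is standard ColossalAI, but the script-level flags shown (model path, etc.) are assumptions for illustration only; check each script's argparse options and the variables at the top of `run_benchmark.sh` before running.

```bash
# Sketch only: script flags below are assumed, not verified against the examples.

# Launch the generation demo on a single GPU via the ColossalAI launcher
# (assumes a model-path argument such as -m pointing at a local/HF Llama checkpoint):
colossalai run --nproc_per_node 1 llama_generation.py -m /path/to/llama-model

# Run the benchmark wrapper; it may expect paths/settings to be edited inside the script first:
bash run_benchmark.sh
```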