ColossalAI/examples/inference
Yuanheng Zhao 8bcfe360fd
[example] Update Inference Example (#5725)
* [example] update inference example
2024-05-17 11:28:53 +08:00
benchmark_ops add paged-attetionv2: support seq length split across thread block (#5707) 2024-05-14 12:46:54 +08:00
client [Inference] Fix API server, test and example (#5712) 2024-05-15 15:47:31 +08:00
llama [example] Update Inference Example (#5725) 2024-05-17 11:28:53 +08:00