Commit Graph

2 Commits (17cfa5714083a81a505c097f1c411cd28162d922)

Author SHA1 Message Date
Yuanheng Zhao 35af65d240
[Infer] Add TPInferEngine and fix file path (#4532)
* add engine for TP inference
* move file path
* update path
* fix TPInferEngine
* remove unused file
* add engine test demo
* revise TPInferEngine
* fix TPInferEngine, add test
* fix
2023-08-29 18:57:52 +08:00
Yuanheng Zhao 2226c6836c
[feature] add KV cache manager for llama & bloom inference (#4495)
* add kv cache memory manager
* add stateinfo during inference
* format
* format
* rename file
* add kv cache test
* revise on BatchInferState
* file dir change
2023-08-24 16:44:14 +08:00