2 Commits (2226c6836c3aef6bb6cc6a4aec7c9a874799a2b1)

Author SHA1 Message Date
Yuanheng Zhao 2226c6836c
[feature] add KV cache manager for llama & bloom inference (#4495)
* add kv cache memory manager

* add stateinfo during inference

* format

* format

* rename file

* add kv cache test

* revise on BatchInferState

* file dir change
2023-08-24 16:44:14 +08:00
Jianghai c427366024
[infer] Infer/llama demo (#4503)
* add

* add infer example

* finish

* finish

* stash

* fix
2023-08-24 15:42:41 +08:00