Commit Graph

2 Commits (fced14025043e29ce816b315f440601188f7f79f)

Author SHA1 Message Date
Jianghai fced140250
[inference] Async dynamic batching (#4894)
* finish input and output logic

* add generate

* test forward

* 1
2023-10-12 18:48:27 +08:00
Jianghai e0757c31fb
[inference] Dynamic Batching for Single and Multiple GPUs (#4831)
* finish batch manager

* 1

* first

* fix

* fix dynamic batching

* llama infer

* finish test

* support different lengths generating

* del prints

* del prints

* fix

* fix bug

---------

Co-authored-by: CjhHa1 <cjh18671720497@outlook.com>
2023-10-11 17:52:52 +08:00