2 Commits (3d211ff81b8036748798464e92a866a5ba5074bd)

Author SHA1 Message Date
Jianghai 3d211ff81b [Inference] Finish Online Serving Test, add streaming output api, continuous batching test and example (#5432)
* finish online test and add examples

* fix test_contionus_batching

* fix some bugs

* fix bash

* fix

* fix inference

* finish revision

* fix typos

* revision
2024-04-11 10:27:42 +08:00
Jianghai 1572af2432 [Inference] ADD async and sync Api server using FastAPI (#5396)
* add api server

* fix

* add

* add completion service and fix bug

* add generation config

* revise shardformer

* fix bugs

* add docstrings and fix some bugs

* fix bugs and add choices for prompt template
2024-04-11 10:27:42 +08:00