ColossalAI/examples/inference/serving
Latest commit: [Infer] Serving example w/ ray-serve (multiple GPU case) (#4841), by Yuanheng Zhao, 573f270537
* fix imports
* add ray-serve with Colossal-Infer tp
* trivial: send requests script (a client-side sketch follows the directory listing below)
* add README
* fix worker port
* fix readme
* use app builder and autoscaling (see the deployment sketch after this commit log)
* trivial: input args
* clean code; revise readme
* testci (skip example test)
* use auto model/tokenizer
* revert imports fix (fixed in other PRs)

Committed 2023-10-02 17:48:38 +08:00
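The commit bullets above mention a Ray Serve application builder, autoscaling, and auto model/tokenizer loading, with tensor parallelism handled by the Colossal-Infer engine. The following is a minimal sketch of how those pieces typically fit together in Ray Serve; it is not the example's actual code, it omits the Colossal-Infer tensor-parallel engine in favor of a plain single-GPU Hugging Face model, and names such as `app_builder`, `model_path`, and the generation settings are assumptions.

```python
# Minimal sketch of a Ray Serve app builder with autoscaling and auto model/tokenizer
# loading. NOT the example's actual code: it skips the Colossal-Infer tensor-parallel
# engine; `app_builder`, `model_path`, and the generation settings are placeholders.
from ray import serve
from starlette.requests import Request
from transformers import AutoModelForCausalLM, AutoTokenizer


@serve.deployment(
    autoscaling_config={"min_replicas": 1, "max_replicas": 2},  # scale replicas with load
    ray_actor_options={"num_gpus": 1},  # one GPU per replica in this simplified sketch
)
class TextGenerator:
    def __init__(self, model_path: str):
        self.tokenizer = AutoTokenizer.from_pretrained(model_path)
        self.model = AutoModelForCausalLM.from_pretrained(model_path).half().cuda()

    async def __call__(self, request: Request) -> str:
        prompt = (await request.json())["prompt"]
        inputs = self.tokenizer(prompt, return_tensors="pt").to("cuda")
        output_ids = self.model.generate(**inputs, max_new_tokens=64)
        return self.tokenizer.decode(output_ids[0], skip_special_tokens=True)


def app_builder(args: dict):
    # Ray Serve application-builder pattern: key=value arguments passed on the
    # command line arrive here as a dict.
    return TextGenerator.bind(args["model_path"])
```

With a recent Ray Serve, such an application could be launched with `serve run sketch:app_builder model_path=<path>` (assuming the file is saved as `sketch.py`); the real example instead wires Colossal-AI tensor-parallel workers into the deployment for the multi-GPU case.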
Name          Last commit                                                                            Last commit date
ray_serve     [Infer] Serving example w/ ray-serve (multiple GPU case) (#4841)                       2023-10-02 17:48:38 +08:00
torch_serve   [Infer] Colossal-Inference serving example w/ TorchServe (single GPU case) (#4771)     2023-10-02 17:42:37 +08:00
test_ci.sh    [Infer] Colossal-Inference serving example w/ TorchServe (single GPU case) (#4771)     2023-10-02 17:42:37 +08:00
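The `ray_serve` example also includes a script for sending requests to the running service. As a rough illustration only (the endpoint path, port, and JSON schema below are assumptions, not taken from that script), a client could look like:

```python
# Hypothetical client for a Ray Serve text-generation endpoint; the URL and payload
# format are assumptions, not the example's actual request schema.
import requests

URL = "http://localhost:8000/"  # Ray Serve's default HTTP port is 8000

response = requests.post(URL, json={"prompt": "Introduce some landmarks in Beijing"})
response.raise_for_status()
print(response.text)
```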