ColossalAI/examples/inference
Yuanheng Zhao 573f270537
[Infer] Serving example w/ ray-serve (multiple GPU case) (#4841)
* fix imports

* add ray-serve with Colossal-Infer tp

* trivial: send requests script

* add README

* fix worker port

* fix readme

* use app builder and autoscaling

* trivial: input args

* clean code; revise readme

* testci (skip example test)

* use auto model/tokenizer

* revert imports fix (fixed in other PRs)
2023-10-02 17:48:38 +08:00
serving         [Infer] Serving example w/ ray-serve (multiple GPU case) (#4841)  2023-10-02 17:48:38 +08:00
bench_bloom.py  [misc] update pre-commit and run all files (#4752)                2023-09-19 14:20:26 +08:00
bench_llama.py  [misc] update pre-commit and run all files (#4752)                2023-09-19 14:20:26 +08:00
gptq_bloom.py   [feature] add gptq for inference (#4754)                          2023-09-22 11:02:50 +08:00
gptq_llama.py   [feature] add gptq for inference (#4754)                          2023-09-22 11:02:50 +08:00