diff --git a/examples/tutorial/opt/inference/README.md b/examples/tutorial/opt/inference/README.md
index 5bacac0d7..20ad4a23f 100644
--- a/examples/tutorial/opt/inference/README.md
+++ b/examples/tutorial/opt/inference/README.md
@@ -50,7 +50,7 @@ python opt_fastapi.py --queue_size <queue_size>
 ```
 The `<queue_size>` can be an integer in `[0, MAXINT]`. If it's `0`, the request queue size is infinite. If it's a positive integer, when the request queue is full, incoming requests will be dropped (the HTTP status code of response will be 406).
 
-### Configure bathcing
+### Configure batching
 ```shell
 python opt_fastapi.py --max_batch_size <max_batch_size>
 ```
@@ -85,4 +85,4 @@ Then open the web interface link which is on your console.
 See [script/processing_ckpt_66b.py](./script/processing_ckpt_66b.py).
 
 ## OPT-175B
-See [script/process-opt-175b](./script/process-opt-175b/).
\ No newline at end of file
+See [script/process-opt-175b](./script/process-opt-175b/).
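
For quick reference, a minimal launch sketch that combines the two options touched by this hunk, the request queue bound and the batch size. The numeric values are illustrative assumptions, not defaults taken from the project.

```shell
# Sketch with illustrative values (not project defaults):
# keep at most 128 pending requests in the queue (excess requests are
# rejected with HTTP status 406) and batch up to 8 requests at a time.
python opt_fastapi.py --queue_size 128 --max_batch_size 8
```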