mirror of https://github.com/hpcaitech/ColossalAI
[doc] fix typo in opt inference tutorial (#2849)
parent 935346430f
commit 597914317b
@@ -50,7 +50,7 @@ python opt_fastapi.py <model> --queue_size <QueueSize>
```
The `<QueueSize>` can be an integer in `[0, MAXINT]`. If it's `0`, the request queue size is infinite. If it's a positive integer, when the request queue is full, incoming requests will be dropped (the HTTP status code of response will be 406).
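The queue-size semantics described above can be illustrated with a minimal standard-library sketch. This is not the code in `opt_fastapi.py`; the helper names `make_request_queue` and `submit` are hypothetical, and the sketch only assumes the behavior stated in the tutorial: `0` means an unbounded queue, and a full bounded queue drops new requests with status 406.

```python
import queue

HTTP_OK = 200
HTTP_NOT_ACCEPTABLE = 406  # status the tutorial reports for dropped requests


def make_request_queue(queue_size: int) -> queue.Queue:
    """Hypothetical helper: 0 means unbounded, a positive value caps the queue."""
    if queue_size < 0:
        raise ValueError("queue_size must be >= 0")
    # queue.Queue treats maxsize <= 0 as "no limit", matching the 0 case.
    return queue.Queue(maxsize=queue_size)


def submit(request_queue: queue.Queue, request: str) -> int:
    """Hypothetical helper: enqueue a request, or drop it if the queue is full."""
    try:
        request_queue.put_nowait(request)
    except queue.Full:
        # Assumption: the drop is reported to the client as HTTP 406.
        return HTTP_NOT_ACCEPTABLE
    return HTTP_OK
```

With `make_request_queue(2)`, a third `submit` call made before any request is consumed returns 406, while a queue created with `0` keeps accepting requests.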
-### Configure bathcing
+### Configure batching
```shell
python opt_fastapi.py <model> --max_batch_size <MaxBatchSize>
```
@@ -85,4 +85,4 @@ Then open the web interface link which is on your console.
See [script/processing_ckpt_66b.py](./script/processing_ckpt_66b.py).
## OPT-175B
-See [script/process-opt-175b](./script/process-opt-175b/).
+See [script/process-opt-175b](./script/process-opt-175b/).