mirror of https://github.com/hpcaitech/ColossalAI
* [inference] Dynamic Batching for Single and Multiple GPUs (#4831)
  * finish batch manager
  * 1
  * first
  * fix
  * fix dynamic batching
  * llama infer
  * finish test
  * support different lengths generating
  * del prints
  * del prints
  * fix
  * fix bug

  Co-authored-by: CjhHa1 <cjh18671720497outlook.com>
* [inference] Async dynamic batching (#4894)
  * finish input and output logic
  * add generate
  * test forward
  * 1
* [inference]Re push async dynamic batching (#4901)
  * adapt to ray server
  * finish async
  * finish test
  * del test

  Co-authored-by: yuehuayingxueluo <867460659@qq.com>
* Revert "[inference]Re push async dynamic batching (#4901)" (#4905)

  This reverts commit

||
---|---|---|
__init__.py | ||
albert.py | ||
bert.py | ||
blip2.py | ||
bloom.py | ||
chatglm2.py | ||
gpt.py | ||
llama.py | ||
opt.py | ||
sam.py | ||
t5.py | ||
vit.py | ||
whisper.py |