ColossalAI/tests
Jianghai cf579ff46d
[Inference] Dynamic Batching Inference, online and offline (#4953)
* [inference] Dynamic Batching for Single and Multiple GPUs (#4831)

* finish batch manager

* 1

* first

* fix

* fix dynamic batching

* llama infer

* finish test

* support different lengths generating

* del prints

* del prints

* fix

* fix bug

---------

Co-authored-by: CjhHa1 <cjh18671720497outlook.com>

* [inference] Async dynamic batching  (#4894)

* finish input and output logic

* add generate

* test forward

* 1

* [inference]Re push async dynamic batching (#4901)

* adapt to ray server

* finish async

* finish test

* del test

---------

Co-authored-by: yuehuayingxueluo <867460659@qq.com>

* Revert "[inference]Re push async dynamic batching (#4901)" (#4905)

This reverts commit fbf3c09e67.

* Revert "[inference] Async dynamic batching  (#4894)"

This reverts commit fced140250.

* Revert "[inference] Async dynamic batching  (#4894)" (#4909)

This reverts commit fced140250.

* Add Ray Distributed Environment Init Scripts

* support DynamicBatchManager base function

* revert _set_tokenizer version

* add driver async generate

* add async test

* fix bugs in test_ray_dist.py

* add get_tokenizer.py

* fix code style

* fix bugs about No module named 'pydantic' in ci test

* fix bugs in ci test

* fix bugs in ci test

* fix bugs in ci test

* [infer]Add Ray Distributed Environment Init Scripts (#4911)

* Revert "[inference] Async dynamic batching  (#4894)"

This reverts commit fced140250.

* Add Ray Distributed Environment Init Scripts

* support DynamicBatchManager base function

* revert _set_tokenizer version

* add driver async generate

* add async test

* fix bugs in test_ray_dist.py

* add get_tokenizer.py

* fix code style

* fix bugs about No module named 'pydantic' in ci test

* fix bugs in ci test

* fix bugs in ci test

* fix bugs in ci test

* support dynamic batch for bloom model and is_running function

* [Inference]Test for new Async engine (#4935)

* infer engine

* infer engine

* test engine

* test engine

* new manager

* change step

* add

* test

* fix

* fix

* finish test

* finish test

* finish test

* finish test

* add license

---------

Co-authored-by: yuehuayingxueluo <867460659@qq.com>

* add assertion for config (#4947)

* [Inference] Finish dynamic batching offline test (#4948)

* test

* fix test

* fix quant

* add default

* fix

* fix some bugs

* fix some bugs

* fix

* fix bug

* fix bugs

* reset param

---------

Co-authored-by: yuehuayingxueluo <867460659@qq.com>
Co-authored-by: Cuiqing Li <lixx3527@gmail.com>
Co-authored-by: CjhHa1 <cjh18671720497outlook.com>
2023-10-30 10:52:19 +08:00
..
kit [Inference] Dynamic Batching Inference, online and offline (#4953) 2023-10-30 10:52:19 +08:00
test_analyzer [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
test_auto_parallel [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
test_autochunk [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
test_booster [gemini] support gradient accumulation (#4869) 2023-10-17 14:07:21 +08:00
test_checkpoint_io [hotfix] fix lr scheduler bug in torch 2.0 (#4864) 2023-10-12 14:04:24 +08:00
test_cluster [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
test_config [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
test_device [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
test_fx [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
test_gptq [feature] add gptq for inference (#4754) 2023-09-22 11:02:50 +08:00
test_infer [Inference] Dynamic Batching Inference, online and offline (#4953) 2023-10-30 10:52:19 +08:00
test_infer_ops [Refactor] Integrated some lightllm kernels into token-attention (#4946) 2023-10-19 22:22:47 +08:00
test_lazy [lazy] support from_pretrained (#4801) 2023-09-26 11:04:11 +08:00
test_legacy [test] merge old components to test to model zoo (#4945) 2023-10-20 10:35:08 +08:00
test_moe [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
test_optimizer [test] merge old components to test to model zoo (#4945) 2023-10-20 10:35:08 +08:00
test_pipeline [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
test_shardformer [hotfix] fix torch 2.0 compatibility (#4936) 2023-10-18 11:05:25 +08:00
test_smoothquant [inference] Add smmoothquant for llama (#4904) 2023-10-16 11:28:44 +08:00
test_tensor [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
test_utils [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
test_zero [test] merge old components to test to model zoo (#4945) 2023-10-20 10:35:08 +08:00
__init__.py [zero] Update sharded model v2 using sharded param v2 (#323) 2022-03-11 15:50:28 +08:00