ColossalAI/tests
Xu Kai fd6482ad8c
[inference] Refactor inference architecture (#5057)
* [inference] support only TP (#4998)

* support only tp

* enable tp

* add support for bloom (#5008)

* [refactor] refactor gptq and smoothquant llama (#5012)

* refactor gptq and smoothquant llama

* fix import error

* fix linear import torch-int

* fix smoothquant llama import error

* fix import accelerate error

* fix bug

* fix import smooth cuda

* fix smoothcuda

* [Inference Refactor] Merge chatglm2 with pp and tp (#5023)

merge chatglm with pp and tp

* [Refactor] remove useless inference code (#5022)

* remove useless code

* fix quant model

* fix test import bug

* mv original inference legacy

* fix chatglm2

* [Refactor] refactor policy search and quant type controlling in inference (#5035)

* [Refactor] refactor policy search and quant type controlling in inference

* [inference] update readme (#5051)

* update readme

* update readme

* fix architecture

* fix table

* fix table

* [inference] update example (#5053)

* update example

* fix run.sh

* fix rebase bug

* fix some errors

* update readme

* add some features

* update interface

* update readme

* update benchmark

* add requirements-infer

---------

Co-authored-by: Bin Jia <45593998+FoolPlayer@users.noreply.github.com>
Co-authored-by: Zhongkai Zhao <kanezz620@gmail.com>
2023-11-19 21:05:05 +08:00
kit [Inference] Dynamic Batching Inference, online and offline (#4953) 2023-10-30 10:52:19 +08:00
test_analyzer [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
test_auto_parallel [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
test_autochunk [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
test_booster [gemini] gemini support extra-dp (#5043) 2023-11-16 21:03:04 +08:00
test_checkpoint_io [gemini] gemini support extra-dp (#5043) 2023-11-16 21:03:04 +08:00
test_cluster [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
test_config [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
test_device [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
test_fx [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
test_gptq [feature] add gptq for inference (#4754) 2023-09-22 11:02:50 +08:00
test_infer [inference] Refactor inference architecture (#5057) 2023-11-19 21:05:05 +08:00
test_infer_ops/triton [inference] Refactor inference architecture (#5057) 2023-11-19 21:05:05 +08:00
test_lazy [lazy] support from_pretrained (#4801) 2023-09-26 11:04:11 +08:00
test_legacy [test] merge old components to test to model zoo (#4945) 2023-10-20 10:35:08 +08:00
test_moe [hotfix]: modify create_ep_hierarchical_group and add test (#5032) 2023-11-17 10:53:00 +08:00
test_optimizer [test] merge old components to test to model zoo (#4945) 2023-10-20 10:35:08 +08:00
test_pipeline [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
test_shardformer [gemini] gemini support extra-dp (#5043) 2023-11-16 21:03:04 +08:00
test_smoothquant [inference] Add smoothquant for llama (#4904) 2023-10-16 11:28:44 +08:00
test_tensor [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
test_utils [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
test_zero [gemini] gemini support extra-dp (#5043) 2023-11-16 21:03:04 +08:00
__init__.py [zero] Update sharded model v2 using sharded param v2 (#323) 2022-03-11 15:50:28 +08:00