ColossalAI

Author	SHA1	Message	Date
Zhongkai Zhao	361cf63cb0	[Refactor] refactor policy search and quant type controlling in inference (#5035) * [Refactor] refactor policy search and quant type controlling in inference	2023-11-14 17:26:59 +08:00
Xu Kai	c6295c3381	[Refactor] remove useless inference code (#5022) * remove useless code * fix quant model * fix test import bug * mv original inference legacy * fix chatglm2	2023-11-10 14:47:06 +08:00
Bin Jia	81b8f5e76a	[Inference Refactor] Merge chatglm2 with pp and tp (#5023) merge chatglm with pp and tp	2023-11-09 14:46:19 +08:00
Xu Kai	450115bd0f	[refactor] refactor gptq and smoothquant llama (#5012) * refactor gptq and smoothquant llama * fix import error * fix linear import torch-int * fix smoothquant llama import error * fix import accelerate error * fix bug * fix import smooth cuda * fix smoothcuda	2023-11-09 10:12:11 +08:00
Bin Jia	48d0a58d10	add support for bloom (#5008)	2023-11-09 10:12:11 +08:00
Xu Kai	f747d13040	[inference] support only TP (#4998) * support only tp * enable tp	2023-11-09 10:12:11 +08:00
Bin Jia	b6696beb04	[Pipeline Inference] Merge pp with tp (#4993) * refactor pipeline into new CaiInferEngine * update llama modeling forward * merge tp with pp * update docstring * optimize test workflow and example * fix typo * add assert and todo	2023-11-01 12:46:21 +08:00