Commit Graph

3 Commits (feature/inference-refactor)

Author SHA1 Message Date
Xu Kai c6295c3381 [Refactor] remove useless inference code (#5022)
* remove useless code

* fix quant model

* fix test import bug

* mv original inference legacy

* fix chatglm2
2023-11-10 14:47:06 +08:00
Xu Kai 450115bd0f [refactor] refactor gptq and smoothquant llama (#5012)
* refactor gptq and smoothquant llama

* fix import error

* fix linear import torch-int

* fix smoothquant llama import error

* fix import accelerate error

* fix bug

* fix import smooth cuda

* fix smoothcuda
2023-11-09 10:12:11 +08:00
Xu Kai f747d13040 [inference] support only TP (#4998)
* support only tp

* enable tp
2023-11-09 10:12:11 +08:00