ColossalAI/colossalai
Xu Kai 946ab56c48
[feature] add gptq for inference (#4754)
* [gptq] add gptq kernel (#4416)

* add gptq

* refactor code

* fix tests

* replace auto-gptq

* rename inference/quant

* refactor test

* add auto-gptq as an option

* reset requirements

* change assert and check auto-gptq

* add import warnings

* change test flash attn version

* remove example

* change requirements of flash_attn

* modify tests

* [skip ci] change requirements-test

* [gptq] faster gptq cuda kernel (#4494)

* [skip ci] add cuda kernels

* add license

* [skip ci] fix max_input_len

* format files & change test size

* [skip ci]

* [gptq] add gptq tensor parallel (#4538)

* add gptq tensor parallel

* add gptq tp

* delete print

* add test gptq check

* add test auto gptq check

* [gptq] combine gptq and kv cache manager (#4706)

* combine gptq and kv cache manager

* add init bits

* delete useless code

* add model path

* delete useless print and update test

* delete useless import

* move option gptq to shard config

* change replace linear to shardformer

* update bloom policy

* delete useless code

* fix import bug and delete useless code

* change colossalai/gptq to colossalai/quant/gptq

* update import linear for tests

* delete useless code and mv gptq_kernel to kernel directory

* fix triton kernel

* add triton import
2023-09-22 11:02:50 +08:00
_C [setup] support pre-build and jit-build of cuda kernels (#2374) 2023-01-06 20:50:26 +08:00
_analyzer [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
amp [shardformer] fix master param sync for hybrid plugin/rewrite unwrapping logic (#4758) 2023-09-20 18:29:37 +08:00
auto_parallel [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
autochunk [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
booster [shardformer] fix master param sync for hybrid plugin/rewrite unwrapping logic (#4758) 2023-09-20 18:29:37 +08:00
checkpoint_io [shardformer] fix master param sync for hybrid plugin/rewrite unwrapping logic (#4758) 2023-09-20 18:29:37 +08:00
cli [bug] Fix the version check bug in colossalai run when generating the cmd. (#4713) 2023-09-22 10:50:47 +08:00
cluster [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
context [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
device [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
fx [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
inference [feature] add gptq for inference (#4754) 2023-09-22 11:02:50 +08:00
interface [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
kernel [feature] add gptq for inference (#4754) 2023-09-22 11:02:50 +08:00
lazy [lazy] support torch 2.0 (#4763) 2023-09-21 16:30:23 +08:00
legacy [bug] fix get_default_parser in examples (#4764) 2023-09-21 10:42:25 +08:00
logging [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
nn [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
pipeline [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
shardformer [feature] add gptq for inference (#4754) 2023-09-22 11:02:50 +08:00
tensor [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
testing [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
utils [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
zero [shardformer] fix master param sync for hybrid plugin/rewrite unwrapping logic (#4758) 2023-09-20 18:29:37 +08:00
__init__.py [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
initialize.py [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00