アマデウス
|
4e4a10c97d
|
updated c++17 compiler flags (#4983)
|
1 year ago |
Xu Kai
|
611a5a80ca
|
[inference] Add smoothquant for llama (#4904)
* [inference] add int8 rotary embedding kernel for smoothquant (#4843)
* [inference] add smoothquant llama attention (#4850)
* add smoothquant llama attention
* remove useless code
* remove useless code
* fix import error
* rename file name
* [inference] add silu linear fusion for smoothquant llama mlp (#4853)
* add silu linear
* update skip condition
* catch smoothquant cuda lib exception
* process exception for tests
* [inference] add llama mlp for smoothquant (#4854)
* add llama mlp for smoothquant
* fix down out scale
* remove duplicate lines
* add llama mlp check
* delete useless code
* [inference] add smoothquant llama (#4861)
* add smoothquant llama
* fix attention accuracy
* fix accuracy
* add kv cache and save pretrained
* refactor example
* delete smooth
* refactor code
* [inference] add smooth function and delete useless code for smoothquant (#4895)
* add smooth function and delete useless code
* update datasets
* remove duplicate import
* delete useless file
* refactor codes (#4902)
* refactor code
* add license
* add torch-int and smoothquant license
|
1 year ago |
Tong Li
|
bbbcac26e8
|
fix format (#4815)
|
1 year ago |
Xu Kai
|
946ab56c48
|
[feature] add gptq for inference (#4754)
* [gptq] add gptq kernel (#4416)
* add gptq
* refactor code
* fix tests
* replace auto-gptq
* rename inference/quant
* refactor test
* add auto-gptq as an option
* reset requirements
* change assert and check auto-gptq
* add import warnings
* change test flash attn version
* remove example
* change requirements of flash_attn
* modify tests
* [skip ci] change requirements-test
* [gptq] faster gptq cuda kernel (#4494)
* [skip ci] add cuda kernels
* add license
* [skip ci] fix max_input_len
* format files & change test size
* [skip ci]
* [gptq] add gptq tensor parallel (#4538)
* add gptq tensor parallel
* add gptq tp
* delete print
* add test gptq check
* add test auto gptq check
* [gptq] combine gptq and kv cache manager (#4706)
* combine gptq and kv cache manager
* add init bits
* delete useless code
* add model path
* delete useless print and update test
* delete useless import
* move option gptq to shard config
* change replace linear to shardformer
* update bloom policy
* delete useless code
* fix import bug and delete useless code
* change colossalai/gptq to colossalai/quant/gptq
* update import linear for tests
* delete useless code and mv gptq_kernel to kernel directory
* fix triton kernel
* add triton import
|
1 year ago |
Hongxin Liu
|
079bf3cb26
|
[misc] update pre-commit and run all files (#4752)
* [misc] update pre-commit
* [misc] run pre-commit
* [misc] remove useless configuration files
* [misc] ignore cuda for clang-format
|
1 year ago |
Mashiro
|
cfa607080f
|
[Fix] Fix compile error (#4357)
|
1 year ago |
Yanming W
|
269150b6f4
|
[Docker] Fix a couple of build issues (#3691)
|
2 years ago |
digger-yu
|
b7141c36dd
|
[CI] fix some spelling errors (#3707)
* fix spelling error with examples/community/
* fix spelling error with tests/
* fix some spelling error with tests/ colossalai/ etc.
|
2 years ago |
digger-yu
|
7570d9ae3d
|
[doc] fix op_builder/README.md (#3597)
Optimization Code
change "requries" to "requires"
|
2 years ago |
digger-yu
|
d96567bb5d
|
[misc] op_builder/builder.py (#3593)
Optimization Code
The source code is unchanged; only a few spelling errors in the comments were corrected
|
2 years ago |
Hongxin Liu
|
173dad0562
|
[misc] add verbose arg for zero and op builder (#3552)
* [misc] add print verbose
* [gemini] add print verbose
* [zero] add print verbose for low level
* [misc] add print verbose for op builder
|
2 years ago |
digger-yu
|
77efdfe1dd
|
[doc] Update README.md (#3549)
Format optimization: add [] around DeepSpeed
|
2 years ago |
digger-yu
|
3f760da9f0
|
Update README.md (#3548)
Delete extra ")"
|
2 years ago |
ver217
|
823f3b9cf4
|
[doc] add deepspeed citation and copyright (#2996)
* [doc] add deepspeed citation and copyright
* [doc] add deepspeed citation and copyright
* [doc] add deepspeed citation and copyright
|
2 years ago |
Yasyf Mohamedali
|
19fa0e57f6
|
Remove extraneous comma (#2993)
Prevents `TypeError: category must be a Warning subclass, not 'str'`.
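The error quoted above is what Python's standard `warnings.warn` raises when its second positional argument (`category`) is a string. A stray comma between two adjacent string literals is one way to trigger it. The sketch below reproduces that behavior in isolation. The function names and the warning text are hypothetical and are not taken from the patched file.

```python
import warnings


def warn_with_stray_comma():
    # The trailing comma after the first literal turns the second literal
    # into a separate positional argument, which warnings.warn() treats
    # as `category`. A str is not a Warning subclass, so this raises.
    warnings.warn(
        "hypothetical message, part one",
        "hypothetical message, part two",
    )


def warn_fixed():
    # Without the comma, adjacent literals are concatenated into one
    # message and the default category (UserWarning) applies.
    warnings.warn(
        "hypothetical message, part one "
        "hypothetical message, part two"
    )


# The broken call fails before any warning is emitted.
try:
    warn_with_stray_comma()
    error_text = ""
except TypeError as exc:
    error_text = str(exc)

# The fixed call emits exactly one UserWarning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warn_fixed()

print(error_text)
print(len(caught), caught[0].category.__name__)
```

Removing the comma fixes the call because the two literals then form a single `message` argument and `category` is left at its default.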
|
2 years ago |
Frank Lee
|
3a5d93bc2c
|
[kernel] cached the op kernel and fixed version check (#2886)
* [kernel] cached the op kernel and fixed version check
* polish code
* polish code
|
2 years ago |
Frank Lee
|
dd14783f75
|
[kernel] fixed repeated loading of kernels (#2549)
* [kernel] fixed repeated loading of kernels
* polish code
* polish code
|
2 years ago |
Frank Lee
|
40d376c566
|
[setup] support pre-build and jit-build of cuda kernels (#2374)
* [setup] support pre-build and jit-build of cuda kernels
* polish code
* polish code
* polish code
* polish code
* polish code
* polish code
|
2 years ago |
Frank Lee
|
8711310cda
|
[setup] remove torch dependency (#2333)
|
2 years ago |
Jiarui Fang
|
db6eea3583
|
[builder] reconfig op_builder for pypi install (#2314)
|
2 years ago |
Frank Lee
|
9b765e7a69
|
[setup] removed the build dependency on colossalai (#2307)
|
2 years ago |