Commit Graph

11 Commits (8b1b237a5f7e3f7879adddd59a16d0daa5657b49)

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Jianghai | c6cd629e7a | [Inference]ADD Bench Chatglm2 script (#4963) | 1 year ago |
| Xu Kai | 785802e809 | [inference] add reference and fix some bugs (#4937) | 1 year ago |
| Cuiqing Li | 3a41e8304e | [Refactor] Integrated some lightllm kernels into token-attention (#4946) | 1 year ago |
| Xu Kai | 611a5a80ca | [inference] Add smmoothquant for llama (#4904) | 1 year ago |
| Xu Kai | d1fcc0fa4d | [infer] fix test bug (#4838) | 1 year ago |
| Jianghai | 013a4bedf0 | [inference]fix import bug and delete down useless init (#4830) | 1 year ago |
| Yuanheng Zhao | 573f270537 | [Infer] Serving example w/ ray-serve (multiple GPU case) (#4841) | 1 year ago |
| Yuanheng Zhao | 3a74eb4b3a | [Infer] Colossal-Inference serving example w/ TorchServe (single GPU case) (#4771) | 1 year ago |
| Xu Kai | 946ab56c48 | [feature] add gptq for inference (#4754) | 1 year ago |
| Hongxin Liu | 079bf3cb26 | [misc] update pre-commit and run all files (#4752) | 1 year ago |
| Cuiqing Li | bce0f16702 | [Feature] The first PR to Add TP inference engine, kv-cache manager and related kernels for our inference system (#4577) | 1 year ago |