7 Commits (39f2582e987871c198f2f2526cd4435cbd569741)

Author       SHA1        Message  Date
Xu Kai       fdec650bb4  fix test llama (#4884)  1 year ago
Bin Jia      08a9f76b2f  [Pipeline Inference] Sync pipeline inference branch to main (#4820)  1 year ago
Xu Kai       d1fcc0fa4d  [infer] fix test bug (#4838)  1 year ago
Jianghai     013a4bedf0  [inference]fix import bug and delete down useless init (#4830)  1 year ago
Jianghai     ce7ade3882  [inference] chatglm2 infer demo (#4724)  1 year ago
Hongxin Liu  079bf3cb26  [misc] update pre-commit and run all files (#4752)  1 year ago
Cuiqing Li   bce0f16702  [Feature] The first PR to Add TP inference engine, kv-cache manager and related kernels for our inference system (#4577)  1 year ago