12 Commits (6a3086a5055235e51a1bca8a20c4bd967409a259)

Author SHA1 Message Date
YeAnbang e53e729d8e
[Feature] Add document retrieval QA (#5020) 1 year ago
Xu Kai 611a5a80ca
[inference] Add smoothquant for llama (#4904) 1 year ago
Xu Kai 946ab56c48
[feature] add gptq for inference (#4754) 1 year ago
Cuiqing Li bce0f16702
[Feature] The first PR to Add TP inference engine, kv-cache manager and related kernels for our inference system (#4577) 1 year ago
zbian 7bc0afc901
updated flash attention usage 2 years ago
ver217 090f14fd6b
[misc] add reference (#2930) 2 years ago
Frank Lee 918bc94b6b
[triton] added copyright information for flash attention (#2835) 2 years ago
YuliangLiu0306 2059fdd6b0
[hotfix] add copyright for solver and device mesh (#2803) 2 years ago
binmakeswell d00d905b86
[NFC] polish license (#1999) 2 years ago
binmakeswell 8a29ce5443
polish license (#1522) 2 years ago
Jiarui Fang 8f74fbd9c9
polish license (#300) 3 years ago
アマデウス 2ebaefc542
Initial commit 3 years ago