Commit Graph

6 Commits (feat/speculative-decoding)

Author SHA1 Message Date
Yuanheng Zhao 1dedb57747
[Fix/Infer] Remove unused deps and revise requirements (#5341)
10 months ago
yuehuayingxueluo 8daee26989 [Inference] Add the logic of the inference engine (#5173)
11 months ago
Hongxin Liu 1cd7efc520
[inference] refactor examples and fix schedule (#5077)
1 year ago
Xu Kai fb103cfd6e
[inference] update examples and engine (#5073)
1 year ago
Cuiqing Li (李崔卿) bce919708f
[Kernels]added flash-decoidng of triton (#5063)
1 year ago
Xu Kai fd6482ad8c
[inference] Refactor inference architecture (#5057)
1 year ago