Commit Graph

4 Commits (8c69debdc7128e1b8839f12aa3f19ad327569017)

Author | SHA1 | Message | Date
yuehuayingxueluo | 4f28cb43c0 | [inference]Optimize the usage of the mid tensors space in flash attn (#5304) | 10 months ago
yuehuayingxueluo | fab294c7f4 | fix CI bugs | 11 months ago
Jianghai | e545a871b8 | [Hotfix] Fix accuracy and align attention method api with Triton kernel (#5229) | 11 months ago
Jianghai | 0e616462a7 | [Inference] add logit processor and request handler (#5166) | 11 months ago