5 Commits (main)

Author | SHA1 | Message | Date
傅剑寒 | 279300dc5f | [Inference/Refactor] Refactor compilation mechanism and unified multi hw (#5613) | 7 months ago
Hongxin Liu | 19e1a5cf16 | [shardformer] update colo attention to support custom mask (#5510) | 8 months ago
yuehuayingxueluo | 600881a8ea | [Inference]Add CUDA KVCache Kernel (#5406) | 9 months ago
Frank Lee | 7cfed5f076 | [feat] refactored extension module (#5298) | 10 months ago