Commit Graph

2 Commits (537a3cbc4df445786c8ecf2af0a2998e2fd881b6)

Author SHA1 Message Date
傅剑寒 8ccb6714e7
[Inference/Feat] Add kvcache quantization support for FlashDecoding (#5656) 2024-04-26 19:40:37 +08:00
傅剑寒 279300dc5f
[Inference/Refactor] Refactor compilation mechanism and unified multi hw (#5613)
* refactor compilation mechanism and unified multi hw

* fix file path bug

* add init.py to make pybind a module to avoid relative path error caused by softlink

* delete duplicated micros

* fix micros bug in gcc
2024-04-24 14:17:54 +08:00