5 Commits (fba04e857b57abc54ba4864cbfb3af0461e2c5e7)

Author SHA1 Message Date
傅剑寒 121d7ad629
[Inference] Delete duplicated copy_vector (#5716)
2024-05-14 14:35:33 +08:00
傅剑寒 50104ab340
[Inference/Feat] Add convert_fp8 op for fp8 test in the future (#5706)
* add convert_fp8 op for fp8 test in the future

* rerun ci
2024-05-10 18:39:54 +08:00
傅剑寒 ef8e4ffe31
[Inference/Feat] Add kvcache quant support for fused_rotary_embedding_cache_copy (#5680)
2024-04-30 18:33:53 +08:00
傅剑寒 808ee6e4ad
[Inference/Feat] Feat quant kvcache step2 (#5674)
2024-04-30 11:26:36 +08:00
傅剑寒 279300dc5f
[Inference/Refactor] Refactor compilation mechanism and unified multi hw (#5613)
* refactor compilation mechanism and unified multi hw

* fix file path bug

* add init.py to make pybind a module to avoid relative path error caused by softlink

* delete duplicated micros

* fix micros bug in gcc
2024-04-24 14:17:54 +08:00