* modify coloattention
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
* fix
* fix
fxi
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* add cuda KVCache kernel
* annotation benchmark_kvcache_copy
* add use cuda
* fix import path
* move benchmark scripts to example/
* rm benchmark codes in test_kv_cache_memcpy.py
* rm redundancy codes
* rm redundancy codes
* pr was modified according to the review