ColossalAI/colossalai/kernel
yuehuayingxueluo 2a718c8be8
Optimized the execution interval time between cuda kernels caused by view and memcopy (#5390)
* opt_view_and_memcopy

* fix bugs in ci

* fix ci bugs

* update benchmark scripts

* fix ci bugs
2024-02-21 13:23:57 +08:00
Name              Last commit                                                                                    Last updated
jit               [npu] change device to accelerator api (#5239)                                                 2024-01-09 10:20:05 +08:00
triton            Optimized the execution interval time between cuda kernels caused by view and memcopy (#5390)  2024-02-21 13:23:57 +08:00
__init__.py       [feat] refactored extension module (#5298)                                                     2024-01-25 17:01:48 +08:00
extensions        [feat] refactored extension module (#5298)                                                     2024-01-25 17:01:48 +08:00
kernel_loader.py  [feat] refactored extension module (#5298)                                                     2024-01-25 17:01:48 +08:00