ColossalAI/colossalai/inference/modeling
yuehuayingxueluo 2a718c8be8
Optimized the execution interval time between cuda kernels caused by view and memcopy (#5390)
* opt_view_and_memcopy

* fix bugs in ci

* fix ci bugs

* update benchmark scripts

* fix ci bugs
2024-02-21 13:23:57 +08:00
layers       [Inference]Fused the gate and up proj in mlp,and optimized the autograd process. (#5365)       2024-02-06 19:38:25 +08:00
models       Optimized the execution interval time between cuda kernels caused by view and memcopy (#5390)  2024-02-21 13:23:57 +08:00
policy       Optimized the execution interval time between cuda kernels caused by view and memcopy (#5390)  2024-02-21 13:23:57 +08:00
__init__.py  [doc] updated inference readme (#5343)                                                         2024-02-02 14:31:10 +08:00