ColossalAI/colossalai/kernel/triton
Jianghai de378cd2ab [Inference] Finish Online Serving Test, add streaming output api, continuous batching test and example (#5432)
* finish online test and add examples

* fix test_contionus_batching

* fix some bugs

* fix bash

* fix

* fix inference

* finish revision

* fix typos

* revision
2024-05-08 15:20:52 +00:00
__init__.py [Infer] Revise and Adapt Triton Kernels for Spec-Dec (#5401) 2024-04-10 11:07:51 +08:00
context_attn_unpad.py [kernel] Support New KCache Layout - Triton Kernel (#5677) 2024-05-03 17:20:45 +08:00
flash_decoding.py [kernel] Support New KCache Layout - Triton Kernel (#5677) 2024-05-03 17:20:45 +08:00
fused_rotary_embedding.py [Inference]Fused the gate and up proj in mlp,and optimized the autograd process. (#5365) 2024-02-06 19:38:25 +08:00
kvcache_copy.py [kernel] Support New KCache Layout - Triton Kernel (#5677) 2024-05-03 17:20:45 +08:00
llama_act_combine_kernel.py [devops] remove post commit ci (#5566) 2024-04-08 15:09:40 +08:00
no_pad_rotary_embedding.py [Inference] Finish Online Serving Test, add streaming output api, continuous batching test and example (#5432) 2024-05-08 15:20:52 +00:00
qkv_matmul_kernel.py [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
rms_layernorm.py [fix] multi graphs capture error 2024-03-11 10:49:31 +08:00
rotary_cache_copy.py [Inference]Fused the gate and up proj in mlp,and optimized the autograd process. (#5365) 2024-02-06 19:38:25 +08:00
softmax.py [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00