ColossalAI/extensions/pybind
Steve Luo 7806842f2d
add paged-attetionv2: support seq length split across thread block (#5707)
2024-05-14 12:46:54 +08:00
cpu_adam [Inference/Refactor] Refactor compilation mechanism and unified multi hw (#5613) 2024-04-24 14:17:54 +08:00
flash_attention [Inference/Refactor] Refactor compilation mechanism and unified multi hw (#5613) 2024-04-24 14:17:54 +08:00
inference add paged-attetionv2: support seq length split across thread block (#5707) 2024-05-14 12:46:54 +08:00
layernorm [Inference/Refactor] Refactor compilation mechanism and unified multi hw (#5613) 2024-04-24 14:17:54 +08:00
moe [Inference/Refactor] Refactor compilation mechanism and unified multi hw (#5613) 2024-04-24 14:17:54 +08:00
optimizer [Inference/Refactor] Refactor compilation mechanism and unified multi hw (#5613) 2024-04-24 14:17:54 +08:00
softmax [Inference/Refactor] Refactor compilation mechanism and unified multi hw (#5613) 2024-04-24 14:17:54 +08:00
__init__.py [Inference/Refactor] Refactor compilation mechanism and unified multi hw (#5613) 2024-04-24 14:17:54 +08:00