ColossalAI

Steve Luo 7806842f2d add paged-attetionv2: support seq length split across thread block (#5707)		2024-05-14 12:46:54 +08:00
cpu_adam	[Inference/Refactor] Refactor compilation mechanism and unified multi hw (#5613)	2024-04-24 14:17:54 +08:00
flash_attention	[Inference/Refactor] Refactor compilation mechanism and unified multi hw (#5613)	2024-04-24 14:17:54 +08:00
inference	add paged-attetionv2: support seq length split across thread block (#5707)	2024-05-14 12:46:54 +08:00
layernorm	[Inference/Refactor] Refactor compilation mechanism and unified multi hw (#5613)	2024-04-24 14:17:54 +08:00
moe	[Inference/Refactor] Refactor compilation mechanism and unified multi hw (#5613)	2024-04-24 14:17:54 +08:00
optimizer	[Inference/Refactor] Refactor compilation mechanism and unified multi hw (#5613)	2024-04-24 14:17:54 +08:00
softmax	[Inference/Refactor] Refactor compilation mechanism and unified multi hw (#5613)	2024-04-24 14:17:54 +08:00
__init__.py	[Inference/Refactor] Refactor compilation mechanism and unified multi hw (#5613)	2024-04-24 14:17:54 +08:00