ColossalAI

mirror of https://github.com/hpcaitech/ColossalAI

Steve Luo 7806842f2d add paged-attetionv2: support seq length split across thread block (#5707)	2024-05-14 12:46:54 +08:00
__init__.py	[Inference/Refactor] Refactor compilation mechanism and unified multi hw (#5613)	2024-04-24 14:17:54 +08:00
inference.cpp	add paged-attetionv2: support seq length split across thread block (#5707)	2024-05-14 12:46:54 +08:00
inference_ops_cuda.py	[Inference/Feat] Add convert_fp8 op for fp8 test in the future (#5706)	2024-05-10 18:39:54 +08:00