ColossalAI

mirror of https://github.com/hpcaitech/ColossalAI

Steve Luo 7806842f2d add paged-attetionv2: support seq length split across thread block (#5707)	2024-05-14 12:46:54 +08:00
__init__.py	fix bugs in request_handler	2024-01-11 13:39:56 +00:00
glide_llama.py	[Inference/SpecDec] Support GLIDE Drafter Model (#5455)	2024-04-10 11:07:52 +08:00
nopadding_baichuan.py	add paged-attetionv2: support seq length split across thread block (#5707)	2024-05-14 12:46:54 +08:00
nopadding_llama.py	add paged-attetionv2: support seq length split across thread block (#5707)	2024-05-14 12:46:54 +08:00