ColossalAI/colossalai/shardformer/modeling

Latest commit: c7d6975d29 by flybird11111
[shardformer] fix GPT2DoubleHeadsModel (#4703), 1 year ago
chatglm2_6b [pipeline] add chatglm (#4363) 1 year ago
__init__.py [shardformer] added development protocol for standardization (#4149) 1 year ago
bert.py [shardformer] vit/llama/t5 ignore the sequence parallelism flag and some fix. (#4498) 1 year ago
blip2.py [shardformer] update shardformer to use flash attention 2 (#4392) 1 year ago
bloom.py [shardformer] bloom support sequence parallel (#4465) 1 year ago
chatglm2.py [shardformer] chatglm support sequence parallel (#4482) 1 year ago
gpt2.py [shardformer] fix GPT2DoubleHeadsModel (#4703) 1 year ago
jit.py [Shardformer] Merge flash attention branch to pipeline branch (#4362) 1 year ago
llama.py [Feature] The first PR to Add TP inference engine, kv-cache manager and related kernels for our inference system (#4577) 1 year ago
opt.py [shardformer] update llama2/opt finetune example and fix llama2 policy (#4645) 1 year ago
sam.py [Shardformer] Merge flash attention branch to pipeline branch (#4362) 1 year ago
t5.py [misc] resolve code factor issues (#4433) 1 year ago
vit.py [example] update vit example for hybrid parallel plugin (#4641) 1 year ago
whisper.py [shardformer] Pipeline/whisper (#4456) 1 year ago