ColossalAI

Bin Jia 424629fea0 [shardformer/sequence parallel] Cherry pick commit to new branch (#4450)	2023-08-16 15:41:20 +08:00

* [shardformer/sequence parallel] Support sequence parallel for gpt2 (#4384)
* [sequence parallel] add sequence parallel linear col/row support (#4336)
* add sequence parallel linear col/row support
* add annotation
* add annotation
* add support for gpt2 fused qkv linear layer
* support sequence parallel in GPT2
* add docstring and note
* add requirments
* remove unused flash-attb
* modify flash attn test
* modify flash attn setting
* modify flash attn code
* add assert before divide, rename forward function
* [shardformer/test] fix gpt2 test with seq-parallel
* [shardformer/sequence parallel] Overlap input gather and grad computation during col backward (#4401)
* overlap gather input / grad computing during col backward
* modify test for overlap
* simplify code
* fix code and modify cuda stream synchronize
* [shardformer/sequence parallel] polish code

chatglm2_6b	[pipeline] add chatglm (#4363)	2023-08-15 23:25:14 +08:00
__init__.py	[shardformer] added development protocol for standardization (#4149)	2023-07-04 16:05:01 +08:00
bert.py	[misc] resolve code factor issues (#4433)	2023-08-15 23:25:14 +08:00
blip2.py	[shardformer] update shardformer to use flash attention 2 (#4392)	2023-08-15 23:25:14 +08:00
bloom.py	[misc] resolve code factor issues (#4433)	2023-08-15 23:25:14 +08:00
chatglm.py	[misc] resolve code factor issues (#4433)	2023-08-15 23:25:14 +08:00
gpt2.py	[misc] resolve code factor issues (#4433)	2023-08-15 23:25:14 +08:00
gpt2_seq.py	[shardformer/sequence parallel] Cherry pick commit to new branch (#4450)	2023-08-16 15:41:20 +08:00
jit.py	[Shardformer] Merge flash attention branch to pipeline branch (#4362)	2023-08-15 23:25:14 +08:00
llama.py	[misc] resolve code factor issues (#4433)	2023-08-15 23:25:14 +08:00
opt.py	[misc] resolve code factor issues (#4433)	2023-08-15 23:25:14 +08:00
sam.py	[Shardformer] Merge flash attention branch to pipeline branch (#4362)	2023-08-15 23:25:14 +08:00
t5.py	[misc] resolve code factor issues (#4433)	2023-08-15 23:25:14 +08:00
vit.py	[misc] resolve code factor issues (#4433)	2023-08-15 23:25:14 +08:00
whisper.py	[shardformer] update shardformer to use flash attention 2 (#4392)	2023-08-15 23:25:14 +08:00