ColossalAI/colossalai/shardformer/policies

Latest commit: Bin Jia (7c8be77081) — [shardformer/sequence parallel] support gpt2 seq parallel with pp/dp/tp (#4460), 2023-08-18 11:21:53 +08:00
* support gpt2 seq parallel with pp/dp/tp
* fix a bug when waiting for stream done
* delete unused gpt2_seq file
File            Last commit                                                                                Date
__init__.py     [shardformer] init shardformer code structure (#3731)                                      2023-07-04 16:05:01 +08:00
auto_policy.py  [shardformer] rewrite tests for opt/bloom/llama/vit/chatglm (#4395)                        2023-08-15 23:25:14 +08:00
base_policy.py  [shardformer/sequence parallel] Cherry pick commit to new branch (#4450)                   2023-08-16 15:41:20 +08:00
bert.py         [Shardformer] Merge flash attention branch to pipeline branch (#4362)                      2023-08-15 23:25:14 +08:00
blip2.py        [Shardformer] Merge flash attention branch to pipeline branch (#4362)                      2023-08-15 23:25:14 +08:00
bloom.py        [Shardformer] Merge flash attention branch to pipeline branch (#4362)                      2023-08-15 23:25:14 +08:00
chatglm.py      [Shardformer] Merge flash attention branch to pipeline branch (#4362)                      2023-08-15 23:25:14 +08:00
gpt2.py         [shardformer/sequence parallel] support gpt2 seq parallel with pp/dp/tp (#4460)            2023-08-18 11:21:53 +08:00
llama.py        [Shardformer] Merge flash attention branch to pipeline branch (#4362)                      2023-08-15 23:25:14 +08:00
opt.py          [shardformer] rewrite tests for opt/bloom/llama/vit/chatglm (#4395)                        2023-08-15 23:25:14 +08:00
sam.py          [Shardformer] Merge flash attention branch to pipeline branch (#4362)                      2023-08-15 23:25:14 +08:00
t5.py           [pipeline] rewrite t5 tests & support multi-tensor transmitting in pipeline (#4388)        2023-08-15 23:25:14 +08:00
vit.py          [format] applied code formatting on changed files in pull request 4441 (#4445)             2023-08-16 10:47:23 +08:00
whisper.py      [Shardformer] Merge flash attention branch to pipeline branch (#4362)                      2023-08-15 23:25:14 +08:00