__init__.py | [shardformer] init shardformer code structure (#3731) | 2023-07-04 16:05:01 +08:00 |
auto_policy.py | [shardformer] rewrite tests for opt/bloom/llama/vit/chatglm (#4395) | 2023-08-15 23:25:14 +08:00 |
base_policy.py | [shardformer/sequence parallel] Cherry pick commit to new branch (#4450) | 2023-08-16 15:41:20 +08:00 |
bert.py | [Shardformer] Merge flash attention branch to pipeline branch (#4362) | 2023-08-15 23:25:14 +08:00 |
blip2.py | [Shardformer] Merge flash attention branch to pipeline branch (#4362) | 2023-08-15 23:25:14 +08:00 |
bloom.py | [Shardformer] Merge flash attention branch to pipeline branch (#4362) | 2023-08-15 23:25:14 +08:00 |
chatglm.py | [Shardformer] Merge flash attention branch to pipeline branch (#4362) | 2023-08-15 23:25:14 +08:00 |
gpt2.py | [shardformer/sequence parallel] support gpt2 seq parallel with pp/dp/tp (#4460) | 2023-08-18 11:21:53 +08:00 |
llama.py | [Shardformer] Merge flash attention branch to pipeline branch (#4362) | 2023-08-15 23:25:14 +08:00 |
opt.py | [shardformer] rewrite tests for opt/bloom/llama/vit/chatglm (#4395) | 2023-08-15 23:25:14 +08:00 |
sam.py | [Shardformer] Merge flash attention branch to pipeline branch (#4362) | 2023-08-15 23:25:14 +08:00 |
t5.py | [pipeline] rewrite t5 tests & support multi-tensor transmitting in pipeline (#4388) | 2023-08-15 23:25:14 +08:00 |
vit.py | [format] applied code formatting on changed files in pull request 4441 (#4445) | 2023-08-16 10:47:23 +08:00 |
whisper.py | [Shardformer] Merge flash attention branch to pipeline branch (#4362) | 2023-08-15 23:25:14 +08:00 |