| File | Last commit | Date |
| --- | --- | --- |
| `__init__.py` | [shardformer] init shardformer code structure (#3731) | 2023-07-04 16:05:01 +08:00 |
| `auto_policy.py` | [shardformer]: support gpt-j, falcon, Mistral and add interleaved pipeline for bert (#5088) | 2023-11-28 16:54:42 +08:00 |
| `base_policy.py` | [shardformer]: support gpt-j, falcon, Mistral and add interleaved pipeline for bert (#5088) | 2023-11-28 16:54:42 +08:00 |
| `bert.py` | [pipeline]: fix p2p comm, add metadata cache and support llama interleaved pp (#5134) | 2023-12-22 10:44:00 +08:00 |
| `blip2.py` | [hotfix] Add layer norm gradients all-reduce for sequence parallel (#4926) | 2023-11-03 13:32:43 +08:00 |
| `bloom.py` | [shardformer]: support gpt-j, falcon, Mistral and add interleaved pipeline for bert (#5088) | 2023-11-28 16:54:42 +08:00 |
| `chatglm2.py` | [Inference] Fix bug in ChatGLM2 Tensor Parallelism (#5014) | 2023-11-07 15:01:50 +08:00 |
| `falcon.py` | [shardformer]: support gpt-j, falcon, Mistral and add interleaved pipeline for bert (#5088) | 2023-11-28 16:54:42 +08:00 |
| `gpt2.py` | [hotfix] Add layer norm gradients all-reduce for sequence parallel (#4926) | 2023-11-03 13:32:43 +08:00 |
| `gptj.py` | [shardformer]: support gpt-j, falcon, Mistral and add interleaved pipeline for bert (#5088) | 2023-11-28 16:54:42 +08:00 |
| `llama.py` | [pipeline]: fix p2p comm, add metadata cache and support llama interleaved pp (#5134) | 2023-12-22 10:44:00 +08:00 |
| `mistral.py` | [shardformer]: support gpt-j, falcon, Mistral and add interleaved pipeline for bert (#5088) | 2023-11-28 16:54:42 +08:00 |
| `opt.py` | [shardformer]: support gpt-j, falcon, Mistral and add interleaved pipeline for bert (#5088) | 2023-11-28 16:54:42 +08:00 |
| `sam.py` | [hotfix] Add layer norm gradients all-reduce for sequence parallel (#4926) | 2023-11-03 13:32:43 +08:00 |
| `t5.py` | [gemini] gemini support tensor parallelism. (#4942) | 2023-11-10 10:15:16 +08:00 |
| `vit.py` | [hotfix] Add layer norm gradients all-reduce for sequence parallel (#4926) | 2023-11-03 13:32:43 +08:00 |
| `whisper.py` | [shardformer]: support gpt-j, falcon, Mistral and add interleaved pipeline for bert (#5088) | 2023-11-28 16:54:42 +08:00 |