ColossalAI/colossalai/shardformer/policies

Latest commit: 451e9142b8 "fix flash attn (#5209)" by flybird11111, 11 months ago
File              Last commit                                                                                    Last updated
__init__.py       [shardformer] init shardformer code structure (#3731)                                         1 year ago
auto_policy.py    [shardformer]: support gpt-j, falcon, Mistral and add interleaved pipeline for bert (#5088)   1 year ago
base_policy.py    [shardformer]: support gpt-j, falcon, Mistral and add interleaved pipeline for bert (#5088)   1 year ago
bert.py           [pipeline]: fix p2p comm, add metadata cache and support llama interleaved pp (#5134)         11 months ago
blip2.py          [hotfix] Add layer norm gradients all-reduce for sequence parallel (#4926)                    1 year ago
bloom.py          [shardformer]: support gpt-j, falcon, Mistral and add interleaved pipeline for bert (#5088)   1 year ago
chatglm2.py       [Inference] Fix bug in ChatGLM2 Tensor Parallelism (#5014)                                    1 year ago
falcon.py         [shardformer]: support gpt-j, falcon, Mistral and add interleaved pipeline for bert (#5088)   1 year ago
gpt2.py           [hotfix] Add layer norm gradients all-reduce for sequence parallel (#4926)                    1 year ago
gptj.py           [shardformer]: support gpt-j, falcon, Mistral and add interleaved pipeline for bert (#5088)   1 year ago
llama.py          fix flash attn (#5209)                                                                        11 months ago
mistral.py        [shardformer]: support gpt-j, falcon, Mistral and add interleaved pipeline for bert (#5088)   1 year ago
opt.py            [shardformer]: support gpt-j, falcon, Mistral and add interleaved pipeline for bert (#5088)   1 year ago
sam.py            [hotfix] Add layer norm gradients all-reduce for sequence parallel (#4926)                    1 year ago
t5.py             [gemini] gemini support tensor parallelism. (#4942)                                           1 year ago
vit.py            [hotfix] Add layer norm gradients all-reduce for sequence parallel (#4926)                    1 year ago
whisper.py        [shardformer]: support gpt-j, falcon, Mistral and add interleaved pipeline for bert (#5088)   1 year ago
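How these files fit together: base_policy.py defines the policy interface, each per-model file (llama.py, bert.py, and so on) implements it for one model family, and auto_policy.py dispatches from a model's fully qualified class name to the matching policy module in this directory. Below is a minimal sketch of that dispatch pattern; the names get_autopolicy, PolicyLocation, and _POLICY_LIST follow the upstream layout, but treat the exact identifiers, registry entries, and constructor signatures here as assumptions rather than the verbatim API.

```python
# Illustrative sketch of the auto-policy dispatch in auto_policy.py.
# The registry keys on a model's fully qualified class name and lazily
# imports the matching policy module from this directory.
import importlib
from dataclasses import dataclass

import torch.nn as nn


@dataclass
class PolicyLocation:
    file_name: str   # module in colossalai.shardformer.policies, e.g. "llama"
    class_name: str  # policy class inside that module (name assumed here)


# Trimmed registry for illustration; the real _POLICY_LIST covers every
# model file listed in the table above.
_POLICY_LIST = {
    "transformers.models.llama.modeling_llama.LlamaForCausalLM": PolicyLocation(
        file_name="llama", class_name="LlamaForCausalLMPolicy"
    ),
    "transformers.models.bert.modeling_bert.BertModel": PolicyLocation(
        file_name="bert", class_name="BertModelPolicy"
    ),
}


def _fullname(model: nn.Module) -> str:
    """Return the model's fully qualified class name, used as the registry key."""
    klass = model.__class__
    return f"{klass.__module__}.{klass.__qualname__}"


def get_autopolicy(model: nn.Module):
    """Look up and instantiate the sharding policy registered for `model`."""
    location = _POLICY_LIST.get(_fullname(model))
    if location is None:
        raise NotImplementedError(
            f"Auto policy for {model.__class__.__qualname__} is not implemented."
        )
    module = importlib.import_module(f"colossalai.shardformer.policies.{location.file_name}")
    return getattr(module, location.class_name)()
```

Keying the registry on the fully qualified class name lets one policy file serve several task heads of the same architecture, and the lazy import keeps unused model policies (and their transformers dependencies) out of memory until a matching model is actually sharded.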