mirror of https://github.com/hpcaitech/ColossalAI
Latest commit: [hotfix] Add layer norm gradients all-reduce for sequence parallel. (#4915)

* Add layer norm gradients all-reduce for sequence parallel.
* skip pipeline inference test
* [hotfix] fix policies of sequence parallel (#4922)
  * Add layer norm gradients all-reduce for sequence parallel.
  * fix parameter passing when calling get_autopolicy
* Hotfix/add grad all reduce for sequence parallel (#4927)
  * Add layer norm gradients all-reduce for sequence parallel.
  * fix parameter passing when calling get_autopolicy
  * fix bug using wrong variables
* fix policy initialization
* fix bloom and chatglm policies
* polish code of handling layernorm
* fix moe module
* polish code of class initializing

Co-authored-by: littsk <1214689160@qq.com>
Co-authored-by: Zhongkai Zhao <kanezz620@gmail.com>
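The core change described by this commit, summing LayerNorm gradients across the sequence-parallel ranks, could look roughly like the sketch below. This is a minimal illustration, not ColossalAI's actual implementation: the helper name `allreduce_layernorm_grads` and the `sp_group` argument are assumptions, and ColossalAI wires the equivalent logic into its sharding policies rather than exposing a standalone function.

```python
# Minimal sketch (assumed helper, not ColossalAI's API): with sequence
# parallelism each rank back-propagates through a different sequence shard,
# so the replicated LayerNorm parameters accumulate partial gradients that
# must be all-reduced (summed) across the sequence-parallel group before
# the optimizer step.
import torch
import torch.distributed as dist


def allreduce_layernorm_grads(model: torch.nn.Module, sp_group: dist.ProcessGroup) -> None:
    for module in model.modules():
        if isinstance(module, torch.nn.LayerNorm):
            for param in module.parameters():
                if param.grad is not None:
                    # Sum the partial gradient contributions from every rank
                    # in the (assumed) sequence-parallel process group.
                    dist.all_reduce(param.grad, op=dist.ReduceOp.SUM, group=sp_group)
```

In a training loop this would run after `loss.backward()` and before `optimizer.step()`, so that every rank updates its replicated LayerNorm parameters with the same, fully reduced gradient.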
Directory contents:

* __init__.py
* _utils.py
* test_shard_bert.py
* test_shard_blip2.py
* test_shard_bloom.py
* test_shard_chatglm2.py
* test_shard_gpt2.py
* test_shard_llama.py
* test_shard_opt.py
* test_shard_sam.py
* test_shard_t5.py
* test_shard_vit.py
* test_shard_whisper.py