ColossalAI/colossalai/shardformer/modeling
Latest commit: 537f6a3855 by Wang Binluo, 2024-05-10 15:33:39 +08:00

[Shardformer] fix the num_heads assert for llama model and qwen model (#5704)

* fix the num_heads assert
* fix the transformers import
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  (for more information, see https://pre-commit.ci)
* fix the import

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
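The commit title refers to the check that attention head counts divide evenly across tensor-parallel ranks before the llama/qwen2 layers are sharded. The sketch below is only an illustration of that kind of guard, not the actual ColossalAI patch; check_num_heads, config, and tp_size are assumed stand-in names.

# Illustrative sketch only: a head-count divisibility check of the kind the
# commit title describes, guarded so it applies only when tensor parallelism
# is actually enabled. Names here are hypothetical, not ColossalAI's API.
def check_num_heads(config, tp_size: int, tensor_parallel: bool) -> None:
    if not tensor_parallel or tp_size <= 1:
        return  # no sharding requested, nothing to assert
    assert config.num_attention_heads % tp_size == 0, (
        f"num_attention_heads ({config.num_attention_heads}) must be "
        f"divisible by the tensor parallel size ({tp_size})."
    )
    num_kv = getattr(config, "num_key_value_heads", None)
    if num_kv is not None:  # grouped-query attention (llama / qwen2 style)
        assert num_kv % tp_size == 0, (
            f"num_key_value_heads ({num_kv}) must be "
            f"divisible by the tensor parallel size ({tp_size})."
        )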
chatglm2_6b
__init__.py
bert.py
blip2.py
bloom.py
chatglm2.py
falcon.py
gpt2.py
gptj.py
jit.py
llama.py
mistral.py
opt.py
qwen2.py
sam.py
t5.py
vit.py
whisper.py