ColossalAI/tests/test_shardformer
Wang Binluo a3cc68ca93
[Shardformer] Support the Qwen2 model (#5699)
* feat: support qwen2 model

* fix: modify model config and add Qwen2RMSNorm

* fix qwen2 model conflicts

* test: add qwen2 shard test

* to: add qwen2 auto policy

* support qwen model

* fix the conflicts

* add try catch

* add transformers version for qwen2

* add the ColoAttention for the qwen2 model

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add the unit test version check

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix the test input bug

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix the version check

* fix the version check

---------

Co-authored-by: Wenhao Chen <cwher@outlook.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-05-09 20:04:25 +08:00
..
test_hybrid_parallel_grad_clip_norm [misc] refactor launch API and tensor constructor (#5666) 2024-04-29 10:40:11 +08:00
test_layer [misc] refactor launch API and tensor constructor (#5666) 2024-04-29 10:40:11 +08:00
test_model [Shardformer] Support the Qwen2 model (#5699) 2024-05-09 20:04:25 +08:00
__init__.py [shardformer] adapted T5 and LLaMa test to use kit (#4049) 2023-07-04 16:05:01 +08:00
test_flash_attention.py [coloattention]modify coloattention (#5627) 2024-04-25 10:47:14 +08:00
test_shard_utils.py [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
test_with_torch_ddp.py [misc] refactor launch API and tensor constructor (#5666) 2024-04-29 10:40:11 +08:00