ColossalAI/colossalai/shardformer/policies
Baizhou Zhang ed4c448488 [pipeline] rewrite t5 tests & support multi-tensor transmitting in pipeline (#4388)
* fix remaining t5 bugs/rewrite t5 tests

* fix multi-tensor communication in pipeline

* rearrange test_config

* fix keyerror in sync_shared_params

* fix get_held_layers & Randomnizer, complete t5 tests

* erase printing

* fix get_held_layers through modifying _release_unheld_layers

* fix _get_recursive_held_layers bug
2023-08-15 23:25:14 +08:00
..
__init__.py [shardformer] init shardformer code structure (#3731) 2023-07-04 16:05:01 +08:00
auto_policy.py [shardformer] support Blip2 (#4243) 2023-08-15 23:25:14 +08:00
base_policy.py [shardformer] fix base policy (#4229) 2023-08-15 23:25:14 +08:00
bert.py [Shardformer] Merge flash attention branch to pipeline branch (#4362) 2023-08-15 23:25:14 +08:00
blip2.py [Shardformer] Merge flash attention branch to pipeline branch (#4362) 2023-08-15 23:25:14 +08:00
bloom.py [Shardformer] Merge flash attention branch to pipeline branch (#4362) 2023-08-15 23:25:14 +08:00
chatglm.py [Shardformer] Merge flash attention branch to pipeline branch (#4362) 2023-08-15 23:25:14 +08:00
gpt2.py [Shardformer] Merge flash attention branch to pipeline branch (#4362) 2023-08-15 23:25:14 +08:00
llama.py [Shardformer] Merge flash attention branch to pipeline branch (#4362) 2023-08-15 23:25:14 +08:00
opt.py [Shardformer] Merge flash attention branch to pipeline branch (#4362) 2023-08-15 23:25:14 +08:00
sam.py [Shardformer] Merge flash attention branch to pipeline branch (#4362) 2023-08-15 23:25:14 +08:00
t5.py [pipeline] rewrite t5 tests & support multi-tensor transmitting in pipeline (#4388) 2023-08-15 23:25:14 +08:00
vit.py [Shardformer] Merge flash attention branch to pipeline branch (#4362) 2023-08-15 23:25:14 +08:00
whisper.py [Shardformer] Merge flash attention branch to pipeline branch (#4362) 2023-08-15 23:25:14 +08:00