Making large AI models cheaper, faster and more accessible
Latest commit dcd41d0973 by Wang Binluo: Merge pull request #6071 from wangbluo/ring_attention (1 month ago)
Name                                  Last commit                                                                       Age
test_hybrid_parallel_grad_clip_norm   [MoE/ZeRO] Moe refactor with zero refactor (#5821)                                5 months ago
test_layer                            Merge pull request #6071 from wangbluo/ring_attention                             1 month ago
test_model                            [moe] add parallel strategy for shared_expert && fix test for deepseek (#6063)   2 months ago
__init__.py                           [shardformer] adapted T5 and LLaMa test to use kit (#4049)                        1 year ago
test_flash_attention.py               [Feature] Zigzag Ring attention (#5905)                                           3 months ago
test_shard_utils.py                   [misc] update pre-commit and run all files (#4752)                                1 year ago
test_with_torch_ddp.py                [misc] refactor launch API and tensor constructor (#5666)                         7 months ago