ColossalAI

duanjunwen f48a85e91d [fix] fix test_lora in llama policy		2024-11-15 10:27:13 +00:00
__init__.py	[shardformer] init shardformer code structure (#3731)	2023-07-04 16:05:01 +08:00
auto_policy.py	[FP8] rebase main (#5963)	2024-08-06 16:29:37 +08:00
base_policy.py	[fp8] Merge feature/fp8_comm to main branch of Colossalai (#6016)	2024-08-22 09:21:34 +08:00
bert.py	[feat] update mixtral policy & bert policy for zerobubble	2024-11-14 02:51:34 +00:00
blip2.py	[shardformer] fix linear 1d row and support uneven splits for fused qkv linear (#6084)	2024-10-10 14:34:45 +08:00
bloom.py	[shardformer] optimize seq parallelism (#6086)	2024-10-11 13:44:40 +08:00
chatglm2.py	[shardformer] optimize seq parallelism (#6086)	2024-10-11 13:44:40 +08:00
command.py	[zerobubble] rebase main (#6075)	2024-10-08 15:58:00 +08:00
deepseek.py	[zerobubble] rebase main (#6075)	2024-10-08 15:58:00 +08:00
falcon.py	[zerobubble] rebase main (#6075)	2024-10-08 15:58:00 +08:00
gpt2.py	[shardformer] optimize seq parallelism (#6086)	2024-10-11 13:44:40 +08:00
gptj.py	[shardformer] optimize seq parallelism (#6086)	2024-10-11 13:44:40 +08:00
llama.py	[fix] fix test_lora in llama policy	2024-11-15 10:27:13 +00:00
mistral.py	[feat] support mixtral policy with zbv tp_Linear & non_tp_Linear	2024-11-12 07:28:49 +00:00
mixtral.py	[fix] fix mixtral modeling & policy; update wait handles; doing benchmarking for llama hybrid;	2024-11-15 05:58:56 +00:00
opt.py	[zerobubble] rebase main (#6075)	2024-10-08 15:58:00 +08:00
qwen2.py	[zerobubble] rebase main (#6075)	2024-10-08 15:58:00 +08:00
sam.py	[shardformer] fix linear 1d row and support uneven splits for fused qkv linear (#6084)	2024-10-10 14:34:45 +08:00
t5.py	[zerobubble] rebase main (#6075)	2024-10-08 15:58:00 +08:00
vit.py	[zerobubble] rebase main (#6075)	2024-10-08 15:58:00 +08:00
whisper.py	[zerobubble] rebase main (#6075)	2024-10-08 15:58:00 +08:00