ColossalAI/examples/language
duanjunwen 4fc92aa77d [feat] support no_tp Linear for sharderformer.llama 2024-11-05 05:55:42 +00:00
| Name | Last commit | Date |
|------|-------------|------|
| bert | [zerobubble] rebase main (#6075) | 2024-10-08 15:58:00 +08:00 |
| commons | [example] make gpt example directory more clear (#2353) | 2023-01-06 11:11:26 +08:00 |
| deepseek | [zerobubble] rebase main (#6075) | 2024-10-08 15:58:00 +08:00 |
| gpt | [zerobubble] rebase main (#6075) | 2024-10-08 15:58:00 +08:00 |
| grok-1 | [misc] refactor launch API and tensor constructor (#5666) | 2024-04-29 10:40:11 +08:00 |
| llama | [feat] support no_tp Linear for sharderformer.llama | 2024-11-05 05:55:42 +00:00 |
| mixtral | [fix] fix llama, mixtral benchmark zbv loss none bug; update mixtral & llama policy and modeling; | 2024-10-11 07:32:43 +00:00 |
| opt | [fp8] Merge feature/fp8_comm to main branch of Colossalai (#6016) | 2024-08-22 09:21:34 +08:00 |
| palm | [misc] refactor launch API and tensor constructor (#5666) | 2024-04-29 10:40:11 +08:00 |
| __init__.py | [example] add gpt2 benchmark example script. (#5295) | 2024-03-04 16:18:13 +08:00 |
| data_utils.py | [devops] remove post commit ci (#5566) | 2024-04-08 15:09:40 +08:00 |
| model_utils.py | [example] add gpt2 benchmark example script. (#5295) | 2024-03-04 16:18:13 +08:00 |
| performance_evaluator.py | [feat] support meta cache, meta_grad_send, meta_tensor_send; fix runtime too long in Recv Bwd; benchmark for llama + Hybrid(tp+pp); | 2024-10-24 07:30:19 +00:00 |