ColossalAI/examples/language
Hongxin Liu 014837e725
[shardformer] support pipeline for deepseek v3 and optimize lora save (#6188)
* [shardformer] support pipeline for deepseek v3

* [checkpointio] fix lora save

* [devops] update ci env

* [booster] optimize lora

* fix test

* fix test
2025-02-14 14:48:54 +08:00
bert                      [Device]Support npu (#6159)                                                    2024-12-17 15:42:39 +08:00
commons                   [example] make gpt example directory more clear (#2353)                        2023-01-06 11:11:26 +08:00
deepseek                  [shardformer] support pipeline for deepseek v3 and optimize lora save (#6188)  2025-02-14 14:48:54 +08:00
gpt                       [Feature] Split cross-entropy computation in SP (#5959)                        2024-09-10 12:06:50 +08:00
grok-1                    [misc] refactor launch API and tensor constructor (#5666)                      2024-04-29 10:40:11 +08:00
llama                     [Device]Support npu (#6159)                                                    2024-12-17 15:42:39 +08:00
mixtral                   [Zerobubble] merge main. (#6142)                                               2024-11-19 19:00:36 +08:00
opt                       [fp8] Merge feature/fp8_comm to main branch of Colossalai (#6016)              2024-08-22 09:21:34 +08:00
palm                      [misc] refactor launch API and tensor constructor (#5666)                      2024-04-29 10:40:11 +08:00
__init__.py               [example]add gpt2 benchmark example script. (#5295)                            2024-03-04 16:18:13 +08:00
data_utils.py             [devops] remove post commit ci (#5566)                                         2024-04-08 15:09:40 +08:00
model_utils.py            [example]add gpt2 benchmark example script. (#5295)                            2024-03-04 16:18:13 +08:00
performance_evaluator.py  [shardformer] support ep for deepseek v3 (#6185)                               2025-02-11 16:10:25 +08:00