zhanglei | d218a62b79 | 2023-09-12 18:04:48 +08:00
    Merge branch 'develop' of github.com:InternLM/InternLM into feature_add_moe
    Conflicts:
        internlm/core/context/parallel_context.py
        internlm/core/context/process_group_initializer.py
        internlm/model/modeling_internlm.py
        internlm/solver/optimizer/hybrid_zero_optim.py
        internlm/train/training_internlm.py
        internlm/utils/model_checkpoint.py
        train.py

Wenwen Qu | cd6b28b073 | 2023-09-08 17:56:42 +08:00
    use dummy mode to generate random numbers in model construction

Guoteng | 37b8c6684e | 2023-09-07 17:26:17 +08:00
    feat(utils): add timeout warpper for key functions (#286)

Wenwen Qu | 7f687bf4b3 | 2023-09-06 14:34:11 +08:00
    fix(core/context): use dummy mode to generate random numbers in model construction (#266)
    * change mode to dummy in model construction and restore to data when done
    * add comments
    * move set_mode(.DATA) to initialize_model(.)

Sun Peng | 860de0aa46 | 2023-09-01 13:38:01 +08:00
    Feat/add runntime gpu test (#254)
    * feat: add gpu bench
    * feat/add allreduce runtime bench
    Co-authored-by: sunpengsdu <sunpengsdu@gmail.com>

Wenwen Qu | b021995199 | 2023-08-30 16:14:33 +08:00
    fix bugs

Wenwen Qu | 629e6a5ad1 | 2023-08-25 19:03:31 +08:00
    add comments for moe

Wenwen Qu | c7f9d4f48c | 2023-08-10 16:07:35 +08:00
    add expert data support and fix bugs

Wenwen Qu | 84476833f3 | 2023-08-08 15:59:12 +08:00
    modified: internlm/core/context/process_group_initializer.py
    modified: internlm/core/scheduler/no_pipeline_scheduler.py
    modified: internlm/solver/optimizer/hybrid_zero_optim.py

Wenwen Qu | c357288a8b | 2023-08-07 20:17:49 +08:00
    feat(XXX): add moe

Sun Peng | fa7337b37b | 2023-07-06 12:55:23 +08:00
    initial commit