Commit Graph

11 Commits (d218a62b798e8b3338f77e2a3266e9d9d0995ecb)

Author SHA1 Message Date
zhanglei d218a62b79 Merge branch 'develop' of github.com:InternLM/InternLM into feature_add_moe
Conflicts:
	internlm/core/context/parallel_context.py
	internlm/core/context/process_group_initializer.py
	internlm/model/modeling_internlm.py
	internlm/solver/optimizer/hybrid_zero_optim.py
	internlm/train/training_internlm.py
	internlm/utils/model_checkpoint.py
	train.py
2023-09-12 18:04:48 +08:00
Wenwen Qu cd6b28b073 use dummy mode to generate random numbers in model construction 2023-09-08 17:56:42 +08:00
Guoteng 37b8c6684e feat(utils): add timeout warpper for key functions (#286) 2023-09-07 17:26:17 +08:00
Wenwen Qu 7f687bf4b3 fix(core/context): use dummy mode to generate random numbers in model construction (#266)
* change mode to dummy in model construction and restore to data when done

* add comments

* move set_mode(.DATA) to initialize_model(.)
2023-09-06 14:34:11 +08:00
Sun Peng 860de0aa46 Feat/add runntime gpu test (#254)
* feat: add gpu bench

* feat/add allreduce runtime bench

---------

Co-authored-by: sunpengsdu <sunpengsdu@gmail.com>
2023-09-01 13:38:01 +08:00
Wenwen Qu b021995199 fix bugs 2023-08-30 16:14:33 +08:00
Wenwen Qu 629e6a5ad1 add comments for moe 2023-08-25 19:03:31 +08:00
Wenwen Qu c7f9d4f48c add expert data support and fix bugs 2023-08-10 16:07:35 +08:00
Wenwen Qu 84476833f3 modified: internlm/core/context/process_group_initializer.py
	modified: internlm/core/scheduler/no_pipeline_scheduler.py
	modified: internlm/solver/optimizer/hybrid_zero_optim.py
2023-08-08 15:59:12 +08:00
Wenwen Qu c357288a8b feat(XXX): add moe 2023-08-07 20:17:49 +08:00
Sun Peng fa7337b37b initial commit 2023-07-06 12:55:23 +08:00