Commit Graph

9 Commits (84476833f356befe6b50f22ad162c43a7964a1b0)

Author SHA1 Message Date
Wenwen Qu 84476833f3 modified: internlm/core/context/process_group_initializer.py
modified: internlm/core/scheduler/no_pipeline_scheduler.py
modified: internlm/solver/optimizer/hybrid_zero_optim.py
2023-08-08 15:59:12 +08:00
Wenwen Qu c357288a8b feat(XXX): add moe 2023-08-07 20:17:49 +08:00
ytxiong 853becfb6e
feat(*): support fp32 training (#155)
* support float32 training

* fix lint

* add adaptation in model/utils.py

* remove some unnecessary code

* fix lint

* feat(optim): add support for fp32 zero

* Revert "Merge pull request #2 from SolenoidWGT/fp32_zero"

This reverts commit 53fc50b0e5, reversing
changes made to 40f24d0a73.

revert commit

* merge develop

* Update utils.py

* support fp32 in zero optimizer

* modify the dtype

---------

Co-authored-by: wangguoteng.p <wangguoteng925@qq.com>
2023-08-04 16:05:30 +08:00
ytxiong d67be17f96
refactor(*): refactor the code with no-apex (#170)
* support no-apex

* add default for use_apex

* fix lint

* modify the RMSNormTorch

* remove some comments

* remove use_apex parameter

* remove some unnecessary code

* optimize the code including import

* remove the import RMSNorm

* remove warnings
2023-08-03 11:24:12 +08:00
ytxiong 1c397f523f
feat(*): support no apex (#166)
* support no-apex

* add default for use_apex

* fix lint

* modify the RMSNormTorch

* remove some comments

* remove use_apex parameter

* remove some unnecessary code
2023-08-02 20:32:38 +08:00
huangting4201 8b1717a05d
style(solver/optimizer/utils.py): fix lint error (#147)
Co-authored-by: huangting.p <huangting@sensetime.com>
2023-07-28 10:48:06 +08:00
Sun Peng ad10b8e03f
fix(optimizer/util.py) change inf definition 2023-07-27 10:12:51 +08:00
huangting4201 762ab297ee
feat(core/scheduler): support pipeline parallel (#98)
* feat(utils/writer.py): support tensorboard writer

* feat(utils/writer.py): add class comment

* feat(core): support pipeline parallel

* fix(core): fix demo running error

* feat(solver/optimizer): add pp zero optimizer

* fix(solver/optimizer): fix word spelling error

* feat(core/scheduler): add new dir scheduler in core/

* fix(core): fix ci lint error

* feat(solver/optimizer): merge pp and nopp optimizer

* doc(usage.md): update usage doc

* feat(core/scheduler): support post func

* feat(core/scheduler): add dtype para in pp sche and update func get_tensor_shape

* feat(core/scheduler): add _load_micro_batch in base scheduler

* feat(core/scheduler): support optimizer overlap communication in pp scheduler

* feat(core/scheduler): delete data process func code

* feat(core/trainer): schedule pre processing for all schedule

---------

Co-authored-by: 黄婷 <huangting3@CN0014010744M.local>
Co-authored-by: huangting.p <huangting@sensetime.com>
2023-07-24 20:52:09 +08:00
Sun Peng fa7337b37b initial commit 2023-07-06 12:55:23 +08:00