Commit Graph

9 Commits (6cf0fec314a39b90ff1a7061ddfb9d40ab0bde08)

Author SHA1 Message Date
Wenwen Qu 6cf0fec314 replace flashatten experts by feedforward experts 2023-09-08 18:04:57 +08:00
Wenwen Qu cd6b28b073 use dummy mode to generate random numbers in model construction 2023-09-08 17:56:42 +08:00
Wenwen Qu 2ad5f512b5 remove moe_loss_coeff parameter passing 2023-08-31 18:44:58 +08:00
Wenwen Qu b021995199 fix bugs 2023-08-30 16:14:33 +08:00
Wenwen Qu 629e6a5ad1 add comments for moe 2023-08-25 19:03:31 +08:00
Wenwen Qu aa2612edc4 Merge branch 'develop' into feature_add_moe 2023-08-25 13:35:56 +08:00
Guoteng 7c820cfa40 feat(init): add skip args check flag and add zero overlap flag (#222) 2023-08-24 16:44:18 +08:00
* feat(init): add skip args check flag
* fix(optim): add param overlap enable flag
Wenwen Qu 409f139ba5 merge 2023-08-24 16:38:36 +08:00
huangting4201 94b2aa28fc Feat/example training internlm (#212) 2023-08-24 10:00:15 +08:00
* feat(train/training_internlm.py): move common init funcs to internlm/train
* feat(train/training_internlm.py): update some public funcs
* feat(train/training_internlm.py): update some public funcs
* feat(evaluation.py): adapt evaluate to streaming dataset
* feat(train/training_internlm.py): minor update based on comments
* fix(training_internlm.py): set train dataloader persistent_workers true only when num_worker>0
* fix(training_internlm.py): fix demo error