InternLM/internlm
Wenwen Qu 6cf0fec314 replace flashatten experts by feedforward experts 2023-09-08 18:04:57 +08:00
Name | Last commit | Date
apis | initial commit | 2023-07-06 12:55:23 +08:00
core | use dummy mode to generate random numbers in model construction | 2023-09-08 17:56:42 +08:00
data | feat(data/utils.py): add new dataset type code for streaming dataset (#225) | 2023-08-24 13:46:18 +08:00
initialize | remove moe_loss_coeff parameter passing | 2023-08-31 18:44:58 +08:00
model | replace flashatten experts by feedforward experts | 2023-09-08 18:04:57 +08:00
moe | replace flashatten experts by feedforward experts | 2023-09-08 18:04:57 +08:00
monitor | feat(monitor): support monitor and alert (#175) | 2023-08-08 11:18:15 +08:00
solver | fix group_norms computing in hybrid_zero_optim | 2023-08-31 18:46:13 +08:00
train | replace flashatten experts by feedforward experts | 2023-09-08 18:04:57 +08:00
utils | remove moe_loss_coeff parameter passing | 2023-08-31 18:44:58 +08:00
__init__.py | initial commit | 2023-07-06 12:55:23 +08:00