InternLM/internlm
Wenwen Qu dccdfc7e4e support mixtral-7x8b 2024-01-16 19:23:11 +08:00
apis feat(tools): support origin internlm architecture in web_demo (#478) 2023-11-09 20:01:55 +08:00
core add output embedding tf32 option (#523) 2023-12-06 13:50:59 +08:00
data fix the type_ids when micro_num=1 and use_flash_attn=False (#516) 2023-12-06 14:38:28 +08:00
initialize feat(grad_norm): vocab grad norm profiling (#519) 2023-12-06 13:52:42 +08:00
model support mixtral-7x8b 2024-01-16 19:23:11 +08:00
moe fix(moe): remove norm&gate force sync (#448) 2023-11-01 11:29:55 +08:00
monitor fix(alert): send exception of all ranks (#491) 2023-11-10 19:04:31 +08:00
solver feat(grad_norm): vocab grad norm profiling (#519) 2023-12-06 13:52:42 +08:00
train fix the type_ids when micro_num=1 and use_flash_attn=False (#516) 2023-12-06 14:38:28 +08:00
utils feat(ckpt): support auto resume in Volc and Ali (#529) 2023-12-12 13:27:24 +08:00
__init__.py initial commit 2023-07-06 12:55:23 +08:00