InternLM

Sun Peng ef851d16c6 Feat/optimizer (#194)	2023-08-15 18:55:10 +08:00
* feat(optimier.py): reduce memory footprint and avoid _check_overflow call
* feat(optimier.py): reduce memory footprint and avoid _check_overflow call
* feat(optimizer.py): overlap compute norm with allreduce
* update var and function name
* update function compute norm (#197)
Co-authored-by: ChenQiaoling00 <qiaoling_chen@u.nus.edu>
* feat(optimizer/hybrid_zero_optim.py): overlap gradients last bucket allreduce and compute norm (#196)
* support gradients allreduce and compute norm overlap
* fix para set error
* remove timer cal_norm for testing
* feat(optimizer/hybrid_zero_optim.py): support group global norm
* format(lint): fix lint error
* feat(optimizer/store.py): update code based on comment
---------
Co-authored-by: ChenQiaoling00 <qiaoling_chen@u.nus.edu>
Co-authored-by: huangting4201 <1538303371@qq.com>
__init__.py	feat(model/metrics.py): support calculating accuracy and perplexity m… (#91)	2023-07-26 16:22:10 +08:00
embedding.py	feat(monitor): support monitor and alert (#175)	2023-08-08 11:18:15 +08:00
linear.py	feat(monitor): support monitor and alert (#175)	2023-08-08 11:18:15 +08:00
loss.py	initial commit	2023-07-06 12:55:23 +08:00
metrics.py	feat(*): support not-flash-attn for pp and no-pp (#145)	2023-07-28 16:13:04 +08:00
modeling_internlm.py	feat(monitor): support monitor and alert (#175)	2023-08-08 11:18:15 +08:00
multi_head_attention.py	feat(*): support sequence_parallel (#180)	2023-08-07 16:42:52 +08:00
norm.py	Feat/optimizer (#194)	2023-08-15 18:55:10 +08:00
utils.py	Feat/optimizer (#194)	2023-08-15 18:55:10 +08:00