InternLM/internlm/model
Latest commit: d218a62b79 by zhanglei, 2023-09-12 18:04:48 +08:00
Merge branch 'develop' of github.com:InternLM/InternLM into feature_add_moe

Conflicts:
    internlm/core/context/parallel_context.py
    internlm/core/context/process_group_initializer.py
    internlm/model/modeling_internlm.py
    internlm/solver/optimizer/hybrid_zero_optim.py
    internlm/train/training_internlm.py
    internlm/utils/model_checkpoint.py
    train.py
__init__.py Merge develop to main (#233) 2023-08-24 22:03:04 +08:00
embedding.py [fix bug] Fix the error where RotaryEmbedding is converted to a non-fp32 format during training, and add a compatibility method for the llama model. (#239) See the fp32 sketch after this listing. 2023-08-26 17:48:08 +08:00
linear.py fix(model): set tensor parallel attribute for mlp (#271) 2023-09-05 19:03:02 +08:00
loss.py initial commit 2023-07-06 12:55:23 +08:00
metrics.py fix(metric): missing argument when getting loss metrics. (#256) 2023-08-31 17:44:39 +08:00
modeling_internlm.py Merge branch 'develop' of github.com:InternLM/InternLM into feature_add_moe 2023-09-12 18:04:48 +08:00
moe.py replace flash-attention experts with feedforward experts; see the feed-forward expert sketch after this listing. 2023-09-08 18:04:57 +08:00
multi_head_attention.py Merge develop to main (#233) 2023-08-24 22:03:04 +08:00
norm.py Merge develop to main (#233) 2023-08-24 22:03:04 +08:00
utils.py Merge develop to main (#233) 2023-08-24 22:03:04 +08:00
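
The embedding.py fix above concerns mixed precision: a rotary-embedding cos/sin cache built once at startup can be silently cast to fp16/bf16 during training and lose positional accuracy. The sketch below is illustrative only, not InternLM's actual embedding.py; the names build_rope_cache and apply_rope are hypothetical. It shows the general technique: keep the cache in fp32 and cast to the activation dtype only when the rotation is applied.

import torch


def build_rope_cache(seq_len: int, head_dim: int, base: float = 10000.0):
    # Precompute cos/sin tables for rotary position embedding, kept in fp32.
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2, dtype=torch.float32) / head_dim))
    t = torch.arange(seq_len, dtype=torch.float32)
    freqs = torch.outer(t, inv_freq)     # (seq_len, head_dim // 2)
    return freqs.cos(), freqs.sin()      # never down-cast; stored as float32


def apply_rope(x, cos, sin):
    # Rotate channel pairs; do the math in fp32, return in x's original dtype.
    x1, x2 = x.float().chunk(2, dim=-1)  # upcast activations for the rotation
    cos, sin = cos[: x.shape[-2]], sin[: x.shape[-2]]
    out = torch.cat((x1 * cos - x2 * sin, x2 * cos + x1 * sin), dim=-1)
    return out.to(x.dtype)               # cast back only at the very end


if __name__ == "__main__":
    cos, sin = build_rope_cache(seq_len=2048, head_dim=64)
    q = torch.randn(1, 2048, 64, dtype=torch.float16)
    print(apply_rope(q, cos, sin).dtype)  # torch.float16; the math ran in fp32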
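
The moe.py change replaces the expert module inside the MoE layer with a plain feed-forward block. Below is a minimal sketch of such an expert, assuming a SwiGLU-style MLP as used in LLaMA-family models; FeedForwardExpert and its layer names are hypothetical, not the repository's actual class.

import torch
import torch.nn as nn
import torch.nn.functional as F


class FeedForwardExpert(nn.Module):
    # A standard gated feed-forward block usable as a single MoE expert.
    def __init__(self, hidden_size: int, intermediate_size: int):
        super().__init__()
        self.w1 = nn.Linear(hidden_size, intermediate_size, bias=False)  # gate projection
        self.w2 = nn.Linear(hidden_size, intermediate_size, bias=False)  # up projection
        self.w3 = nn.Linear(intermediate_size, hidden_size, bias=False)  # down projection

    def forward(self, x):
        return self.w3(F.silu(self.w1(x)) * self.w2(x))


if __name__ == "__main__":
    experts = nn.ModuleList(FeedForwardExpert(512, 1376) for _ in range(4))
    tokens = torch.randn(8, 512)
    print(experts[0](tokens).shape)  # torch.Size([8, 512])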