InternLM/internlm/model
Latest commit: d218a62b79 by zhanglei, 2023-09-12 18:04:48 +08:00
Merge branch 'develop' of github.com:InternLM/InternLM into feature_add_moe

Conflicts:
    internlm/core/context/parallel_context.py
    internlm/core/context/process_group_initializer.py
    internlm/model/modeling_internlm.py
    internlm/solver/optimizer/hybrid_zero_optim.py
    internlm/train/training_internlm.py
    internlm/utils/model_checkpoint.py
    train.py
__init__.py Merge develop to main (#233) 2023-08-24 22:03:04 +08:00
embedding.py [fix bug] Fix the error where RotaryEmbedding is converted to a non-fp32 format during training, and add a compatibility method for the llama model. (#239) See the fp32 sketch after this listing. 2023-08-26 17:48:08 +08:00
linear.py fix(model): set tensor parallel attribute for mlp (#271) 2023-09-05 19:03:02 +08:00
loss.py initial commit 2023-07-06 12:55:23 +08:00
metrics.py fix(metric): missing argument when getting loss metrics. (#256) 2023-08-31 17:44:39 +08:00
modeling_internlm.py Merge branch 'develop' of github.com:InternLM/InternLM into feature_add_moe 2023-09-12 18:04:48 +08:00
moe.py replace flash-attention experts with feedforward experts; see the feed-forward expert sketch after this listing. 2023-09-08 18:04:57 +08:00
multi_head_attention.py Merge develop to main (#233) 2023-08-24 22:03:04 +08:00
norm.py Merge develop to main (#233) 2023-08-24 22:03:04 +08:00
utils.py Merge develop to main (#233) 2023-08-24 22:03:04 +08:00
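
The embedding.py fix above concerns mixed precision: a rotary-embedding cos/sin cache built once at startup can be silently cast to fp16/bf16 during training and lose positional accuracy. The sketch below is illustrative only, not InternLM's actual embedding.py; the names build_rope_cache and apply_rope are hypothetical. It shows the general technique: keep the cache in fp32 and cast to the activation dtype only when the rotation is applied.

import torch


def build_rope_cache(seq_len: int, head_dim: int, base: float = 10000.0):
    # Precompute cos/sin tables for rotary position embedding, kept in fp32.
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2, dtype=torch.float32) / head_dim))
    t = torch.arange(seq_len, dtype=torch.float32)
    freqs = torch.outer(t, inv_freq)     # (seq_len, head_dim // 2)
    return freqs.cos(), freqs.sin()      # never down-cast; stored as float32


def apply_rope(x, cos, sin):
    # Rotate channel pairs; do the math in fp32, return in x's original dtype.
    x1, x2 = x.float().chunk(2, dim=-1)  # upcast activations for the rotation
    cos, sin = cos[: x.shape[-2]], sin[: x.shape[-2]]
    out = torch.cat((x1 * cos - x2 * sin, x2 * cos + x1 * sin), dim=-1)
    return out.to(x.dtype)               # cast back only at the very end


if __name__ == "__main__":
    cos, sin = build_rope_cache(seq_len=2048, head_dim=64)
    q = torch.randn(1, 2048, 64, dtype=torch.float16)
    print(apply_rope(q, cos, sin).dtype)  # torch.float16; the math ran in fp32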
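
The moe.py change replaces the expert module inside the MoE layer with a plain feed-forward block. Below is a minimal sketch of such an expert, assuming a SwiGLU-style MLP as used in LLaMA-family models; FeedForwardExpert and its layer names are hypothetical, not the repository's actual class.

import torch
import torch.nn as nn
import torch.nn.functional as F


class FeedForwardExpert(nn.Module):
    # A standard gated feed-forward block usable as a single MoE expert.
    def __init__(self, hidden_size: int, intermediate_size: int):
        super().__init__()
        self.w1 = nn.Linear(hidden_size, intermediate_size, bias=False)  # gate projection
        self.w2 = nn.Linear(hidden_size, intermediate_size, bias=False)  # up projection
        self.w3 = nn.Linear(intermediate_size, hidden_size, bias=False)  # down projection

    def forward(self, x):
        return self.w3(F.silu(self.w1(x)) * self.w2(x))


if __name__ == "__main__":
    experts = nn.ModuleList(FeedForwardExpert(512, 1376) for _ in range(4))
    tokens = torch.randn(8, 512)
    print(experts[0](tokens).shape)  # torch.Size([8, 512])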