Merge branch 'zww/update-internlm2.5' into 'main'

Zww/update internlm2.5

See merge request openmmlab/bigmodel/InternLM!9
pull/752/head
zhangwenwei 2024-06-30 15:36:27 +00:00
commit 25e550de74
16 changed files with 80 additions and 168 deletions

View File

@ -36,18 +36,18 @@
## Introduction
InternLM2 series are released with the following features:
InternLM2.5 series are released with the following features:
- **200K Context window**: Nearly perfect at finding needles in the haystack with 200K-long context, with leading performance on long-context tasks like LongBench and L-Eval. Try it with [LMDeploy](./chat/lmdeploy.md) for 200K-context inference.
- **Outstanding reasoning capability**: State-of-the-art performance on Math reasoning, surpassing models like Llama3 and Gemma2-9B.
- **Outstanding comprehensive performance**: Significantly better than the last generation in all dimensions, especially in reasoning, math, code, chat experience, instruction following, and creative writing, with leading performance among open-source models in similar sizes. In some evaluations, InternLM2-Chat-20B may match or even surpass ChatGPT (GPT-3.5).
- **1M Context window**: Nearly perfect at finding needles in the haystack with 1M-long context, with leading performance on long-context tasks like LongBench. Try it with [LMDeploy](./chat/lmdeploy.md) for 1M-context inference.
- **Code interpreter & Data analysis**: With code interpreter, InternLM2-Chat-20B obtains compatible performance with GPT-4 on GSM8K and MATH. InternLM2-Chat also provides data analysis capability.
- **Stronger tool use**: Based on better tool utilization-related capabilities in instruction following, tool selection and reflection, InternLM2 can support more kinds of agents and multi-step tool calling for complex tasks. See [examples](./agent/).
- **Stronger tool use**: InternLM2.5 supports gathering information from more than 100 web pages, corresponding implementation will be released in [Lagent](https://github.com/InternLM/lagent/tree/main) soon. InternLM2.5 has better tool utilization-related capabilities in instruction following, tool selection and reflection. See [examples](./agent/).
## News
\[2024.06.30\] We release InternLM2.5-7B, InternLM2.5-7B-Chat and InternLM2.5-7B-Chat-1M. See [model zoo below](#model-zoo) for download or [model cards](./model_cards/) for more details.
\[2024.03.26\] We release InternLM2 technical report. See [arXiv](https://arxiv.org/abs/2403.17297) for details.
\[2024.01.31\] We release InternLM2-1.8B, along with the associated chat model. They provide a cheaper deployment option while maintaining leading performance.
@ -64,28 +64,17 @@ InternLM2 series are released with the following features:
| Model | Transformers(HF) | ModelScope(HF) | OpenXLab(HF) | OpenXLab(Origin) | Release Date |
| --------------------------- | ----------------------------------------- | ---------------------------------------- | -------------------------------------- | ------------------------------------------ | ------------ |
| **InternLM2-1.8B** | [🤗internlm2-1.8b](https://huggingface.co/internlm/internlm2-1_8b) | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-1.8b](https://www.modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-1_8b/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-base-1.8b) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-base-1.8b-original) | 2024-01-31 |
| **InternLM2-Chat-1.8B-SFT** | [🤗internlm2-chat-1.8b-sft](https://huggingface.co/internlm/internlm2-chat-1_8b-sft) | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-chat-1.8b-sft](https://www.modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-chat-1_8b-sft/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-chat-1.8b-sft) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-chat-1.8b-sft-original) | 2024-01-31 |
| **InternLM2-Chat-1.8B** | [🤗internlm2-chat-1.8b](https://huggingface.co/internlm/internlm2-chat-1_8b) | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-chat-1.8b](https://www.modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-chat-1_8b/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-chat-1.8b) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-chat-1.8b-original) | 2024-02-19 |
| **InternLM2-Base-7B** | [🤗internlm2-base-7b](https://huggingface.co/internlm/internlm2-base-7b) | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-base-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-base-7b/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-base-7b) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-base-7b-original) | 2024-01-17 |
| **InternLM2-7B** | [🤗internlm2-7b](https://huggingface.co/internlm/internlm2-7b) | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-7b/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-7b) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-7b-original) | 2024-01-17 |
| **InternLM2-Chat-7B-SFT** | [🤗internlm2-chat-7b-sft](https://huggingface.co/internlm/internlm2-chat-7b-sft) | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-chat-7b-sft](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-chat-7b-sft/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-chat-7b-sft) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-chat-7b-sft-original) | 2024-01-17 |
| **InternLM2-Chat-7B** | [🤗internlm2-chat-7b](https://huggingface.co/internlm/internlm2-chat-7b) | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-chat-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-chat-7b/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-chat-7b) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-chat-7b-original) | 2024-01-17 |
| **InternLM2-Base-20B** | [🤗internlm2-base-20b](https://huggingface.co/internlm/internlm2-base-20b) | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-base-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-base-20b/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-base-20b) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-base-20b-original) | 2024-01-17 |
| **InternLM2-20B** | [🤗internlm2-20b](https://huggingface.co/internlm/internlm2-20b) | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-20b/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-20b) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-20b-original) | 2024-01-17 |
| **InternLM2-Chat-20B-SFT** | [🤗internlm2-chat-20b-sft](https://huggingface.co/internlm/internlm2-chat-20b-sft) | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-chat-20b-sft](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-chat-20b-sft/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-chat-20b-sft) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-chat-20b-sft-original) | 2024-01-17 |
| **InternLM2-Chat-20B** | [🤗internlm2-chat-20b](https://huggingface.co/internlm/internlm2-chat-20b) | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-chat-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-chat-20b/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-chat-20b) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-chat-20b-original) | 2024-01-17 |
| **InternLM2.5-7B** | [🤗internlm2_5-7b](https://huggingface.co/internlm/internlm2_5-7b) | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-7b/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-7b) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-7b-original) | 2024-06-30 |
| **InternLM2.5-7B-Chat** | [🤗internlm2_5-7b-chat](https://huggingface.co/internlm/internlm2_5-7B-chat) | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2_5-7b-chat-sft](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2_5-7B-chat/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2_5-7B-chat) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2_5-7B-chat-original) | 2024-06-30 |
| **InternLM2.5-7B-Chat-1M** | [🤗internlm2_5-7b-chat](https://huggingface.co/internlm/internlm2_5-7b-chat) | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2_5-7b-chat](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2_5-7b-chat/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2_5-7b-chat) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2_5-7b-chat-original) | 2024-06-30 |
**Notes:**
The release of InternLM2 series contains two model sizes: 7B and 20B. 7B models are efficient for research and application and 20B models are more powerful and can support more complex scenarios. The relation of these models are shown as follows.
The release of InternLM2.5 series contains 7B model size for now and we are going to release the 1.8B and 20B versions soon. 7B models are efficient for research and application and 20B models are more powerful and can support more complex scenarios. The relation of these models are shown as follows.
![](https://internlm.oss-cn-shanghai.aliyuncs.com/series.png)
1. **InternLM2-Base**: Foundation models with high quality and high adaptation flexibility, which serve as a good starting point for downstream deep adaptations.
2. **InternLM2**: Further pretrain with general domain data and domain-enhanced corpus, obtaining state-of-the-art performance in evaluation with good language capability. InternLM2 models are recommended for consideration in most applications.
3. **InternLM2-Chat-SFT**: Intermediate version of InternLM2-Chat that only undergoes supervised fine-tuning (SFT), based on the InternLM2-Base model. We release them to benefit research on alignment.
4. **InternLM2-Chat**: Further aligned on top of InternLM2-Chat-SFT through online RLHF. InternLM2-Chat exhibits better instruction following, chat experience, and function call, which is recommended for downstream applications.
1. **InternLM2.5**: Foundation models pre-trained on large-scale corpus. InternLM2.5 models are recommended for consideration in most applications.
2. **InternLM2.5-Chat**: The Chat model that undergoes supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF), based on the InternLM2.5 model. InternLM2.5-Chat is optimized for instruction following, chat experience, and function call, which is recommended for downstream applications.
3. **InternLM2.5-Chat-1M**: InternLM2.5-Chat-1M supports 1M long-context with compatible performance as InternLM2.5-Chat.
**Limitations:** Although we have made efforts to ensure the safety of the model during the training process and to encourage the model to generate text that complies with ethical and legal requirements, the model may still produce unexpected outputs due to its size and probabilistic generation paradigm. For example, the generated responses may contain biases, discrimination, or other harmful content. Please do not propagate such content. We are not responsible for any consequences resulting from the dissemination of harmful information.
@ -95,45 +84,8 @@ The release of InternLM2 series contains two model sizes: 7B and 20B. 7B models
### Objective Evaluation
| Dataset | Baichuan2-7B-Chat | Mistral-7B-Instruct-v0.2 | Qwen-7B-Chat | InternLM2-Chat-7B | ChatGLM3-6B | Baichuan2-13B-Chat | Mixtral-8x7B-Instruct-v0.1 | Qwen-14B-Chat | InternLM2-Chat-20B |
| ---------------- | ----------------- | ------------------------ | ------------ | ----------------- | ----------- | ------------------ | -------------------------- | ------------- | ------------------ |
| MMLU | 50.1 | 59.2 | 57.1 | 63.7 | 58.0 | 56.6 | 70.3 | 66.7 | 66.5 |
| CMMLU | 53.4 | 42.0 | 57.9 | 63.0 | 57.8 | 54.8 | 50.6 | 68.1 | 65.1 |
| AGIEval | 35.3 | 34.5 | 39.7 | 47.2 | 44.2 | 40.0 | 41.7 | 46.5 | 50.3 |
| C-Eval | 53.9 | 42.4 | 59.8 | 60.8 | 59.1 | 56.3 | 54.0 | 71.5 | 63.0 |
| TrivialQA | 37.6 | 35.0 | 46.1 | 50.8 | 38.1 | 40.3 | 57.7 | 54.5 | 53.9 |
| NaturalQuestions | 12.8 | 8.1 | 18.6 | 24.1 | 14.0 | 12.7 | 22.5 | 22.9 | 25.9 |
| C3 | 78.5 | 66.9 | 84.4 | 91.5 | 79.3 | 84.4 | 82.1 | 91.5 | 93.5 |
| CMRC | 8.1 | 5.6 | 14.6 | 63.8 | 43.2 | 27.8 | 5.3 | 13.0 | 50.4 |
| WinoGrande | 49.9 | 50.8 | 54.2 | 65.8 | 61.7 | 50.9 | 60.9 | 55.7 | 74.8 |
| BBH | 35.9 | 46.5 | 45.5 | 61.2 | 56.0 | 42.5 | 57.3 | 55.8 | 68.3 |
| GSM-8K | 32.4 | 48.3 | 44.1 | 70.7 | 53.8 | 56.0 | 71.7 | 57.7 | 79.6 |
| Math | 5.7 | 8.6 | 12.0 | 23.0 | 20.4 | 4.3 | 22.5 | 27.6 | 31.9 |
| HumanEval | 17.7 | 35.4 | 36.0 | 59.8 | 52.4 | 19.5 | 37.8 | 40.9 | 67.1 |
| MBPP | 37.7 | 25.7 | 33.9 | 51.4 | 55.6 | 40.9 | 40.9 | 30.0 | 65.8 |
- Performance of MBPP is reported with MBPP(Sanitized)
### Alignment Evaluation
- We have evaluated our model on [AlpacaEval 2.0](https://tatsu-lab.github.io/alpaca_eval/) and InternLM2-Chat-20B surpass Claude 2, GPT-4(0613) and Gemini Pro.
| Model Name | Win Rate | Length |
| ------------------ | -------- | ------ |
| GPT-4 Turbo | 50.00% | 2049 |
| GPT-4 | 23.58% | 1365 |
| GPT-4 0314 | 22.07% | 1371 |
| Mistral Medium | 21.86% | 1500 |
| XwinLM 70b V0.1 | 21.81% | 1775 |
| InternLM2 Chat 20B | 21.75% | 2373 |
| Mixtral 8x7B v0.1 | 18.26% | 1465 |
| Claude 2 | 17.19% | 1069 |
| Gemini Pro | 16.85% | 1315 |
| GPT-4 0613 | 15.76% | 1140 |
| Claude 2.1 | 15.73% | 1096 |
- According to the released performance of 2024-01-17.
## Requirements
- Python >= 3.8
@ -157,9 +109,9 @@ To load the InternLM2-7B-Chat model using Transformers, use the following code:
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("internlm/internlm2-chat-7b", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("internlm/internlm2_5-7b-chat", trust_remote_code=True)
# Set `torch_dtype=torch.float16` to load model in float16, otherwise it will be loaded as float32 and might cause OOM Error.
model = AutoModelForCausalLM.from_pretrained("internlm/internlm2-chat-7b", device_map="auto", trust_remote_code=True, torch_dtype=torch.float16)
model = AutoModelForCausalLM.from_pretrained("internlm/internlm2_5-7b-chat", device_map="auto", trust_remote_code=True, torch_dtype=torch.float16)
# (Optional) If on low resource devices, you can load model in 4-bit or 8-bit to further save GPU memory via bitsandbytes.
# InternLM 7B in 4bit will cost nearly 8GB GPU memory.
# pip install -U bitsandbytes
@ -180,7 +132,7 @@ To load the InternLM2-7B-Chat model using ModelScope, use the following code:
```python
import torch
from modelscope import snapshot_download, AutoTokenizer, AutoModelForCausalLM
model_dir = snapshot_download('Shanghai_AI_Laboratory/internlm2-chat-7b')
model_dir = snapshot_download('Shanghai_AI_Laboratory/internlm2_5-7b-chat')
tokenizer = AutoTokenizer.from_pretrained(model_dir, device_map="auto", trust_remote_code=True)
# Set `torch_dtype=torch.float16` to load model in float16, otherwise it will be loaded as float32 and might cause OOM Error.
model = AutoModelForCausalLM.from_pretrained(model_dir, device_map="auto", trust_remote_code=True, torch_dtype=torch.float16)
@ -210,26 +162,26 @@ streamlit run ./chat/web_demo.py
We use [LMDeploy](https://github.com/InternLM/LMDeploy) for fast deployment of InternLM.
With only 4 lines of codes, you can perform `internlm2-chat-7b` inference after `pip install lmdeploy>=0.2.1`.
With only 4 lines of codes, you can perform `internlm2_5-7b-chat` inference after `pip install lmdeploy>=0.2.1`.
```python
from lmdeploy import pipeline
pipe = pipeline("internlm/internlm2-chat-7b")
pipe = pipeline("internlm/internlm2_5-7b-chat")
response = pipe(["Hi, pls intro yourself", "Shanghai is"])
print(response)
```
Please refer to the [guidance](./chat/lmdeploy.md) for more usages about model deployment. For additional deployment tutorials, feel free to explore [here](https://github.com/InternLM/LMDeploy).
### 200K-long-context Inference
### 1M-long-context Inference
By enabling the Dynamic NTK feature of LMDeploy, you can acquire the long-context inference power.
```python
from lmdeploy import pipeline, GenerationConfig, TurbomindEngineConfig
backend_config = TurbomindEngineConfig(rope_scaling_factor=2.0, session_len=200000)
pipe = pipeline('internlm/internlm2-chat-7b', backend_config=backend_config)
backend_config = TurbomindEngineConfig(rope_scaling_factor=2.5, session_len=1048576)
pipe = pipeline('internlm/internlm2_5-7b-chat-1m', backend_config=backend_config)
prompt = 'Use a long prompt to replace this sentence'
response = pipe(prompt)
print(response)
@ -237,7 +189,7 @@ print(response)
## Agent
InternLM2-Chat models have excellent tool utilization capabilities and can work with function calls in a zero-shot manner. See more examples in [agent session](./agent/).
InternLM2.5-Chat models have excellent tool utilization capabilities and can work with function calls in a zero-shot manner. It also supports to conduct analysis by collecting information from more than 100 web pages. See more examples in [agent session](./agent/).
## Fine-tuning
@ -247,7 +199,7 @@ Please refer to [finetune docs](./finetune/) for fine-tuning with InternLM.
## Evaluation
We utilize [OpenCompass](https://github.com/open-compass/opencompass) for model evaluation. In InternLM-2, we primarily focus on standard objective evaluation, long-context evaluation (needle in a haystack), data contamination assessment, agent evaluation, and subjective evaluation.
We utilize [OpenCompass](https://github.com/open-compass/opencompass) for model evaluation. In InternLM2.5, we primarily focus on standard objective evaluation, long-context evaluation (needle in a haystack), data contamination assessment, agent evaluation, and subjective evaluation.
### Objective Evaluation

View File

@ -36,15 +36,16 @@
## 简介
InternLM2 系列模型在本仓库正式发布,具有如下特性:
InternLM2.5 系列模型在本仓库正式发布,具有如下特性:
- 有效支持20万字超长上下文模型在 20 万字长输入中几乎完美地实现长文“大海捞针”,而且在 LongBench 和 L-Eval 等长文任务中的表现也达到开源模型中的领先水平。 可以通过 [LMDeploy](./chat/lmdeploy_zh_cn.md) 尝试20万字超长上下文推理。
- 综合性能全面提升:各能力维度相比上一代模型全面进步,在推理、数学、代码、对话体验、指令遵循和创意写作等方面的能力提升尤为显著,综合性能达到同量级开源模型的领先水平,在重点能力评测上 InternLM2-Chat-20B 能比肩甚至超越 ChatGPT GPT-3.5)。
- 代码解释器与数据分析在配合代码解释器code-interpreter的条件下InternLM2-Chat-20B 在 GSM8K 和 MATH 上可以达到和 GPT-4 相仿的水平。基于在数理和工具方面强大的基础能力InternLM2-Chat 提供了实用的数据分析能力。
- 工具调用能力整体升级:基于更强和更具有泛化性的指令理解、工具筛选与结果反思等能力,新版模型可以更可靠地支持复杂智能体的搭建,支持对工具进行有效的多轮调用,完成较复杂的任务。可以查看更多[样例](./agent/)。
- 卓越的推理性能:在数学推理方面取得了同量级模型最优精度,超越了 Llama3 和 Gemma2-9B。
- 有效支持百万字超长上下文:模型在 1 百万字长输入中几乎完美地实现长文“大海捞针”,而且在 LongBench 等长文任务中的表现也达到开源模型中的领先水平。 可以通过 [LMDeploy](./chat/lmdeploy_zh_cn.md) 尝试百万字超长上下文推理。
- 工具调用能力整体升级InternLM2.5 支持从上百个网页搜集有效信息进行分析推理,相关实现将于近期开源到 [Lagent](https://github.com/InternLM/lagent/tree/main)。InternLM2.5 具有更强和更具有泛化性的指令理解、工具筛选与结果反思等能力,新版模型可以更可靠地支持复杂智能体的搭建,支持对工具进行有效的多轮调用,完成较复杂的任务。可以查看更多[样例](./agent/)。
## 更新
\[2024.06.30\] 我们发布了 InternLM2.5-7B、InternLM2.5-7B-Chat 和 InternLM2.5-7B-Chat-1M.。可以在下方的 [模型库](#model-zoo) 进行下载,或者在 [model cards](./model_cards/) 中了解更多细节。
\[2024.03.26\] 我们发布了 InternLM2 的技术报告。 可以点击 [arXiv链接](https://arxiv.org/abs/2403.17297) 来了解更多细节。
\[2024.01.31\] 我们发布了 InternLM2-1.8B,以及相关的对话模型。该模型在保持领先性能的情况下,提供了更低廉的部署方案。
@ -61,28 +62,19 @@ InternLM2 系列模型在本仓库正式发布,具有如下特性:
| Model | Transformers(HF) | ModelScope(HF) | OpenXLab(HF) | OpenXLab(Origin) | Release Date |
| --------------------------- | ----------------------------------------- | ---------------------------------------- | -------------------------------------- | ------------------------------------------ | ------------ |
| **InternLM2-1.8B** | [🤗internlm2-1.8b](https://huggingface.co/internlm/internlm2-1_8b) | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-1.8b](https://www.modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-1_8b/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-base-1.8b) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-base-1.8b-original) | 2024-01-31 |
| **InternLM2-Chat-1.8B-SFT** | [🤗internlm2-chat-1.8b-sft](https://huggingface.co/internlm/internlm2-chat-1_8b-sft) | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-chat-1.8b-sft](https://www.modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-chat-1_8b-sft/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-chat-1.8b-sft) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-chat-1.8b-sft-original) | 2024-01-31 |
| **InternLM2-Chat-1.8B** | [🤗internlm2-chat-1.8b](https://huggingface.co/internlm/internlm2-chat-1_8b) | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-chat-1.8b](https://www.modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-chat-1_8b/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-chat-1.8b) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-chat-1.8b-original) | 2024-02-19 |
| **InternLM2-Base-7B** | [🤗internlm2-base-7b](https://huggingface.co/internlm/internlm2-base-7b) | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-base-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-base-7b/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-base-7b) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-base-7b-original) | 2024-01-17 |
| **InternLM2-7B** | [🤗internlm2-7b](https://huggingface.co/internlm/internlm2-7b) | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-7b/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-7b) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-7b-original) | 2024-01-17 |
| **InternLM2-Chat-7B-SFT** | [🤗internlm2-chat-7b-sft](https://huggingface.co/internlm/internlm2-chat-7b-sft) | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-chat-7b-sft](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-chat-7b-sft/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-chat-7b-sft) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-chat-7b-sft-original) | 2024-01-17 |
| **InternLM2-Chat-7B** | [🤗internlm2-chat-7b](https://huggingface.co/internlm/internlm2-chat-7b) | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-chat-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-chat-7b/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-chat-7b) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-chat-7b-original) | 2024-01-17 |
| **InternLM2-Base-20B** | [🤗internlm2-base-20b](https://huggingface.co/internlm/internlm2-base-20b) | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-base-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-base-20b/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-base-20b) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-base-20b-original) | 2024-01-17 |
| **InternLM2-20B** | [🤗internlm2-20b](https://huggingface.co/internlm/internlm2-20b) | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-20b/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-20b) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-20b-original) | 2024-01-17 |
| **InternLM2-Chat-20B-SFT** | [🤗internlm2-chat-20b-sft](https://huggingface.co/internlm/internlm2-chat-20b-sft) | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-chat-20b-sft](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-chat-20b-sft/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-chat-20b-sft) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-chat-20b-sft-original) | 2024-01-17 |
| **InternLM2-Chat-20B** | [🤗internlm2-chat-20b](https://huggingface.co/internlm/internlm2-chat-20b) | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-chat-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-chat-20b/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-chat-20b) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-chat-20b-original) | 2024-01-17 |
| **InternLM2.5-7B** | [🤗internlm2_5-7b](https://huggingface.co/internlm/internlm2_5-7b) | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-7b/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-7b) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-7b-original) | 2024-06-30 |
| **InternLM2.5-7B-Chat** | [🤗internlm2_5-7b-chat](https://huggingface.co/internlm/internlm2_5-7B-chat) | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2_5-7b-chat-sft](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2_5-7B-chat/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2_5-7B-chat) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2_5-7B-chat-original) | 2024-06-30 |
| **InternLM2.5-7B-Chat-1M** | [🤗internlm2_5-7b-chat](https://huggingface.co/internlm/internlm2_5-7b-chat) | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2_5-7b-chat](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2_5-7b-chat/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2_5-7b-chat) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2_5-7b-chat-original) | 2024-06-30 |
**模型说明:**
在此次发布中InternLM2 包含两种模型规格7B 和 20B。7B 为轻量级的研究和应用提供了一个轻便但性能不俗的模型20B 模型的综合性能更为强劲,可以有效支持更加复杂的实用场景。每个规格不同模型关系如下图所示:
目前 InternLM 2.5 系列只发布了 7B 大小的模型,我们接下来将开源 1.8B 和 20B 的版本。7B 为轻量级的研究和应用提供了一个轻便但性能不俗的模型20B 模型的综合性能更为强劲,可以有效支持更加复杂的实用场景。每个规格不同模型关系如下图所示:
![](https://internlm.oss-cn-shanghai.aliyuncs.com/series.png)
1. **InternLM2-Base**:高质量和具有很强可塑性的模型基座,是模型进行深度领域适配的高质量起点。
2. **InternLM2**:进一步在大规模无标签数据上进行预训练,并结合特定领域的增强语料库进行训练,在评测中成绩优异,同时保持了很好的通用语言能力,是我们推荐的在大部分应用中考虑选用的优秀基座。
3. **InternLM2-Chat-SFT**: 基于 InternLM2-Base 模型进行了有监督微调,是 InternLM2-Chat 模型的中间版本。我们将它们开源以助力社区在对齐方面的研究。
4. **InternLM2-Chat**: 在 InternLM2-Chat-SFT 的基础上进行了 online RLHF 以进一步对齐. InternLM2-Chat 面向对话交互进行了优化,具有较好的指令遵循、共情聊天和调用工具等的能力,是我们推荐直接用于下游应用的模型。
1. **InternLM2.5**:经历了大规模预训练的基座模型,是我们推荐的在大部分应用中考虑选用的优秀基座。
2. **InternLM2.5-Chat**: 对话模型,在 InternLM2.5 基座上经历了有监督微调和 online RLHF。InternLM2。5-Chat 面向对话交互进行了优化,具有较好的指令遵循、共情聊天和调用工具等的能力,是我们推荐直接用于下游应用的模型。
3. **InternLM2.5-Chat-1M**: InternLM2.5-Chat-1M 支持一百万字超长上下文,并具有和 InternLM2.5-Chat 相当的综合性能表现。
**局限性:** 尽管在训练过程中我们非常注重模型的安全性,尽力促使模型输出符合伦理和法律要求的文本,但受限于模型大小以及概率生成范式,模型可能会产生各种不符合预期的输出,例如回复内容包含偏见、歧视等有害内容,请勿传播这些内容。由于传播不良信息导致的任何后果,本项目不承担责任。
@ -92,45 +84,8 @@ InternLM2 系列模型在本仓库正式发布,具有如下特性:
### 客观评测
| Dataset | Baichuan2-7B-Chat | Mistral-7B-Instruct-v0.2 | Qwen-7B-Chat | InternLM2-Chat-7B | ChatGLM3-6B | Baichuan2-13B-Chat | Mixtral-8x7B-Instruct-v0.1 | Qwen-14B-Chat | InternLM2-Chat-20B |
| ---------------- | ----------------- | ------------------------ | ------------ | ----------------- | ----------- | ------------------ | -------------------------- | ------------- | ------------------ |
| MMLU | 50.1 | 59.2 | 57.1 | 63.7 | 58.0 | 56.6 | 70.3 | 66.7 | 66.5 |
| CMMLU | 53.4 | 42.0 | 57.9 | 63.0 | 57.8 | 54.8 | 50.6 | 68.1 | 65.1 |
| AGIEval | 35.3 | 34.5 | 39.7 | 47.2 | 44.2 | 40.0 | 41.7 | 46.5 | 50.3 |
| C-Eval | 53.9 | 42.4 | 59.8 | 60.8 | 59.1 | 56.3 | 54.0 | 71.5 | 63.0 |
| TrivialQA | 37.6 | 35.0 | 46.1 | 50.8 | 38.1 | 40.3 | 57.7 | 54.5 | 53.9 |
| NaturalQuestions | 12.8 | 8.1 | 18.6 | 24.1 | 14.0 | 12.7 | 22.5 | 22.9 | 25.9 |
| C3 | 78.5 | 66.9 | 84.4 | 91.5 | 79.3 | 84.4 | 82.1 | 91.5 | 93.5 |
| CMRC | 8.1 | 5.6 | 14.6 | 63.8 | 43.2 | 27.8 | 5.3 | 13.0 | 50.4 |
| WinoGrande | 49.9 | 50.8 | 54.2 | 65.8 | 61.7 | 50.9 | 60.9 | 55.7 | 74.8 |
| BBH | 35.9 | 46.5 | 45.5 | 61.2 | 56.0 | 42.5 | 57.3 | 55.8 | 68.3 |
| GSM-8K | 32.4 | 48.3 | 44.1 | 70.7 | 53.8 | 56.0 | 71.7 | 57.7 | 79.6 |
| Math | 5.7 | 8.6 | 12.0 | 23.0 | 20.4 | 4.3 | 22.5 | 27.6 | 31.9 |
| HumanEval | 17.7 | 35.4 | 36.0 | 59.8 | 52.4 | 19.5 | 37.8 | 40.9 | 67.1 |
| MBPP | 37.7 | 25.7 | 33.9 | 51.4 | 55.6 | 40.9 | 40.9 | 30.0 | 65.8 |
- MBPP性能使用的是MBPP(Sanitized)版本数据集
### 主观评测
- 我们评测了InternLM2-Chat在[AlpacaEval 2.0](https://tatsu-lab.github.io/alpaca_eval/) 上的性能结果表明InternLM2-Chat在AlpacaEval上已经超过了 Claude 2, GPT-4(0613) 和 Gemini Pro.
| Model Name | Win Rate | Length |
| ------------------ | -------- | ------ |
| GPT-4 Turbo | 50.00% | 2049 |
| GPT-4 | 23.58% | 1365 |
| GPT-4 0314 | 22.07% | 1371 |
| Mistral Medium | 21.86% | 1500 |
| XwinLM 70b V0.1 | 21.81% | 1775 |
| InternLM2 Chat 20B | 21.75% | 2373 |
| Mixtral 8x7B v0.1 | 18.26% | 1465 |
| Claude 2 | 17.19% | 1069 |
| Gemini Pro | 16.85% | 1315 |
| GPT-4 0613 | 15.76% | 1140 |
| Claude 2.1 | 15.73% | 1096 |
- 性能数据截止2024-01-17
## 依赖
- Python >= 3.8
@ -154,9 +109,9 @@ transformers >= 4.38
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("internlm/internlm2-chat-7b", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("internlm/internlm2_5-7b-chat", trust_remote_code=True)
# 设置`torch_dtype=torch.float16`来将模型精度指定为torch.float16否则可能会因为您的硬件原因造成显存不足的问题。
model = AutoModelForCausalLM.from_pretrained("internlm/internlm2-chat-7b", device_map="auto",trust_remote_code=True, torch_dtype=torch.float16)
model = AutoModelForCausalLM.from_pretrained("internlm/internlm2_5-7b-chat", device_map="auto",trust_remote_code=True, torch_dtype=torch.float16)
# (可选) 如果在低资源设备上可以通过bitsandbytes加载4-bit或8-bit量化的模型进一步节省GPU显存.
# 4-bit 量化的 InternLM 7B 大约会消耗 8GB 显存.
# pip install -U bitsandbytes
@ -177,7 +132,7 @@ print(response)
```python
import torch
from modelscope import snapshot_download, AutoTokenizer, AutoModelForCausalLM
model_dir = snapshot_download('Shanghai_AI_Laboratory/internlm2-chat-7b')
model_dir = snapshot_download('Shanghai_AI_Laboratory/internlm2_5-7b-chat')
tokenizer = AutoTokenizer.from_pretrained(model_dir, device_map="auto", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_dir, device_map="auto", trust_remote_code=True, torch_dtype=torch.float16)
# (可选) 如果在低资源设备上可以通过bitsandbytes加载4-bit或8-bit量化的模型进一步节省GPU显存.
@ -210,27 +165,31 @@ streamlit run ./chat/web_demo.py
```python
from lmdeploy import pipeline
pipe = pipeline("internlm/internlm2-chat-7b")
pipe = pipeline("internlm/internlm2_5-7b-chat")
response = pipe(["Hi, pls intro yourself", "Shanghai is"])
print(response)
```
请参考[部署指南](./chat/lmdeploy.md)了解更多使用案例,更多部署教程则可在[这里](https://github.com/InternLM/LMDeploy)找到。
### 20万字超长上下文推理
### 1百万字超长上下文推理
激活 LMDeploy 的 Dynamic NTK 能力,可以轻松把 internlm2-chat-7b 外推到 200K 上下文
激活 LMDeploy 的 Dynamic NTK 能力,可以轻松把 internlm2_5-7b-chat 外推到 200K 上下文
```python
from lmdeploy import pipeline, GenerationConfig, TurbomindEngineConfig
backend_config = TurbomindEngineConfig(rope_scaling_factor=2.0, session_len=160000)
pipe = pipeline('internlm/internlm2-chat-7b', backend_config=backend_config)
backend_config = TurbomindEngineConfig(rope_scaling_factor=2.5, session_len=1048576)
pipe = pipeline('internlm/internlm2_5-7b-chat', backend_config=backend_config)
prompt = 'Use a long prompt to replace this sentence'
response = pipe(prompt)
print(response)
```
## 智能体
InternLM-2.5-Chat 模型有出色的工具调用性能并具有一定的零样本泛化能力。它支持从上百个网页中搜集信息并进行分析。更多样例可以参考 [agent 目录](./agent/).
## 微调&训练
请参考[微调教程](./finetune/)尝试续训或微调 InternLM2。
@ -239,7 +198,7 @@ print(response)
## 评测
我们使用 [OpenCompass](https://github.com/open-compass/opencompass) 进行模型评估。在 InternLM-2 中,我们主要标准客观评估、长文评估(大海捞针)、数据污染评估、智能体评估和主观评估。
我们使用 [OpenCompass](https://github.com/open-compass/opencompass) 进行模型评估。在 InternLM2.5 中,我们主要标准客观评估、长文评估(大海捞针)、数据污染评估、智能体评估和主观评估。
### 标准客观评测

View File

@ -49,7 +49,7 @@ from lagent.actions import ActionExecutor, GoogleSearch, PythonInterpreter
from lagent.llms import HFTransformer
# Initialize the HFTransformer-based Language Model (llm) and provide the model name.
llm = HFTransformer('internlm/internlm2-chat-7b')
llm = HFTransformer('internlm/internlm2_5-7b-chat')
# Initialize the Google Search tool and provide your API key.
search_tool = GoogleSearch(api_key='Your SERPER_API_KEY')

View File

@ -73,7 +73,7 @@ def parse_args():
parser.add_argument(
'--model_path',
type=str,
default='internlm/internlm2-chat-7b',
default='internlm/internlm2_5-7b-chat',
help='Path or name to the model, could be HuggingFace model specifier.'
)
parser.add_argument(

View File

@ -12,8 +12,8 @@ To load the InternLM2 7B Chat model using Transformers, use the following code:
```python
>>> from transformers import AutoTokenizer, AutoModelForCausalLM
>>> tokenizer = AutoTokenizer.from_pretrained("internlm/internlm2-chat-7b", trust_remote_code=True)
>>> model = AutoModelForCausalLM.from_pretrained("internlm/internlm2-chat-7b", trust_remote_code=True).cuda()
>>> tokenizer = AutoTokenizer.from_pretrained("internlm/internlm2_5-7b-chat", trust_remote_code=True)
>>> model = AutoModelForCausalLM.from_pretrained("internlm/internlm2_5-7b-chat", trust_remote_code=True).cuda()
>>> model = model.eval()
>>> response, history = model.chat(tokenizer, "hello", history=[])
>>> print(response)
@ -36,7 +36,7 @@ To load the InternLM model using ModelScope, use the following code:
```python
from modelscope import snapshot_download, AutoTokenizer, AutoModelForCausalLM
import torch
model_dir = snapshot_download('Shanghai_AI_Laboratory/internlm2-chat-7b')
model_dir = snapshot_download('Shanghai_AI_Laboratory/internlm2_5-7b-chat')
tokenizer = AutoTokenizer.from_pretrained(model_dir, device_map="auto", trust_remote_code=True,torch_dtype=torch.float16)
model = AutoModelForCausalLM.from_pretrained(model_dir,device_map="auto", trust_remote_code=True,torch_dtype=torch.float16)
model = model.eval()

View File

@ -13,8 +13,8 @@
```python
>>> from transformers import AutoTokenizer, AutoModelForCausalLM
>>> tokenizer = AutoTokenizer.from_pretrained("internlm/internlm2-chat-7b", trust_remote_code=True)
>>> model = AutoModelForCausalLM.from_pretrained("internlm/internlm2-chat-7b", trust_remote_code=True).cuda()
>>> tokenizer = AutoTokenizer.from_pretrained("internlm/internlm2_5-7b-chat", trust_remote_code=True)
>>> model = AutoModelForCausalLM.from_pretrained("internlm/internlm2_5-7b-chat", trust_remote_code=True).cuda()
>>> model = model.eval()
>>> response, history = model.chat(tokenizer, "你好", history=[])
>>> print(response)
@ -30,7 +30,7 @@
```python
from modelscope import snapshot_download, AutoTokenizer, AutoModelForCausalLM
import torch
model_dir = snapshot_download('Shanghai_AI_Laboratory/internlm2-chat-7b', revision='v1.0.0')
model_dir = snapshot_download('Shanghai_AI_Laboratory/internlm2_5-7b-chat', revision='v1.0.0')
tokenizer = AutoTokenizer.from_pretrained(model_dir, device_map="auto", trust_remote_code=True,torch_dtype=torch.float16)
model = AutoModelForCausalLM.from_pretrained(model_dir,device_map="auto", trust_remote_code=True,torch_dtype=torch.float16)
model = model.eval()

View File

@ -17,7 +17,7 @@ Hello<|im_end|>
Hello, I am InternLM2-Chat, how can I assist you?<|im_end|>
```
Here, `<|im_start|>` acts as the start token for each turn of dialogue, and `<|im_end|>` as the end token. Each turn of dialogue typically starts with `<|im_start|>role` and ends with the model's output `<|im_end|>`, where role represents `system`, `user`, `assistant`, and `environment`. You may refer to the [code in huggingface](https://huggingface.co/internlm/internlm2-chat-7b/blob/main/modeling_internlm2.py#L1138) to see how the chat history is organized.
Here, `<|im_start|>` acts as the start token for each turn of dialogue, and `<|im_end|>` as the end token. Each turn of dialogue typically starts with `<|im_start|>role` and ends with the model's output `<|im_end|>`, where role represents `system`, `user`, `assistant`, and `environment`. You may refer to the [code in huggingface](https://huggingface.co/internlm/internlm2_5-7b-chat/blob/main/modeling_internlm2.py#L1138) to see how the chat history is organized.
Currently, the InternLM2-Chat model's vocabulary maintains the following mappings to support full functionalities:

View File

@ -17,7 +17,7 @@ InternLM2-Chat 采用了全新的对话格式,以灵活地支持工具调用
你好,我是书生浦语,请问有什么可以帮助你的吗<|im_end|>
```
其中 `<|im_start|>` 充当了每轮对话开始符,`<|im_end|>` 充当了当前轮对话结束符。每轮对话一般以 `<|im_start|>role` 开头,以模型输出的 `<|im_end|>` 结尾role 代表 `system``user``assistant` 和 `environment` 角色。你可以参考[huggingface 上的代码](https://huggingface.co/internlm/internlm2-chat-7b/blob/main/modeling_internlm2.py#L1138)来了解对话历史的拼接。
其中 `<|im_start|>` 充当了每轮对话开始符,`<|im_end|>` 充当了当前轮对话结束符。每轮对话一般以 `<|im_start|>role` 开头,以模型输出的 `<|im_end|>` 结尾role 代表 `system``user``assistant` 和 `environment` 角色。你可以参考[huggingface 上的代码](https://huggingface.co/internlm/internlm2_5-7b-chat/blob/main/modeling_internlm2.py#L1138)来了解对话历史的拼接。
目前InternLM2-Chat 模型的词表中还维护了如下映射
@ -131,7 +131,7 @@ fig.show()
````
1. 首先在系统提示中提供代码解释器的格式和字段描述。内容以 `<|im_start|>system name=<|interpreter|>\n`开头,`<|im_end|>` 结尾,`name=<|interpreter|>` 体现了这是来自代码解释器的指令。InternLM2-Chat 支持 system 角色对模型的提示和约束多次出现。所以我们会看到前面还有关于对话的要求。
2. 用户可以上传一个文件,并对模型提出要求,文件的上传会以单独的形式向模型发出一条指令,以 `<|im_start|>user name=file` 开头,以 json 形式给出路径和文件大小` [{"path": "data.csv", size='10K'}]`,以 `<|im_end|>`结尾。
2. 用户可以上传一个文件,并对模型提出要求,文件的上传会以单独的形式向模型发出一条指令,以 `<|im_start|>user name=file` 开头,以 json 形式给出路径和文件大小`[{"path": "data.csv", size='10K'}]`,以 `<|im_end|>`结尾。
3. 模型在接受到用户指令后,会以流式的形式调用工具,及自然地生成文字进行思考/回应用户,然后输出`<|action_start|><|interpreter|>`。`<|action_start|>`表示要调用外部插件,同时 `<|interpreter|>` 表示调用的是代码解释器。然后模型输出 markdown 中 python 代码块格式代码内容,再以 `<|action_end|>` 表示工具调用结束。
4. 系统会执行代码块中的代码,然后返回调用结果,以 `<|im_start|>environment name=<|interpreter|>`开头,表示是来自环境关于代码解释器执行的输出,以`<|im_end|>`结尾。

View File

@ -20,7 +20,7 @@ With just 4 lines of codes, you can execute batch inference using a list of prom
```python
from lmdeploy import pipeline
pipe = pipeline("internlm/internlm2-chat-7b")
pipe = pipeline("internlm/internlm2_5-7b-chat")
response = pipe(["Hi, pls intro yourself", "Shanghai is"])
print(response)
```
@ -31,7 +31,7 @@ With dynamic ntk, LMDeploy can handle a context length of 200K for `InternLM2`:
from lmdeploy import pipeline, TurbomindEngineConfig
engine_config = TurbomindEngineConfig(session_len=200000,
rope_scaling_factor=2.0)
pipe = pipeline("internlm/internlm2-chat-7b", backend_engine=engine_config)
pipe = pipeline("internlm/internlm2_5-7b-chat", backend_engine=engine_config)
gen_config = GenerationConfig(top_p=0.8,
top_k=40,
temperature=0.8,
@ -47,7 +47,7 @@ For more information about LMDeploy pipeline usage, please refer to [here](https
LMDeploy's `api_server` enables models to be easily packed into services with a single command. The provided RESTful APIs are compatible with OpenAI's interfaces. Below are an example of service startup:
```shell
lmdeploy serve api_server internlm/internlm2-chat-7b
lmdeploy serve api_server internlm/internlm2_5-7b-chat
```
The default port of `api_server` is `23333`. After the server is launched, you can communicate with server on terminal through `api_client`:

View File

@ -20,7 +20,7 @@ pip install lmdeploy>=0.2.1
```python
from lmdeploy import pipeline
pipe = pipeline("internlm/internlm2-chat-7b")
pipe = pipeline("internlm/internlm2_5-7b-chat")
response = pipe(["Hi, pls intro yourself", "Shanghai is"])
print(response)
```
@ -31,7 +31,7 @@ LMDeploy 实现了 dynamic ntk支持长文本外推。使用如下代码
from lmdeploy import pipeline, TurbomindEngineConfig
engine_config = TurbomindEngineConfig(session_len=200000,
rope_scaling_factor=2.0)
pipe = pipeline("internlm/internlm2-chat-7b", backend_engine=engine_config)
pipe = pipeline("internlm/internlm2_5-7b-chat", backend_engine=engine_config)
gen_config = GenerationConfig(top_p=0.8,
top_k=40,
temperature=0.8,
@ -47,7 +47,7 @@ print(response)
LMDeploy `api_server` 支持把模型一键封装为服务,对外提供的 RESTful API 兼容 openai 的接口。以下为服务启动的示例:
```shell
lmdeploy serve api_server internlm/internlm2-chat-7b
lmdeploy serve api_server internlm/internlm2_5-7b-chat
```
服务默认端口是23333。在 server 启动后,你可以在终端通过`api_client`与server进行对话体验对话效果

View File

@ -6,7 +6,7 @@ English | [简体中文](openaoe_zh_cn.md)
[OpenAOE](https://github.com/InternLM/OpenAOE) is a LLM-Group-Chat Framework, which can chat with multiple LLMs (commercial/open source LLMs) at the same time. OpenAOE provides both backend API and WEB-UI to meet different usage needs.
Currently already supported LLMs: [InternLM2-Chat-7B](https://huggingface.co/internlm/internlm2-chat-7b), [IntenLM-Chat-7B](https://huggingface.co/internlm/internlm-chat-7b), GPT-3.5, GPT-4, Google PaLM, MiniMax, Claude, Spark, etc.
Currently already supported LLMs: [internlm2_5-7b-chat](https://huggingface.co/internlm/internlm2_5-7b-chat), [IntenLM-Chat-7B](https://huggingface.co/internlm/internlm-chat-7b), GPT-3.5, GPT-4, Google PaLM, MiniMax, Claude, Spark, etc.
## Quick Run

View File

@ -6,7 +6,7 @@
[OpenAOE](https://github.com/InternLM/OpenAOE) 是一个 LLM-Group-Chat 框架,可以同时与多个商业大模型或开源大模型进行聊天。 OpenAOE还提供后端API和WEB-UI以满足不同的使用需求。
目前已经支持的大模型有: [InternLM2-Chat-7B](https://huggingface.co/internlm/internlm2-chat-7b), [IntenLM-Chat-7B](https://huggingface.co/internlm/internlm-chat-7b), GPT-3.5, GPT-4, Google PaLM, MiniMax, Claude, 讯飞星火等。
目前已经支持的大模型有: [internlm2_5-7b-chat](https://huggingface.co/internlm/internlm2_5-7b-chat), [IntenLM-Chat-7B](https://huggingface.co/internlm/internlm-chat-7b), GPT-3.5, GPT-4, Google PaLM, MiniMax, Claude, 讯飞星火等。
## 快速安装

View File

@ -180,10 +180,10 @@ def on_btn_click():
@st.cache_resource
def load_model():
model = (AutoModelForCausalLM.from_pretrained('internlm/internlm2-chat-7b',
model = (AutoModelForCausalLM.from_pretrained('internlm/internlm2_5-7b-chat',
trust_remote_code=True).to(
torch.bfloat16).cuda())
tokenizer = AutoTokenizer.from_pretrained('internlm/internlm2-chat-7b',
tokenizer = AutoTokenizer.from_pretrained('internlm/internlm2_5-7b-chat',
trust_remote_code=True)
return model, tokenizer
@ -239,7 +239,7 @@ def main():
user_avator = 'assets/user.png'
robot_avator = 'assets/robot.png'
st.title('InternLM2-Chat-7B')
st.title('internlm2_5-7b-chat')
generation_config = prepare_generation_config()

View File

@ -55,7 +55,7 @@ XTuner supports the efficient fine-tune (*e.g.*, QLoRA) for InternLM2.
xtuner train ${CONFIG_NAME_OR_PATH}
```
For example, we can start the QLoRA fine-tuning of InternLM2-Chat-7B with oasst1 dataset by
For example, we can start the QLoRA fine-tuning of internlm2_5-7b-chat with oasst1 dataset by
```shell
# On a single GPU
@ -83,16 +83,16 @@ xtuner chat ${NAME_OR_PATH_TO_LLM} [optional arguments]
For example, we can start the chat with
InternLM2-Chat-7B with adapter trained from oasst1:
internlm2_5-7b-chat with adapter trained from oasst1:
```shell
xtuner chat internlm/internlm2-chat-7b --adapter xtuner/internlm2-chat-7b-qlora-oasst1 --prompt-template internlm2_chat
xtuner chat internlm/internlm2_5-7b-chat --adapter xtuner/internlm2_5-7b-chat-qlora-oasst1 --prompt-template internlm2_chat
```
LLaVA-InternLM2-7B:
```shell
xtuner chat internlm/internlm2-chat-7b --visual-encoder openai/clip-vit-large-patch14-336 --llava xtuner/llava-internlm2-7b --prompt-template internlm2_chat --image $IMAGE_PATH
xtuner chat internlm/internlm2_5-7b-chat --visual-encoder openai/clip-vit-large-patch14-336 --llava xtuner/llava-internlm2-7b --prompt-template internlm2_chat --image $IMAGE_PATH
```
## InternEvo

View File

@ -53,7 +53,7 @@
xtuner train ${CONFIG_NAME_OR_PATH}
```
例如,我们可以利用 QLoRA 算法在 oasst1 数据集上微调 InternLM2-Chat-7B
例如,我们可以利用 QLoRA 算法在 oasst1 数据集上微调 internlm2_5-7b-chat
```shell
# 单卡
@ -81,16 +81,16 @@ xtuner chat ${NAME_OR_PATH_TO_LLM} [optional arguments]
例如:
InternLM2-Chat-7B, oasst1 adapter 对话:
internlm2_5-7b-chat, oasst1 adapter 对话:
```shell
xtuner chat internlm/internlm2-chat-7b --adapter xtuner/internlm2-chat-7b-qlora-oasst1 --prompt-template internlm2_chat
xtuner chat internlm/internlm2_5-7b-chat --adapter xtuner/internlm2_5-7b-chat-qlora-oasst1 --prompt-template internlm2_chat
```
与 LLaVA-InternLM2-7B 对话:
```shell
xtuner chat internlm/internlm2-chat-7b --visual-encoder openai/clip-vit-large-patch14-336 --llava xtuner/llava-internlm2-7b --prompt-template internlm2_chat --image $IMAGE_PATH
xtuner chat internlm/internlm2_5-7b-chat --visual-encoder openai/clip-vit-large-patch14-336 --llava xtuner/llava-internlm2-7b --prompt-template internlm2_chat --image $IMAGE_PATH
```
## InternEvo

View File

@ -21,6 +21,7 @@ class TestChat:
@pytest.mark.parametrize(
'model_name',
[
'internlm/internlm2_5-7b-chat',
'internlm/internlm2-chat-7b', 'internlm/internlm2-chat-7b-sft',
'internlm/internlm2-chat-20b', 'internlm/internlm2-chat-20b-sft',
'internlm/internlm2-chat-1_8b', 'internlm/internlm2-chat-1_8b-sft'