doc(readme): update readme, add 20B releasing info (#328)

* fix(eval): StreamingDataset does not have an __len__ method.

* doc(readme): update readme

* update readme
pull/329/head
Shuo Zhang 2023-09-20 16:04:43 +08:00 committed by GitHub
parent bfefc4ea3c
commit 2a09ebd5c1
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
3 changed files with 246 additions and 105 deletions


@ -33,26 +33,104 @@
</div>
<p align="center">
👋 join us on <a href="https://twitter.com/intern_lm" target="_blank">Twitter</a>, <a href="https://discord.gg/xa29JuW87d" target="_blank">Discord</a> and <a href="https://r.vansin.top/?r=internwx" target="_blank">WeChat</a>
👋 join us on <a href="https://discord.gg/xa29JuW87d" target="_blank">Discord</a> and <a href="https://github.com/InternLM/InternLM/assets/25839884/a6aad896-7232-4220-ac84-9e070c2633ce" target="_blank">WeChat</a>
</p>
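## Introduction
InternLM (书生·浦语) is a large language model that includes a 7-billion-parameter base model and a chat model tailored for practical scenarios (InternLM-7B). The model has the following characteristics:
InternLM is an open-source, lightweight training framework designed to support large-model training without extensive dependencies. With a single codebase, it supports pre-training on large clusters with thousands of GPUs and fine-tuning on a single GPU, while achieving remarkable performance optimizations. When training on 1024 GPUs, InternLM reaches nearly 90% acceleration efficiency.
- Trained on over a trillion high-quality tokens to build a powerful knowledge base;
- Supports an 8k context window length, enabling longer inputs and stronger reasoning;
- Provides general tool-calling capability, letting users flexibly build their own workflows;
Based on the InternLM training framework, we have pre-trained two open-source models: InternLM-7B and InternLM-20B.
A lightweight training framework is provided to support model pre-training without installing extensive dependencies. A single codebase supports pre-training on thousand-GPU clusters and human-preference alignment training on a single GPU, with performance optimizations that reach nearly 90% acceleration efficiency in thousand-GPU training.
## Updates
## News
[20230920] InternLM-20B is released with base and chat versions.
[20230822] InternLM-7B-Chat v1.1 is released with code interpreter and function calling capability. You can try it with [Lagent](https://github.com/InternLM/lagent).
We have open-sourced InternLM-Chat-7B v1.1. The model can call a code interpreter and tool plugins. You can try these new features in [Lagent](https://github.com/InternLM/lagent).
## InternLM-7B
## Model Zoo
### Performance Evaluation
Our models are released on three platforms: Transformers, ModelScope and OpenXLab.
| Model | Transformers | ModelScope | OpenXLab | | Release Date |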
|---------------------------|------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------|---|--------------|
| **InternLM Chat 20B** | [🤗internlm/internlm-chat-20b](https://huggingface.co/internlm/internlm-20b-chat) | [<img src="./doc/imgs/modelscope_logo.png" width="20px" /> Shanghai_AI_Laboratory/internlm-chat-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-20b-chat/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM-chat-20b) | | 2023-09-20 |
| **InternLM 20B** | [🤗internlm/internlm-20b](https://huggingface.co/internlm/internlm-20b) | [<img src="./doc/imgs/modelscope_logo.png" width="20px" /> Shanghai_AI_Laboratory/internlm-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-20b/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM-20b) | | 2023-09-20 |
| **InternLM Chat 7B v1.1** | [🤗internlm/internlm-chat-7b-v1.1](https://huggingface.co/internlm/internlm-chat-7b-v1.1) | [<img src="./doc/imgs/modelscope_logo.png" width="20px" /> Shanghai_AI_Laboratory/internlm-chat-7b-v1_1](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-chat-7b-v1_1/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM-chat-7b-v1.1) | | 2023-08-22 |
| **InternLM 7B** | [🤗internlm/internlm-7b](https://huggingface.co/internlm/internlm-7b) | [<img src="./doc/imgs/modelscope_logo.png" width="20px" /> Shanghai_AI_Laboratory/internlm-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-7b/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM-7b) | | 2023-07-06 |
| **InternLM Chat 7B** | [🤗internlm/internlm-chat-7b](https://huggingface.co/internlm/internlm-chat-7b) | [<img src="./doc/imgs/modelscope_logo.png" width="20px" /> Shanghai_AI_Laboratory/internlm-chat-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-chat-7b/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM-chat-7b) | | 2023-07-06 |
| **InternLM Chat 7B 8k** | [🤗internlm/internlm-chat-7b-8k](https://huggingface.co/internlm/internlm-chat-7b-8k) | [<img src="./doc/imgs/modelscope_logo.png" width="20px" /> Shanghai_AI_Laboratory/internlm-chat-7b-8k](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-chat-7b-8k/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM-chat-7b-8k) | | 2023-07-06 |
<details>
<summary> InternLM-20B </summary>
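### Introduction
InternLM-20B was pre-trained on over **2.3T** tokens of high-quality English, Chinese, and code data. The Chat version additionally underwent SFT and RLHF training, enabling it to meet users' needs better and more safely.
In terms of model structure, InternLM-20B adopts a deep architecture with 60 layers, more than the 32 or 40 layers used by conventional 7B and 13B models. When the parameter budget is limited, increasing the number of layers helps improve the model's overall capability. In addition, compared with InternLM-7B, the pre-training data for InternLM-20B went through higher-quality cleaning and was supplemented with knowledge-dense data and data aimed at strengthening understanding and reasoning. As a result, it shows significant improvements in understanding, reasoning, mathematics, and programming, all of which test the technical level of a language model. Overall, InternLM-20B has the following characteristics:
- Outstanding overall performance
- Strong tool-calling capability
- Supports a 16k context length (through inference-time extrapolation)
- Better value alignment
### Performance Comparison
On the 5 capability dimensions proposed by OpenCompass, InternLM-20B achieves strong results (bold marks the best score within the 13B-33B parameter range).
| Capability | Llama-13B | Llama2-13B | Baichuan2-13B | InternLM-20B | Llama-33B | Llama-65B | Llama2-70B |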
|----------|-----------|------------|---------------|--------------|-----------|-----------|------------|
| Language | 42.5 | 47 | 47.5 | **55** | 44.6 | 47.1 | 51.6 |
| Knowledge | 58.2 | 58.3 | 48.9 | 60.1 | **64** | 66 | 67.7 |
| Understanding | 45.5 | 50.9 | 58.1 | **67.3** | 50.6 | 54.2 | 60.8 |
| Reasoning | 42.7 | 43.6 | 44.2 | **54.9** | 46.4 | 49.8 | 55 |
| Examination | 37.3 | 45.2 | 51.8 | **62.5** | 47.4 | 49.7 | 57.3 |
| Overall | 43.8 | 47.3 | 49.4 | **59.2** | 48.9 | 51.9 | 57.4 |
The table below compares mainstream open-source models on a set of influential, representative datasets.
| | Benchmarks | Llama-13B | Llama2-13B | Baichuan2-13B | InternLM-20B | Llama-33B | Llama-65B | Llama2-70B |
|------|------------------|-----------|------------|---------------|--------------|-----------|-----------|------------|
| Examination | MMLU | 47.73 | 54.99 | 59.55 | **62.05** | 58.73 | 63.71 | 69.75 |
| | C-Eval (val) | 31.83 | 41.4 | **59.01** | 58.8 | 37.47 | 40.36 | 50.13 |
| | AGI-Eval | 22.03 | 30.93 | 37.37 | **44.58** | 33.53 | 33.92 | 40.02 |
| Knowledge | BoolQ | 78.75 | 82.42 | 67 | **87.46** | 84.43 | 86.61 | 87.74 |
| | TriviaQA | 52.47 | 59.36 | 46.61 | 57.26 | **66.24** | 69.79 | 70.71 |
| | NaturalQuestions | 20.17 | 24.85 | 16.32 | 25.15 | **30.89** | 33.41 | 34.16 |
| Understanding | CMRC | 9.26 | 31.59 | 29.85 | **68.78** | 14.17 | 34.73 | 43.74 |
| | CSL | 55 | 58.75 | 63.12 | **65.62** | 57.5 | 59.38 | 60 |
| | RACE (middle) | 53.41 | 63.02 | 68.94 | **86.35** | 64.55 | 72.35 | 81.55 |
| | RACE (high) | 47.63 | 58.86 | 67.18 | **83.28** | 62.61 | 68.01 | 79.93 |
| | XSum | 20.37 | 23.37 | 25.23 | **35.54** | 20.55 | 19.91 | 25.38 |
| Reasoning | WinoGrande | 64.64 | 64.01 | 67.32 | **69.38** | 66.85 | 69.38 | 69.77 |
| | BBH | 37.93 | 45.62 | 48.98 | **52.51** | 49.98 | 58.38 | 64.91 |
| | GSM8K | 20.32 | 29.57 | **52.62** | **52.62** | 42.3 | 54.44 | 63.31 |
| | PIQA | 79.71 | 79.76 | 78.07 | 80.25 | **81.34** | 82.15 | 82.54 |
| Programming | HumanEval | 14.02 | 18.9 | 17.07 | **25.61** | 17.68 | 18.9 | 26.22 |
| | MBPP | 20.6 | 26.8 | 30.8 | **35.6** | 28.4 | 33.6 | 39.6 |
Overall, InternLM-20B comprehensively outperforms open-source models in the 13B parameter range in overall capability, and on reasoning benchmarks it approaches or even surpasses Llama-65B.
- The evaluation results were obtained from [OpenCompass 20230920](https://github.com/internLM/OpenCompass/).
- Evaluation numbers may differ across [OpenCompass](https://github.com/internLM/OpenCompass/) versions, so please refer to the latest [OpenCompass](https://github.com/internLM/OpenCompass/) results.
</details>
<details>
<summary> InternLM-7B </summary>
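#### Model Updates
[20230822] By using richer SFT-type data, the InternLM-7B-Chat v1.1 model supports code interpretation and function calling. The model structure and code are unchanged, so the more capable InternLM-7B-Chat v1.1 can be used in exactly the same way as InternLM-7B-Chat.
#### Introduction
InternLM-7B contains a 7-billion-parameter base model and a chat model tailored for practical scenarios. The model has the following characteristics:
- It leverages trillions of high-quality tokens for training to establish a powerful knowledge base.
- It supports an 8k context window length, enabling longer input sequences and stronger reasoning capabilities.
- It provides a versatile toolset for users to flexibly build their own workflows.
#### Performance Comparison
We used the open-source evaluation tool [OpenCompass](https://github.com/internLM/OpenCompass/) to evaluate InternLM across five capability dimensions: examination (subject knowledge), language, knowledge, reasoning, and understanding. Some of the results are shown in the table below; visit the [OpenCompass leaderboard](https://opencompass.org.cn/rank) for more.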
@ -72,27 +150,22 @@ InternLM 即书生·浦语大模型包含面向实用场景的70亿参数
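- The results above were obtained with [OpenCompass 20230706](https://github.com/internLM/OpenCompass/) (data marked with `*` come from the original papers); test details are available in the configuration files provided by [OpenCompass](https://github.com/internLM/OpenCompass/).
- Evaluation numbers may differ across [OpenCompass](https://github.com/internLM/OpenCompass/) versions, so please refer to the latest [OpenCompass](https://github.com/internLM/OpenCompass/) results.
### Model Zoo
InternLM 7B and InternLM 7B Chat, trained with InternLM, have been open-sourced. We provide model weights in two formats. In addition to loading the models in Transformers format, you can load the following weights directly with InternLM for continued pre-training or human-preference alignment training.
| Model | InternLM Format Weight Download Link | Transformers Format Weight Download Link |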
| -------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------ |
| **InternLM 7B** | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM-7b) | [🤗internlm/intern-7b](https://huggingface.co/internlm/internlm-7b) |
| **InternLM Chat 7B v1.1** | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM-chat-7b-v1.1) | [🤗internlm/intern-chat-7b-v1.1](https://huggingface.co/internlm/internlm-chat-7b-v1.1) |
| **InternLM Chat 7B** | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM-chat-7b) | [🤗internlm/intern-chat-7b](https://huggingface.co/internlm/internlm-chat-7b)
| **InternLM Chat 7B 8k** | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM-chat-7b-8k) | [🤗internlm/intern-chat-7b-8k](https://huggingface.co/internlm/internlm-chat-7b-8k)
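**Limitations:** Although we paid close attention to safety during training and tried to steer the model toward text that complies with ethical and legal requirements, the model may still produce unexpected outputs because of its size and its probabilistic generation paradigm. For example, responses may contain bias, discrimination, or other harmful content. Please do not propagate such content. This project is not responsible for any consequences resulting from the dissemination of harmful information.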
</details>
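## Usage Examples
### Import from Transformers
Use the following code to load the InternLM 7B Chat model:
Use the following code to load an InternLM model from Transformers (change the model name to load a different model):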
```python
>>> from transformers import AutoTokenizer, AutoModelForCausalLM
>>> tokenizer = AutoTokenizer.from_pretrained("internlm/internlm-chat-7b-v1_1", trust_remote_code=True)
>>> model = AutoModelForCausalLM.from_pretrained("internlm/internlm-chat-7b-v1_1", trust_remote_code=True).cuda()
>>> tokenizer = AutoTokenizer.from_pretrained("internlm/internlm-chat-7b", trust_remote_code=True)
>>> model = AutoModelForCausalLM.from_pretrained("internlm/internlm-chat-7b", trust_remote_code=True).cuda()
>>> model = model.eval()
>>> response, history = model.chat(tokenizer, "你好", history=[])
>>> print(response)
@ -105,6 +178,24 @@ InternLM 即书生·浦语大模型包含面向实用场景的70亿参数
3. 集中注意力:避免分心,集中注意力完成任务。关闭社交媒体和电子邮件通知,专注于任务,这将帮助您更快地完成任务,并减少错误的可能性。
```
### Import from ModelScope
Use the following code to load an InternLM model from ModelScope (change the model name to load a different model):
```python
from modelscope import snapshot_download, AutoTokenizer, AutoModelForCausalLM
import torch
model_dir = snapshot_download('Shanghai_AI_Laboratory/internlm-chat-7b-v1_1', revision='v1.0.0')
tokenizer = AutoTokenizer.from_pretrained(model_dir, device_map="auto", trust_remote_code=True,torch_dtype=torch.float16)
model = AutoModelForCausalLM.from_pretrained(model_dir,device_map="auto", trust_remote_code=True,torch_dtype=torch.float16)
model = model.eval()
response, history = model.chat(tokenizer, "hello", history=[])
print(response)
response, history = model.chat(tokenizer, "please provide three suggestions about time management", history=history)
print(response)
```
### Dialogue via a Web Frontend
You can interact with the InternLM Chat 7B model through a frontend interface by running the following code:
@ -123,44 +214,25 @@ streamlit run web_demo.py
We use [LMDeploy](https://github.com/InternLM/LMDeploy) to complete the one-click deployment of InternLM.
```bash
python3 -m pip install lmdeploy
```
1. First, install LMDeploy:
Run the following commands to chat with the `internlm-chat-7b` model interactively in the terminal, or to chat with it through the WebUI.
```
python3 -m pip install lmdeploy
```
```bash
# convert the weight layout
python3 -m lmdeploy.serve.turbomind.deploy internlm-chat-7b
2. Use the following command for quick deployment:
# chat interactively in the terminal
python3 -m lmdeploy.turbomind.chat ./workspace
```
python3 -m lmdeploy.serve.turbomind.deploy InternLM-7B /path/to/internlm-7b/model hf
```
# start the gradio service
python3 -m lmdeploy.serve.gradio.app ./workspace
```
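In the process above, LMDeploy uses FP16 precision.
3. After exporting the model, you can start a server and chat with the deployed model using the following command:
Besides FP16, LMDeploy also supports inference with a 4-bit weight version of `internlm-chat-7b`. It reduces the model's GPU memory usage to 6G, only about 40% of FP16; more importantly, with heavily optimized kernels, its inference performance on A100-80G reaches more than 2.4 times that of FP16.
The following shows how to deploy the 4-bit weight `internlm-chat-7b` model. For the inference speed benchmark, see [here](https://github.com/InternLM/lmdeploy/blob/main/docs/zh_cn/w4a16.md#%E6%8E%A8%E7%90%86%E9%80%9F%E5%BA%A6).
```bash
# download the prequantized internlm-chat-7b model from huggingface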
git-lfs install
git clone https://huggingface.co/lmdeploy/llama2-chat-7b-w4
# Convert the model's layout and store it in the default path, ./workspace.
python3 -m lmdeploy.serve.turbomind.deploy internlm-chat-7b ./llama2-chat-7b-w4 awq --group-size 128
# run inference with lmdeploy's turbomind engine
python3 -m lmdeploy.turbomind.chat ./workspace
# serving with gradio
python3 -m lmdeploy.serve.gradio.app ./workspace
```
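LMDeploy is a toolkit covering the full workflow of compressing, deploying, and serving LLMs. Please refer to the [deployment tutorial](https://github.com/InternLM/LMDeploy) for more details on deploying InternLM.
```
python3 -m lmdeploy.serve.client {server_ip_address}:33337
```
[LMDeploy](https://github.com/InternLM/LMDeploy) supports the complete workflow of InternLM deployment. Please refer to the [deployment tutorial](https://github.com/InternLM/LMDeploy) for more details on deploying InternLM.
## Fine-tuning & Training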

171
README.md

@ -33,26 +33,103 @@
</div>
<p align="center">
👋 join us on <a href="https://twitter.com/intern_lm" target="_blank">Twitter</a>, <a href="https://discord.gg/xa29JuW87d" target="_blank">Discord</a> and <a href="https://r.vansin.top/?r=internwx" target="_blank">WeChat</a>
👋 join us on <a href="https://discord.gg/xa29JuW87d" target="_blank">Discord</a> and <a href="https://github.com/InternLM/InternLM/assets/25839884/a6aad896-7232-4220-ac84-9e070c2633ce" target="_blank">WeChat</a>
</p>
## Introduction
InternLM is an open-source, lightweight training framework that aims to support model pre-training without the need for extensive dependencies. With a single codebase, it supports pre-training on large-scale clusters with thousands of GPUs and fine-tuning on a single GPU, while achieving remarkable performance optimizations. InternLM achieves nearly 90% acceleration efficiency when training on 1024 GPUs.
InternLM has open-sourced a 7 billion parameter base model and a chat model tailored for practical scenarios. The model has the following characteristics:
Based on the InternLM training framework, we have pre-trained two open-source models: InternLM-7B and InternLM-20B.
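As a rough illustration of what a figure like "nearly 90% acceleration efficiency on 1024 GPUs" could mean, the sketch below assumes the term refers to measured throughput divided by the ideal linear-scaling throughput extrapolated from a smaller baseline run. The function name, the 8-GPU baseline, and all throughput numbers are hypothetical and are not InternLM measurements.

```python
def scaling_efficiency(throughput_n, throughput_base, n_gpus, base_gpus=1):
    """Ratio of measured throughput to ideal linear scaling from a baseline run."""
    if n_gpus <= 0 or base_gpus <= 0 or throughput_base <= 0:
        raise ValueError("GPU counts and baseline throughput must be positive")
    # Ideal case: throughput grows in proportion to the number of GPUs.
    ideal = throughput_base * (n_gpus / base_gpus)
    return throughput_n / ideal


# Hypothetical numbers for illustration only (tokens per second).
base_tps = 32_000        # assumed 8-GPU baseline
large_tps = 3_686_400    # assumed 1024-GPU measurement
eff = scaling_efficiency(large_tps, base_tps, n_gpus=1024, base_gpus=8)
print(f"scaling efficiency: {eff:.0%}")  # prints: scaling efficiency: 90%
```

Under this reading, a value of 1.0 would mean perfectly linear scaling; communication and synchronization overhead typically push the ratio below that.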
## News
[20230920] InternLM-20B is released with base and chat versions.
[20230822] InternLM-7B-Chat v1.1 is released with code interpreter and function calling capability. You can try it with [Lagent](https://github.com/InternLM/lagent).
## Model Zoo
Our models are released on three platforms: Transformers, ModelScope and OpenXLab.
| Model | Transformers | ModelScope | OpenXLab | | Release Date |
|---------------------------|------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------|---|--------------|
| **InternLM Chat 20B** | [🤗internlm/internlm-chat-20b](https://huggingface.co/internlm/internlm-20b-chat) | [<img src="./doc/imgs/modelscope_logo.png" width="20px" /> Shanghai_AI_Laboratory/internlm-chat-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-20b-chat/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM-chat-20b) | | 2023-09-20 |
| **InternLM 20B** | [🤗internlm/internlm-20b](https://huggingface.co/internlm/internlm-20b) | [<img src="./doc/imgs/modelscope_logo.png" width="20px" /> Shanghai_AI_Laboratory/internlm-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-20b/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM-20b) | | 2023-09-20 |
| **InternLM Chat 7B v1.1** | [🤗internlm/internlm-chat-7b-v1.1](https://huggingface.co/internlm/internlm-chat-7b-v1.1) | [<img src="./doc/imgs/modelscope_logo.png" width="20px" /> Shanghai_AI_Laboratory/internlm-chat-7b-v1_1](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-chat-7b-v1_1/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM-chat-7b-v1.1) | | 2023-08-22 |
| **InternLM 7B** | [🤗internlm/internlm-7b](https://huggingface.co/internlm/internlm-7b) | [<img src="./doc/imgs/modelscope_logo.png" width="20px" /> Shanghai_AI_Laboratory/internlm-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-7b/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM-7b) | | 2023-07-06 |
| **InternLM Chat 7B** | [🤗internlm/internlm-chat-7b](https://huggingface.co/internlm/internlm-chat-7b) | [<img src="./doc/imgs/modelscope_logo.png" width="20px" /> Shanghai_AI_Laboratory/internlm-chat-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-chat-7b/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM-chat-7b) | | 2023-07-06 |
| **InternLM Chat 7B 8k** | [🤗internlm/internlm-chat-7b-8k](https://huggingface.co/internlm/internlm-chat-7b-8k) | [<img src="./doc/imgs/modelscope_logo.png" width="20px" /> Shanghai_AI_Laboratory/internlm-chat-7b-8k](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-chat-7b-8k/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM-chat-7b-8k) | | 2023-07-06 |
<details>
<summary> InternLM-20B </summary>
### Introduction
InternLM-20B was pre-trained on over **2.3T** tokens of high-quality English, Chinese, and code data. Additionally, the Chat version has undergone SFT and RLHF training, enabling it to meet users' needs better and more safely.
In terms of model structure, InternLM-20B opted for a deeper architecture, with a depth of 60 layers. This exceeds the 32 or 40 layers used by conventional 7B and 13B models. When parameters are limited, increasing the number of layers can enhance the model's overall capability. Furthermore, compared to InternLM-7B, the pre-training data used for InternLM-20B underwent higher-quality cleaning and was supplemented with knowledge-dense data and data designed to reinforce understanding and reasoning capabilities. As a result, it exhibits significant improvements in understanding, reasoning, mathematical, and programming abilities, all of which test the technical proficiency of language models. Overall, InternLM-20B features the following characteristics:
- Outstanding overall performance
- Strong tool invocation capability
- Supports a 16k context length (through inference extrapolation)
- Better value alignment
### Performance Evaluation
On the 5 capability dimensions proposed by OpenCompass, InternLM-20B has achieved excellent results (the bolded scores represent the best performances within the 13B-33B parameter range).
| Capability | Llama-13B | Llama2-13B | Baichuan2-13B | InternLM-20B | Llama-33B | Llama-65B | Llama2-70B |
|----------|-----------|------------|---------------|--------------|-----------|-----------|------------|
| Language | 42.5 | 47 | 47.5 | **55** | 44.6 | 47.1 | 51.6 |
| Knowledge | 58.2 | 58.3 | 48.9 | 60.1 | **64** | 66 | 67.7 |
| Understanding | 45.5 | 50.9 | 58.1 | **67.3** | 50.6 | 54.2 | 60.8 |
| Reasoning | 42.7 | 43.6 | 44.2 | **54.9** | 46.4 | 49.8 | 55 |
| Examination | 37.3 | 45.2 | 51.8 | **62.5** | 47.4 | 49.7 | 57.3 |
| Overall | 43.8 | 47.3 | 49.4 | **59.2** | 48.9 | 51.9 | 57.4 |
The table below compares the performance of mainstream open-source models on some influential and typical datasets.
| | Benchmarks | Llama-13B | Llama2-13B | Baichuan2-13B | InternLM-20B | Llama-33B | Llama-65B | Llama2-70B |
|------|------------------|-----------|------------|---------------|--------------|-----------|-----------|------------|
| Examination | MMLU | 47.73 | 54.99 | 59.55 | **62.05** | 58.73 | 63.71 | 69.75 |
| | C-Eval (val) | 31.83 | 41.4 | **59.01** | 58.8 | 37.47 | 40.36 | 50.13 |
| | AGI-Eval | 22.03 | 30.93 | 37.37 | **44.58** | 33.53 | 33.92 | 40.02 |
| Knowledge | BoolQ | 78.75 | 82.42 | 67 | **87.46** | 84.43 | 86.61 | 87.74 |
| | TriviaQA | 52.47 | 59.36 | 46.61 | 57.26 | **66.24** | 69.79 | 70.71 |
| | NaturalQuestions | 20.17 | 24.85 | 16.32 | 25.15 | **30.89** | 33.41 | 34.16 |
| Understanding | CMRC | 9.26 | 31.59 | 29.85 | **68.78** | 14.17 | 34.73 | 43.74 |
| | CSL | 55 | 58.75 | 63.12 | **65.62** | 57.5 | 59.38 | 60 |
| | RACE (middle) | 53.41 | 63.02 | 68.94 | **86.35** | 64.55 | 72.35 | 81.55 |
| | RACE (high) | 47.63 | 58.86 | 67.18 | **83.28** | 62.61 | 68.01 | 79.93 |
| | XSum | 20.37 | 23.37 | 25.23 | **35.54** | 20.55 | 19.91 | 25.38 |
| Reasoning | WinoGrande | 64.64 | 64.01 | 67.32 | **69.38** | 66.85 | 69.38 | 69.77 |
| | BBH | 37.93 | 45.62 | 48.98 | **52.51** | 49.98 | 58.38 | 64.91 |
| | GSM8K | 20.32 | 29.57 | **52.62** | **52.62** | 42.3 | 54.44 | 63.31 |
| | PIQA | 79.71 | 79.76 | 78.07 | 80.25 | **81.34** | 82.15 | 82.54 |
| Programming | HumanEval | 14.02 | 18.9 | 17.07 | **25.61** | 17.68 | 18.9 | 26.22 |
| | MBPP | 20.6 | 26.8 | 30.8 | **35.6** | 28.4 | 33.6 | 39.6 |
Overall, InternLM-20B comprehensively outperforms open-source models in the 13B parameter range in overall capability, and on reasoning benchmarks it approaches or even surpasses the performance of Llama-65B.
- The evaluation results were obtained from [OpenCompass 20230920](https://github.com/internLM/OpenCompass/).
- The evaluation data may have numerical differences due to the version iteration of [OpenCompass](https://github.com/internLM/OpenCompass/), so please refer to the latest evaluation results of [OpenCompass](https://github.com/internLM/OpenCompass/).
</details>
<details>
<summary> InternLM-7B </summary>
#### News
[20230822] By utilizing richer SFT-type data, the InternLM-7B-Chat v1.1 model supports code interpretation and function invocation. The model structure and code remain unchanged, so the more powerful InternLM-7B-Chat v1.1 can be used in exactly the same way as InternLM-7B-Chat.
#### Introduction
InternLM-7B contains a 7 billion parameter base model and a chat model tailored for practical scenarios. The model has the following characteristics:
- It leverages trillions of high-quality tokens for training to establish a powerful knowledge base.
- It supports an 8k context window length, enabling longer input sequences and stronger reasoning capabilities.
- It provides a versatile toolset for users to flexibly build their own workflows.
Additionally, a lightweight training framework is offered to support model pre-training without the need for extensive dependencies. With a single codebase, it supports pre-training on large-scale clusters with thousands of GPUs, and fine-tuning on a single GPU while achieving remarkable performance optimizations. InternLM achieves nearly 90% acceleration efficiency during training on 1024 GPUs.
## News
InternLM-7B-Chat v1.1 is released with code interpreter and function calling capability. You can try it with [Lagent](https://github.com/InternLM/lagent).
## InternLM-7B
### Performance Evaluation
#### Performance Evaluation
We conducted a comprehensive evaluation of InternLM using the open-source evaluation tool [OpenCompass](https://github.com/internLM/OpenCompass/). The evaluation covered five capability dimensions: examination (subject knowledge), language, knowledge, reasoning, and understanding. Some of the results are shown below; visit the [OpenCompass leaderboard](https://opencompass.org.cn/rank) for more.
@ -72,19 +149,12 @@ We conducted a comprehensive evaluation of InternLM using the open-source evalua
- The evaluation results were obtained from [OpenCompass 20230706](https://github.com/internLM/OpenCompass/) (data marked with `*` come from the original papers), and the evaluation setup can be found in the configuration files provided by [OpenCompass](https://github.com/internLM/OpenCompass/).
- The evaluation data may have numerical differences due to the version iteration of [OpenCompass](https://github.com/internLM/OpenCompass/), so please refer to the latest evaluation results of [OpenCompass](https://github.com/internLM/OpenCompass/).
### Model Zoo
InternLM 7B and InternLM 7B Chat, trained using InternLM, have been open-sourced. We provide two formats of model weights for use. In addition to loading the models using the Transformers format, you can also load the weights directly using InternLM for further pre-training or human preference alignment training.
| Model | InternLM Format Weight Download Link | Transformers Format Weight Download Link |
| ----------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------- |
| **InternLM 7B** | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM-7b) | [🤗internlm/intern-7b](https://huggingface.co/internlm/internlm-7b) |
| **InternLM Chat 7B v1.1** | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM-chat-7b-v1.1) | [🤗internlm/intern-chat-7b-v1.1](https://huggingface.co/internlm/internlm-chat-7b-v1.1) |
| **InternLM Chat 7B** | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM-chat-7b) | [🤗internlm/intern-chat-7b](https://huggingface.co/internlm/internlm-chat-7b) |
| **InternLM Chat 7B 8k** | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM-chat-7b-8k) | [🤗internlm/intern-chat-7b-8k](https://huggingface.co/internlm/internlm-chat-7b-8k) |
</details>
**Limitations:** Although we have made efforts to ensure the safety of the model during the training process and to encourage the model to generate text that complies with ethical and legal requirements, the model may still produce unexpected outputs due to its size and probabilistic generation paradigm. For example, the generated responses may contain biases, discrimination, or other harmful content. Please do not propagate such content. We are not responsible for any consequences resulting from the dissemination of harmful information.
## Usage Examples
### Import from Transformers
To load the InternLM 7B Chat model using Transformers, use the following code:
@ -108,6 +178,23 @@ Sure, here are three tips for effective time management:
Remember, good time management skills take practice and patience. Start with small steps and gradually incorporate these habits into your daily routine.
```
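Since the same loading pattern applies to the other checkpoints, a small helper can make the model name a parameter. The sketch below is only one possible arrangement: the short keys and the function names `resolve_model_id` and `load_model` are hypothetical, the Hub IDs are copied from the link text in the Model Zoo table, and FP16 loading on a CUDA device is an assumption about the target hardware.

```python
# Hub IDs as displayed in the Model Zoo table; the short keys are hypothetical labels.
MODEL_ZOO = {
    "20b": "internlm/internlm-20b",
    "chat-7b-v1.1": "internlm/internlm-chat-7b-v1.1",
    "7b": "internlm/internlm-7b",
    "chat-7b": "internlm/internlm-chat-7b",
    "chat-7b-8k": "internlm/internlm-chat-7b-8k",
}


def resolve_model_id(key):
    """Map a short key to its Hub ID, failing loudly on unknown keys."""
    try:
        return MODEL_ZOO[key]
    except KeyError:
        raise ValueError(
            f"unknown model key {key!r}; choose from {sorted(MODEL_ZOO)}"
        ) from None


def load_model(key):
    """Load tokenizer and model in FP16 on GPU (needs transformers, torch, CUDA)."""
    # Imported lazily so the helper above stays usable without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = resolve_model_id(key)
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, trust_remote_code=True, torch_dtype=torch.float16
    ).cuda()
    return tokenizer, model.eval()


# Usage (downloads the weights on first run):
# tokenizer, model = load_model("chat-7b")
# response, history = model.chat(tokenizer, "hello", history=[])
```

Note that `model.chat` comes from the model's remote code, as used in the README example, so it is only expected on the chat checkpoints.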
### Import from ModelScope
To load the InternLM model using ModelScope, use the following code:
```python
from modelscope import snapshot_download, AutoTokenizer, AutoModelForCausalLM
import torch
model_dir = snapshot_download('Shanghai_AI_Laboratory/internlm-chat-7b-v1_1', revision='v1.0.0')
tokenizer = AutoTokenizer.from_pretrained(model_dir, device_map="auto", trust_remote_code=True,torch_dtype=torch.float16)
model = AutoModelForCausalLM.from_pretrained(model_dir,device_map="auto", trust_remote_code=True,torch_dtype=torch.float16)
model = model.eval()
response, history = model.chat(tokenizer, "hello", history=[])
print(response)
response, history = model.chat(tokenizer, "please provide three suggestions about time management", history=history)
print(response)
```
### Dialogue
You can interact with the InternLM Chat 7B model through a frontend interface by running the following code:
@ -124,45 +211,27 @@ The effect is as follows
### Deployment
We use [LMDeploy](https://github.com/InternLM/LMDeploy) to complete the workflow of InternLM deployment.
We use [LMDeploy](https://github.com/InternLM/LMDeploy) to complete the one-click deployment of InternLM.
```bash
python3 -m pip install lmdeploy
1. First, install LMDeploy:
```
python3 -m pip install lmdeploy
```
You can use the following commands to run `internlm-chat-7b` FP16 inference, serve it, and interact with the AI assistant via WebUI:
2. Use the following command for quick deployment:
```bash
# convert weight layout
python3 -m lmdeploy.serve.turbomind.deploy internlm-chat-7b
# inference lmdeploy's turbomind engine
python3 -m lmdeploy.turbomind.chat ./workspace
# serving with gradio
python3 -m lmdeploy.serve.gradio.app ./workspace
```
python3 -m lmdeploy.serve.turbomind.deploy InternLM-7B /path/to/internlm-7b/model hf
```
You can also deploy a 4-bit quantized `internlm-chat-7b` model via LMDeploy. It trims the model's memory overhead down to 6G, just 40% of what FP16 inference would take. More importantly, with heavily optimized kernels, inference is 2.4x faster than FP16 inference on A100-80G.
3. After exporting the model, you can start a server and have a conversation with the deployed model using the following command:
Try the following to run 4-bit `internlm-chat-7b` on a GeForce RTX 30x GPU card. You can find the inference benchmark [here](https://github.com/InternLM/lmdeploy/blob/main/docs/en/w4a16.md#inference-performance).
```bash
# download the prequantized internlm-chat-7b model from huggingface
git-lfs install
git clone https://huggingface.co/lmdeploy/llama2-chat-7b-w4
# Convert the model's layout and store it in the default path, ./workspace.
python3 -m lmdeploy.serve.turbomind.deploy internlm-chat-7b ./llama2-chat-7b-w4 awq --group-size 128
# run inference with lmdeploy's turbomind engine
python3 -m lmdeploy.turbomind.chat ./workspace
# serving with gradio
python3 -m lmdeploy.serve.gradio.app ./workspace
```
python3 -m lmdeploy.serve.client {server_ip_address}:33337
```
LMDeploy is an efficient toolkit for compressing, deploying, and serving LLM models. Please refer to the [deployment tutorial](https://github.com/InternLM/LMDeploy) for more details on deploying InternLM.
[LMDeploy](https://github.com/InternLM/LMDeploy) provides a complete workflow for deploying InternLM. Please refer to the [deployment tutorial](https://github.com/InternLM/LMDeploy) for more details on deploying InternLM.
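The 4-bit memory figures quoted in this section (about 6G, roughly 40% of FP16) can be sanity-checked with back-of-the-envelope arithmetic. The sketch below counts weight storage only for a 7-billion-parameter model and ignores quantization scales, KV cache, and activations, which is presumably why the quoted 6G is larger than the raw 4-bit weight size. The function name is hypothetical.

```python
def weight_memory_gb(n_params, bits_per_weight):
    """Storage needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 7e9  # "7 billion parameters", per the README

fp16_gb = weight_memory_gb(N_PARAMS, 16)
w4_gb = weight_memory_gb(N_PARAMS, 4)
print(f"FP16 weights: {fp16_gb:.1f} GB, 4-bit weights: {w4_gb:.1f} GB")

# FP16 footprint implied by the quoted "6G is about 40% of FP16".
implied_fp16_total_gb = 6 / 0.40
print(f"implied FP16 footprint: {implied_fp16_total_gb:.1f} GB")
```

The implied FP16 footprint of about 15 GB is close to the 14 GB of raw FP16 weights, so the quoted ratio looks plausible as an end-to-end figure and not as a weights-only one.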
## Fine-tuning & Training

Binary file not shown (size: 6.0 KiB).