mirror of https://github.com/InternLM/InternLM
[Doc]: fix citation blocks (#32)
parent 2066b36693
commit c690bf3779

README-zh-Hans.md
# InternLM

<div align="center">
[🆕Update News](./CHANGE_LOG.md) |
[🤔Reporting Issues](https://github.com/InternLM/InternLM/issues/new)

[English](./README.md) |
[简体中文](./README-zh-Hans.md)

</div>
## Introduction

InternLM (书生·浦语) includes a 7-billion-parameter base model and a chat model tailored for practical scenarios (InternLM-7B). The model has the following features:

- It is trained on trillions of high-quality tokens to build a powerful knowledge base.
- It supports an 8k context window, enabling longer inputs and stronger reasoning.
- It provides general tool-calling capabilities, allowing users to flexibly build their own workflows.
### Performance Evaluation

We conducted a comprehensive evaluation of InternLM using the open-source evaluation tool [OpenCompass](https://github.com/internLM/OpenCompass/) across five capability dimensions: disciplinary competence, language competence, knowledge competence, reasoning competence, and comprehension competence. Some of the results are shown in the table below; visit the [OpenCompass leaderboard](https://opencompass.org.cn/rank) for more evaluation results.
| Datasets\Models | **InternLM-Chat-7B** | **InternLM-7B** | LLaMA-7B | Baichuan-7B | ChatGLM2-6B | Alpaca-7B | Vicuna-7B |
| --------------- | -------------------- | --------------- | -------- | ----------- | ----------- | --------- | --------- |
| C-Eval(Val)     | 53.2                 | 53.4            | 24.2     | 42.7        | 50.9        | 28.9      | 31.2      |
| MMLU            | 50.8                 | 51.0            | 35.2*    | 41.5        | 46.0        | 39.7      | 47.3      |
| AGIEval         | 42.5                 | 37.6            | 20.8     | 24.6        | 39.0        | 24.1      | 26.4      |
| CommonSenseQA   | 75.2                 | 59.5            | 65.0     | 58.8        | 60.0        | 68.7      | 66.7      |
| BUSTM           | 74.3                 | 50.6            | 48.5     | 51.3        | 55.0        | 48.8      | 62.5      |
| CLUEWSC         | 78.6                 | 59.1            | 50.3     | 52.8        | 59.8        | 50.3      | 52.2      |
| MATH            | 6.4                  | 7.1             | 2.8      | 3.0         | 6.6         | 2.2       | 2.8       |
| GSM8K           | 34.5                 | 31.2            | 10.1     | 9.7         | 29.2        | 6.0       | 15.3      |
| HumanEval       | 14.0                 | 10.4            | 14.0     | 9.2         | 9.2         | 9.2       | 11.0      |
| RACE(High)      | 76.3                 | 57.4            | 46.9*    | 28.1        | 66.3        | 40.7      | 54.0      |

- The above results were obtained with [OpenCompass 20230706](https://github.com/internLM/OpenCompass/) (entries marked with `*` are taken from the original papers). For evaluation details, see the configuration files provided by [OpenCompass](https://github.com/internLM/OpenCompass/).
- Evaluation numbers may vary across [OpenCompass](https://github.com/internLM/OpenCompass/) versions, so please refer to the latest evaluation results from [OpenCompass](https://github.com/internLM/OpenCompass/).
### Model Zoo

InternLM 7B and InternLM 7B Chat, trained with InternLM, have been open-sourced, and we provide two formats of model weights. In addition to loading the models in the Transformers format, you can also load the following weights directly through InternLM for continued pre-training or human preference alignment training.

| Model | InternLM Format Weight Download Link | Transformers Format Weight Download Link |
| -------------------- | ------------------------------------ | ---------------------------------------- |
| **InternLM 7B** | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM-7b) | [🤗internlm/intern-7b](https://huggingface.co/internlm/internlm-7b) |
| **InternLM Chat 7B** | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM-chat-7b) | [🤗internlm/intern-chat-7b](https://huggingface.co/internlm/internlm-chat-7b) |
| **InternLM Chat 7B 8k** | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM-chat-7b-8k) | [🤗internlm/intern-chat-7b-8k](https://huggingface.co/internlm/internlm-chat-7b-8k) |

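If you want to fetch the Transformers-format weights ahead of time, here is a minimal download sketch using `huggingface_hub`; this is our own illustration (the repo id comes from the table above), and any standard download method works:

```python
# Illustrative: download the Transformers-format weights listed in the table above.
# Assumes `pip install huggingface_hub`.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="internlm/internlm-chat-7b")
print("Weights downloaded to:", local_dir)
```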
**Limitations:** Although we have made every effort during training to ensure the safety of the model and to encourage it to generate text that complies with ethical and legal requirements, the model may still produce unexpected outputs due to its size and probabilistic generation paradigm. For example, responses may contain biases, discrimination, or other harmful content. Please do not propagate such content. We are not responsible for any consequences resulting from the dissemination of harmful information.

### Import from Transformers

Load the InternLM 7B Chat model with the following code:
```python
>>> from transformers import AutoTokenizer, AutoModelForCausalLM
>>> tokenizer = AutoTokenizer.from_pretrained("internlm/internlm-chat-7b", trust_remote_code=True)
>>> model = AutoModelForCausalLM.from_pretrained("internlm/internlm-chat-7b", trust_remote_code=True).cuda()
>>> response, history = model.chat(tokenizer, "你好", history=[])
```
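The base InternLM-7B model (without chat fine-tuning) can also be used through the standard Transformers generation API. A minimal sketch of our own, with illustrative sampling parameters rather than official settings:

```python
# Illustrative: plain text generation with the base model via model.generate()
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("internlm/internlm-7b", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("internlm/internlm-7b", trust_remote_code=True).cuda().eval()

inputs = tokenizer("A beautiful flower blooms", return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```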
### Dialogue via Frontend Web Page

You can interact with the InternLM Chat 7B model through a frontend web interface by running the following commands:
```bash
pip install streamlit==1.24.0
pip install transformers==4.30.2
streamlit run web_demo.py
```
The result looks like this:

![demo](https://github.com/InternLM/InternLM/assets/9102141/11b60ee0-47e4-42c0-8278-3051b2f17fe4)
## Fine-tuning & Training

### Pre-training and Fine-tuning Tutorial

Please refer to the [Usage Tutorial](./doc/usage.md) to get started with InternLM installation, data processing, pre-training, and fine-tuning.

### Convert to Transformers Format

Models trained with InternLM can be easily converted to the HuggingFace Transformers format, making it convenient to integrate with the community's various open-source projects. With `tools/convert2hf.py`, the weights saved during training can be converted to the Transformers format with a single command:

```bash
python convert2hf.py --src_folder origin_ckpt/ --tgt_folder hf_ckpt/ --tokenizer tokenizes/tokenizer.model
```

After conversion, the model can be loaded with Transformers using the following code:
```python
>>> from transformers import AutoTokenizer, AutoModel
>>> tokenizer = AutoTokenizer.from_pretrained("hf_ckpt/", trust_remote_code=True)
>>> model = AutoModel.from_pretrained("hf_ckpt/", trust_remote_code=True).cuda()
```
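A quick way to sanity-check a converted checkpoint is to run a short forward pass and inspect the output shape. This is our own illustrative check (it assumes the converted model's remote code returns standard Transformers outputs), not part of the official tooling:

```python
# Illustrative: verify the converted checkpoint produces sensible hidden states
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("hf_ckpt/", trust_remote_code=True)
model = AutoModel.from_pretrained("hf_ckpt/", trust_remote_code=True).cuda().eval()

inputs = tokenizer("hello", return_tensors="pt").to("cuda")
with torch.no_grad():
    out = model(**inputs)
print(out.last_hidden_state.shape)  # expect (1, seq_len, hidden_size)
```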
## Training System

### System Architecture

Please refer to the [System Architecture document](./doc/structure.md) for further details.

### Training Performance

InternLM deeply integrates high-performance model operators such as Flash-Attention and Apex to improve training efficiency.

TGS denotes the average number of tokens processed per GPU per second; for more performance test data, see the [Training Performance document](./doc/train_performance.md).
## Contribution

We appreciate all the contributors for their efforts to improve and enhance InternLM. Community users are highly encouraged to participate in the project; please refer to the contribution guidelines for instructions on how to contribute.

## Acknowledgements

The InternLM codebase is an open-source project contributed by Shanghai AI Laboratory and researchers from different universities and companies. We thank all contributors for adding new features and all users for their valuable feedback. We hope this toolbox and benchmark provide the community with flexible and efficient code tools to fine-tune InternLM and develop new models, thus continuously contributing to the open-source community. Special thanks to the two open-source projects [flash-attention](https://github.com/HazyResearch/flash-attention) and [ColossalAI](https://github.com/hpcaitech/ColossalAI).
## Open Source License

The code in this repository is open-source under the Apache-2.0 license. The InternLM weights are fully open for academic research, and commercial use is also permitted once written permission is obtained from the official team. For inquiries about commercial licenses and collaborations, please contact <internlm@pjlab.org.cn>.
## Citation

```
@misc{2023internlm,
    title={InternLM: A Multilingual Language Model with Progressively Enhanced Capabilities},
    author={InternLM Team},
    howpublished={\url{https://github.com/InternLM/InternLM}},
    year={2023}
}
```
README.md
# InternLM

<div align="center">

<img src="./doc/imgs/logo.svg" width="200"/>
<div> </div>
<div align="center">
<b><font size="5">InternLM</font></b>
<sup>
<i><font size="4">HOT</font></i>
</a>
</sup>
<div> </div>
</div>

[![license](./doc/imgs/license.svg)](./LICENSE)
[🆕Update News](./CHANGE_LOG.md) |
[🤔Reporting Issues](https://github.com/InternLM/InternLM/issues/new)

[English](./README.md) |
[简体中文](./README-zh-Hans.md)

</div>
## Introduction

InternLM has open-sourced a 7 billion parameter base model and a chat model tailored for practical scenarios. The model has the following characteristics:

- It leverages trillions of high-quality tokens for training to establish a powerful knowledge base.
- It supports an 8k context window length, enabling longer input sequences and stronger reasoning capabilities.
- It provides a versatile toolset for users to flexibly build their own workflows.
Additionally, a lightweight training framework is offered to support model pre-training.

### Performance Evaluation

We conducted a comprehensive evaluation of InternLM using the open-source evaluation tool [OpenCompass](https://github.com/internLM/OpenCompass/). The evaluation covered five dimensions of capabilities: disciplinary competence, language competence, knowledge competence, reasoning competence, and comprehension competence. Here are some of the evaluation results, and you can visit the [OpenCompass leaderboard](https://opencompass.org.cn/rank) for more evaluation results.
| Datasets\Models | **InternLM-Chat-7B** | **InternLM-7B** | LLaMA-7B | Baichuan-7B | ChatGLM2-6B | Alpaca-7B | Vicuna-7B |
| --------------- | -------------------- | --------------- | -------- | ----------- | ----------- | --------- | --------- |
| C-Eval(Val)     | 53.2                 | 53.4            | 24.2     | 42.7        | 50.9        | 28.9      | 31.2      |
| MMLU            | 50.8                 | 51.0            | 35.2*    | 41.5        | 46.0        | 39.7      | 47.3      |
| AGIEval         | 42.5                 | 37.6            | 20.8     | 24.6        | 39.0        | 24.1      | 26.4      |
| CommonSenseQA   | 75.2                 | 59.5            | 65.0     | 58.8        | 60.0        | 68.7      | 66.7      |
| BUSTM           | 74.3                 | 50.6            | 48.5     | 51.3        | 55.0        | 48.8      | 62.5      |
| CLUEWSC         | 78.6                 | 59.1            | 50.3     | 52.8        | 59.8        | 50.3      | 52.2      |
| MATH            | 6.4                  | 7.1             | 2.8      | 3.0         | 6.6         | 2.2       | 2.8       |
| GSM8K           | 34.5                 | 31.2            | 10.1     | 9.7         | 29.2        | 6.0       | 15.3      |
| HumanEval       | 14.0                 | 10.4            | 14.0     | 9.2         | 9.2         | 9.2       | 11.0      |
| RACE(High)      | 76.3                 | 57.4            | 46.9*    | 28.1        | 66.3        | 40.7      | 54.0      |

- The evaluation results were obtained from [OpenCompass 20230706](https://github.com/internLM/OpenCompass/) (entries marked with `*` are taken from the original papers), and the evaluation configuration can be found in the configuration files provided by [OpenCompass](https://github.com/internLM/OpenCompass/).
- The evaluation data may show numerical differences due to version iterations of [OpenCompass](https://github.com/internLM/OpenCompass/), so please refer to the latest evaluation results of [OpenCompass](https://github.com/internLM/OpenCompass/).
### Model Zoo

InternLM 7B and InternLM 7B Chat, trained using InternLM, have been open-sourced. We provide two formats of model weights for use. In addition to loading the models using the Transformers format, you can also load the weights directly using InternLM for further pre-training or human preference alignment training.

| Model | InternLM Format Weight Download Link | Transformers Format Weight Download Link |
| ----------------------- | ------------------------------------ | ---------------------------------------- |
| **InternLM 7B** | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM-7b) | [🤗internlm/intern-7b](https://huggingface.co/internlm/internlm-7b) |
| **InternLM Chat 7B** | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM-chat-7b) | [🤗internlm/intern-chat-7b](https://huggingface.co/internlm/internlm-chat-7b) |
| **InternLM Chat 7B 8k** | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM-chat-7b-8k) | [🤗internlm/intern-chat-7b-8k](https://huggingface.co/internlm/internlm-chat-7b-8k) |

**Limitations:** Although we have made efforts to ensure the safety of the model during the training process and to encourage the model to generate text that complies with ethical and legal requirements, the model may still produce unexpected outputs due to its size and probabilistic generation paradigm. For example, the generated responses may contain biases, discrimination, or other harmful content. Please do not propagate such content. We are not responsible for any consequences resulting from the dissemination of harmful information.

### Import from Transformers

To load the InternLM 7B Chat model using Transformers, use the following code:
```python
>>> from transformers import AutoTokenizer, AutoModelForCausalLM
>>> tokenizer = AutoTokenizer.from_pretrained("internlm/internlm-chat-7b", trust_remote_code=True)
>>> model = AutoModelForCausalLM.from_pretrained("internlm/internlm-chat-7b", trust_remote_code=True).cuda()
>>> response, history = model.chat(tokenizer, "hello", history=[])
```
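Multi-turn dialogue works by feeding the returned history back into the next call. A short continuation of the snippet above (an illustrative sketch of the chat API shown there):

```python
>>> # Continue the dialogue: pass the accumulated history into the next turn
>>> response, history = model.chat(tokenizer, "please provide three suggestions about time management", history=history)
>>> print(response)
```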
### Dialogue

You can interact with the InternLM Chat 7B model through a frontend interface by running the following commands:
```bash
pip install streamlit==1.24.0
pip install transformers==4.30.2
streamlit run web_demo.py
```
The effect is as follows:

![demo](https://github.com/InternLM/InternLM/assets/9102141/11b60ee0-47e4-42c0-8278-3051b2f17fe4)

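For reference, here is a minimal sketch of what a Streamlit chat frontend like `web_demo.py` can look like. This is our own illustration built on the chat API shown earlier, not the repository's actual `web_demo.py`:

```python
# minimal_web_demo.py: illustrative sketch, not the repo's web_demo.py
import streamlit as st
from transformers import AutoTokenizer, AutoModelForCausalLM

@st.cache_resource  # load the model once per server process
def load_model():
    tok = AutoTokenizer.from_pretrained("internlm/internlm-chat-7b", trust_remote_code=True)
    mdl = AutoModelForCausalLM.from_pretrained("internlm/internlm-chat-7b", trust_remote_code=True).cuda().eval()
    return tok, mdl

tok, mdl = load_model()
st.session_state.setdefault("history", [])

if prompt := st.chat_input("Ask InternLM..."):
    with st.chat_message("user"):
        st.write(prompt)
    response, st.session_state.history = mdl.chat(tok, prompt, history=st.session_state.history)
    with st.chat_message("assistant"):
        st.write(response)
```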
### Deployment

We use [LMDeploy](https://github.com/InternLM/LMDeploy) to complete the one-click deployment of InternLM.

1. First, install LMDeploy:

   ```
   python3 -m pip install lmdeploy
   ```

2. Use the following command for quick deployment:

   ```
   python3 -m lmdeploy.serve.turbomind.deploy InternLM-7B /path/to/internlm-7b/model hf
   ```

3. After exporting the model, you can start a server and have a conversation with the deployed model using the following command:

   ```
   python3 -m lmdeploy.serve.client {server_ip_address}:33337
   ```

[LMDeploy](https://github.com/InternLM/LMDeploy) provides a complete workflow for deploying InternLM. Please refer to the [deployment tutorial](https://github.com/InternLM/LMDeploy) for more details.
## Fine-tuning & Training

### Pre-training and Fine-tuning Tutorial

Please refer to the [Usage Tutorial](./doc/en/usage.md) to start InternLM installation, data processing, pre-training and fine-tuning.

### Convert to Transformers Format

The model trained by InternLM can be easily converted to the HuggingFace Transformers format, which is convenient for seamless integration with various open-source projects in the community. With the help of `tools/convert2hf.py`, the weights saved during training can be converted into the Transformers format with one command:

```bash
python convert2hf.py --src_folder origin_ckpt/ --tgt_folder hf_ckpt/ --tokenizer tokenizes/tokenizer.model
```

After conversion, the model can be loaded as Transformers with the following code:
```python
>>> from transformers import AutoTokenizer, AutoModel
>>> tokenizer = AutoTokenizer.from_pretrained("hf_ckpt/", trust_remote_code=True)
>>> model = AutoModel.from_pretrained("hf_ckpt/", trust_remote_code=True).cuda()
```
## Training System

### System Architecture

Please refer to the [System Architecture document](./doc/en/structure.md) for further details.

### Training Performance

InternLM deeply integrates Flash-Attention, Apex and other high-performance model operators to improve training efficiency. By building the Hybrid Zero technique, it achieves efficient overlap of computation and communication, significantly reducing cross-node communication traffic during training. InternLM supports scaling the 7B model from 8 GPUs to 1024 GPUs, with an acceleration efficiency of up to 90% at the thousand-GPU scale, a training throughput of over 180 TFLOPS, and an average of over 3600 tokens per GPU per second. The following table shows InternLM's scalability test data at different configurations:
| Number of GPUs | 8    | 16   | 32   | 64   | 128  | 256  | 512  | 1024 |
| -------------- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- |
| TGS            | 4078 | 3939 | 3919 | 3944 | 3928 | 3920 | 3835 | 3625 |
| TFLOPS         | 192  | 192  | 186  | 186  | 185  | 185  | 186  | 182  |

TGS represents the average number of tokens processed per GPU per second. For more performance test data, please refer to the [Training Performance document](./doc/en/train_performance.md).

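As a quick cross-check of the ~90% acceleration-efficiency claim, the efficiency at each scale can be computed directly from the TGS row above (a small illustrative calculation, not an official script):

```python
# Acceleration efficiency relative to the 8-GPU baseline, from the TGS table above
tgs = {8: 4078, 16: 3939, 32: 3919, 64: 3944, 128: 3928, 256: 3920, 512: 3835, 1024: 3625}
baseline = tgs[8]
for gpus, t in tgs.items():
    eff = t / baseline  # per-GPU throughput retained vs. the 8-GPU run
    print(f"{gpus:>4} GPUs: TGS={t}, efficiency={eff:.1%}, cluster throughput ~{gpus * t / 1e6:.2f}M tokens/s")
```

At 1024 GPUs this gives 3625 / 4078, roughly 88.9%, consistent with the "up to 90%" figure quoted above.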
## Contribution

We appreciate all the contributors for their efforts to improve and enhance InternLM. Community users are highly encouraged to participate in the project. Please refer to the contribution guidelines for instructions on how to contribute to the project.

## Acknowledgements

InternLM codebase is an open-source project contributed by Shanghai AI Laboratory and researchers from different universities and companies. We appreciate all the contributors for providing new features and the users for valuable feedback. We hope this toolbox and benchmark can provide the community with flexible and efficient code tools to fine-tune InternLM and develop their own models, thus continuously contributing to the open-source community. Special thanks to the two open-source projects [flash-attention](https://github.com/HazyResearch/flash-attention) and [ColossalAI](https://github.com/hpcaitech/ColossalAI).
## Open Source License

The code in this repository is open-source under the Apache-2.0 license. The InternLM weights are fully open for academic research and also allow commercial use with written permission from the official team. For inquiries about commercial licenses and collaborations, please contact <internlm@pjlab.org.cn>.
## Citation

```
@misc{2023internlm,
    title={InternLM: A Multilingual Language Model with Progressively Enhanced Capabilities},
    author={InternLM Team},
    howpublished={\url{https://github.com/InternLM/InternLM}},
    year={2023}
}
```