Merge branch 'update-chat' into 'new-main'

Update chat format doc See merge request openmmlab/bigmodel/InternLM!7
2024-01-16 16:25:05 +00:00 · 2024-01-16 16:25:05 +00:00 · 8ff20ff63a
parent e3a7b82533 ac8cf42e7d
commit 8ff20ff63a
10 changed files with 386 additions and 519 deletions
--- a/README-ja-JP.md
+++ b/README-ja-JP.md
@ -1,202 +0,0 @@
-# InternLM
-
-<div align="center">
-
-<img src="./docs/imgs/logo.svg" width="200"/>
-  <div> </div>
-  <div align="center">
-    <b><font size="5">InternLM</font></b>
-    <sup>
-      <a href="https://internlm.intern-ai.org.cn/">
-        <i><font size="4">HOT</font></i>
-      </a>
-    </sup>
-    <div> </div>
-  </div>
-
-[![license](./docs/imgs/license.svg)](./LICENSE)
-[![evaluation](./docs/imgs/compass_support.svg)](https://github.com/internLM/OpenCompass/)
-[![Documentation Status](https://readthedocs.org/projects/internlm/badge/?version=latest)](https://internlm.readthedocs.io/zh_CN/latest/?badge=latest)
-
-[📘使用法](./doc/en/usage.md) |
-[🛠️インストール](./doc/en/install.md) |
-[📊トレーニングパフォーマンス](./doc/en/train_performance.md) |
-[👀モデル](#model-zoo) |
-[🆕更新ニュース](./CHANGE_LOG.md) |
-[🤔Issues 報告](https://github.com/InternLM/InternLM/issues/new)
-
-[English](./README.md) |
-[简体中文](./README-zh-Hans.md) |
-[日本語](./README-ja-JP.md)
-
-</div>
-
-## はじめに
-
-InternLM は、70 億のパラメータを持つベースモデルと、実用的なシナリオに合わせたチャットモデルをオープンソース化しています。このモデルには以下の特徴があります:
-
- 何兆もの高品質なトークンをトレーニングに活用し、強力な知識ベースを確立します。
- 8k のコンテキストウィンドウ長をサポートし、より長い入力シーケンスと強力な推論機能を可能にする。
- ユーザが独自のワークフローを柔軟に構築できるよう、汎用性の高いツールセットを提供します。
-
-さらに、大規模な依存関係を必要とせずにモデルの事前学習をサポートする軽量な学習フレームワークが提供されます。単一のコードベースで、数千の GPU を持つ大規模クラスタでの事前学習と、単一の GPU での微調整をサポートし、顕著な性能最適化を達成します。InternLM は、1024GPU でのトレーニングにおいて 90% 近いアクセラレーション効率を達成しています。
-
-## 新闻
-
-[20231213] InternLM-7B-Chat および InternLM-20B-Chat のモデルの重みを更新しました。 新しいバージョンの会話モデルでは、より高品質でより多様な言語スタイルの応答を生成できます。
-[20230920] 基本版と会話版を含むInternLM-20Bをリリースしました。
-
-## InternLM-7B
-
-### パフォーマンス評価
-
-オープンソースの評価ツール [OpenCompass](https://github.com/internLM/OpenCompass/) を用いて、InternLM の総合的な評価を行った。この評価では、分野別能力、言語能力、知識能力、推論能力、理解能力の 5 つの次元をカバーしました。以下は評価結果の一部であり、その他の評価結果については [OpenCompass leaderboard](https://opencompass.org.cn/rank) をご覧ください。
-
-| データセット\モデル | **InternLM-Chat-7B** | **InternLM-7B** | LLaMA-7B | Baichuan-7B | ChatGLM2-6B | Alpaca-7B | Vicuna-7B |
-| --------------- | -------------------------- | --------------------- | -------- | ----------- | ----------- | --------- | --------- |
-| C-Eval(Val)     | 52.0                       | 53.4                  | 24.2     | 42.7        | 50.9        | 28.9      | 31.2      |
-| MMLU            | 52.6                       | 51.0                  | 35.2*    | 41.5        | 46.0        | 39.7      | 47.3      |
-| AGIEval         | 46.4                       | 37.6                  | 20.8     | 24.6        | 39.0        | 24.1      | 26.4      |
-| CommonSenseQA   | 80.8                       | 59.5                  | 65.0     | 58.8        | 60.0        | 68.7      | 66.7      |
-| BUSTM           | 80.6                       | 50.6                  | 48.5     | 51.3        | 55.0        | 48.8      | 62.5      |
-| CLUEWSC         | 81.8                       | 59.1                  | 50.3     | 52.8        | 59.8        | 50.3      | 52.2      |
-| MATH            | 5.0                        | 7.1                   | 2.8      | 3.0         | 6.6         | 2.2       | 2.8       |
-| GSM8K           | 36.2                       | 31.2                  | 10.1     | 9.7         | 29.2        | 6.0       | 15.3      |
-| HumanEval       | 15.9                       | 10.4                  | 14.0     | 9.2         | 9.2         | 9.2       | 11.0      |
-| RACE(High)      | 80.3                       | 57.4                  | 46.9*    | 28.1        | 66.3        | 40.7      | 54.0      |
-
- 評価結果は [OpenCompass 20230706](https://github.com/internLM/OpenCompass/) (*印のあるデータは原著論文からの引用を意味する)から取得したもので、評価設定は [OpenCompass](https://github.com/internLM/OpenCompass/) が提供する設定ファイルに記載されています。
- 評価データは、[OpenCompass](https://github.com/internLM/OpenCompass/) のバージョンアップにより数値的な差異が生じる可能性がありますので、[OpenCompass](https://github.com/internLM/OpenCompass/) の最新の評価結果をご参照ください。
-
-### Model Zoo
-
-InternLM 7B と InternLM 7B チャットは、InternLM を使って訓練され、オープンソース化されています。モデルの重みは 2 つのフォーマットで提供されています。Transformers フォーマットを使ってモデルをロードするだけでなく、InternLM を使って直接重みをロードして、さらに事前トレーニングや人間の好みアライメントトレーニングを行うこともできます。
-
-| モデル                         | InternLM フォーマット Weight ダウンロードリンク                                                                                                                 | Transformers フォーマット Weight ダウンロードリンク                                         |
-| ----------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------- |
-| **InternLM 7B**         | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM-7b)         | [🤗internlm/intern-7b](https://huggingface.co/internlm/internlm-7b)                 |
-| **InternLM Chat 7B**    | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM-chat-7b)    | [🤗internlm/intern-chat-7b](https://huggingface.co/internlm/internlm-chat-7b)       |
-
-**制限事項:** 学習過程におけるモデルの安全性を確保し、倫理的・法的要件に準拠したテキストを生成するようモデルに促す努力を行ってきたが、モデルのサイズと確率的生成パラダイムのため、モデルは依然として予期せぬ出力を生成する可能性がある。例えば、生成された回答には偏見や差別、その他の有害な内容が含まれている可能性があります。そのような内容を伝播しないでください。有害な情報の伝播によって生じるいかなる結果に対しても、私たちは責任を負いません。
-
-### Transformers からのインポート
-
-Transformers を使用して InternLM 7B チャットモデルをロードするには、以下のコードを使用します:
-
-```python
->>> from transformers import AutoTokenizer, AutoModelForCausalLM
->>> tokenizer = AutoTokenizer.from_pretrained("internlm/internlm-chat-7b", trust_remote_code=True)
->>> model = AutoModelForCausalLM.from_pretrained("internlm/internlm-chat-7b", trust_remote_code=True).cuda()
->>> model = model.eval()
->>> response, history = model.chat(tokenizer, "こんにちは", history=[])
->>> print(response)
-こんにちは！どのようにお手伝いできますか？
->>> response, history = model.chat(tokenizer, "時間管理について3つの提案をお願いします", history=history)
->>> print(response)
-もちろんです！以下に簡潔な形で時間管理に関する3つの提案を示します。
-
-1. To-Doリストを作成し、優先順位を付ける: タスクを明確にリストアップし、それぞれの優先度を判断しましょう。重要で緊急なタスクから順に取り組むことで、効率的に作業を進めることができます。
-2. 時間のブロック化を実践する: 作業を特定の時間枠に集中させるため、時間をブロック化しましょう。例えば、朝の2時間をメール対応に割り当て、午後の3時間をプロジェクトに集中するなど、タスクごとに時間を確保することが効果的です。
-3. ディストラクションを排除する: 集中力を保つために、ディストラクションを最小限に抑えましょう。通知をオフにし、SNSやメールに気を取られないようにすることで、作業効率を向上させることができます。
-
-これらの提案を実践することで、時間管理のスキルを向上させ、効果的に日々のタスクをこなしていくことができます。
-```
-
-### 対話
-
-以下のコードを実行することで、フロントエンドインターフェースを通して InternLM Chat 7B モデルと対話することができます:
-
-```bash
-pip install streamlit==1.24.0
-pip install transformers==4.30.2
-streamlit run web_demo.py
-```
-
-その効果は以下の通り
-
-![demo](https://github.com/InternLM/InternLM/assets/9102141/11b60ee0-47e4-42c0-8278-3051b2f17fe4)
-
-### デプロイ
-
-[LMDeploy](https://github.com/InternLM/LMDeploy) を使って、InternLM をワンクリックでデプロイする。
-
-1. まず、LMDeploy をインストールする:
-
-```shell
-python3 -m pip install lmdeploy
-```
-
-2. クイックデプロイには以下のコマンドを使用します:
-
-```shell
-lmdeploy chat turbomind InternLM/internlm-chat-7b --model-name internlm-chat-7b
-```
-
-3. モデルをエクスポートした後、以下のコマンドを使ってサーバーを起動し、デプロイされたモデルと会話することができます:
-
-```shell
-lmdeploy serve api_server InternLM/internlm-chat-7b --model-name internlm-chat-7b
-```
-
-[LMDeploy](https://github.com/InternLM/LMDeploy) は、InternLM をデプロイするための完全なワークフローを提供します。InternLM のデプロイの詳細については、[デプロイチュートリアル](https://github.com/InternLM/LMDeploy)を参照してください。
-
-## ファインチューニングとトレーニング
-
-### プリトレーニングとファインチューニングのチュートリアル
-
-InternLMのインストール、データ処理、プレトレーニング、ファインチューニングを始めるには、[使用法チュートリアル](./doc/ja/usage.md)を参照してください。
-
-### Transformers フォーマットへの変換
-
-InternLM によって学習されたモデルは、コミュニティの様々なオープンソースプロジェクトとシームレスにドッキングするのに便利な Hugging Face Transformers 形式に簡単に変換することができます。`tools/convert2hf.py` の助けを借りて、トレーニング中に保存された weights は 1 つのコマンドで transformers 形式に変換することができます
-
-```bash
-python convert2hf.py --src_folder origin_ckpt/ --tgt_folder hf_ckpt/ --tokenizer tokenizes/tokenizer.model
-```
-
-変換後、以下のコードで transformers として読み込むことができます
-
-```python
->>> from transformers import AutoTokenizer, AutoModel
->>> model = AutoModel.from_pretrained("hf_ckpt/", trust_remote_code=True).cuda()
-```
-
-## トレーニングシステム
-
-### システムアーキテクチャ
-
-詳細については、[システムアーキテクチャドキュメント](./doc/ja/structure.md) を参照してください。
-
-### トレーニングパフォーマンス
-
-InternLM は、Flash-Attention、Apex その他の高性能モデルオペレータを深く統合し、トレーニング効率を向上させます。Hybrid Zero 技術を構築することで、計算と通信の効率的なオーバーラップを実現し、トレーニング中のノード間の通信トラフィックを大幅に削減します。InternLM は 7B モデルを 8GPU から 1024GPU まで拡張することをサポートし、1000GPU スケールで最大 90% のアクセラレーション効率、180TFLOPS 以上のトレーニングスループット、GPU あたり平均 3600 トークン/秒以上を実現します。次の表は、異なる構成における InternLM のスケーラビリティテストデータです:
-
-| GPU Number         | 8   | 16  | 32  | 64  | 128  | 256  | 512  | 1024  |
-| ---------------- | ---- | ---- | ---- | ---- | ----- | ----- | ----- | ------ |
-| TGS | 4078 | 3939 | 3919 | 3944 | 3928  | 3920  | 3835  | 3625   |
-| TFLOPS  | 193 | 191  | 188  | 188  | 187   | 185   | 186   | 184    |
-
-TGSは、GPUあたり1秒間に処理されるトークンの平均数を表します。パフォーマンステストデータの詳細については、[トレーニングパフォーマンスドキュメント](./doc/ja/train_performance.md)を参照してください。
-
-## コントリビュート
-
-我々は、InternLM を改善し、向上させるために尽力してくれたすべての貢献者に感謝している。コミュニティ・ユーザーのプロジェクトへの参加が強く推奨されます。プロジェクトへの貢献方法については、貢献ガイドラインを参照してください。
-
-## 謝辞
-
-InternLM コードベースは、上海 AI 研究所と様々な大学や企業の研究者によって貢献されたオープンソースプロジェクトです。プロジェクトに新機能を追加してくれたすべての貢献者と、貴重なフィードバックを提供してくれたユーザーに感謝したい。私たちは、このツールキットとベンチマークが、InternLM をファインチューニングし、独自のモデルを開発するための柔軟で効率的なコードツールをコミュニティに提供し、オープンソースコミュニティに継続的に貢献できることを願っています。2 つのオープンソースプロジェクト、[flash-attention](https://github.com/HazyResearch/flash-attention) と [ColossalAI](https://github.com/hpcaitech/ColossalAI) に感謝します。
-
-## ライセンス
-
-コードは Apache-2.0 でライセンスされており、モデルの重さは学術研究のために完全にオープンで、**無料** の商用利用も許可されています。商用ライセンスの申請は、[申請フォーム（英語）](https://wj.qq.com/s2/12727483/5dba/)/[申请表（中文）](https://wj.qq.com/s2/12725412/f7c1/)にご記入ください。その他のご質問やコラボレーションについては、<internlm@pjlab.org.cn> までご連絡ください。
-
-## 引用
-
-```
-@misc{2023internlm,
-    title={InternLM: A Multilingual Language Model with Progressively Enhanced Capabilities},
-    author={InternLM Team},
-    howpublished = {\url{https://github.com/InternLM/InternLM}},
-    year={2023}
-}
-```
--- a/README-zh-Hans.md
+++ b/README-zh-Hans.md
@ -1,290 +0,0 @@
-# InternLM
-
-<div align="center">
-
-<img src="./docs/imgs/logo.svg" width="200"/>
-  <div>&nbsp;</div>
-  <div align="center">
-    <b><font size="5">书生·浦语 官网</font></b>
-    <sup>
-      <a href="https://internlm.intern-ai.org.cn/">
-        <i><font size="4">HOT</font></i>
-      </a>
-    </sup>
-    <div>&nbsp;</div>
-  </div>
-
-[![license](./docs/imgs/license.svg)](https://github.com/open-mmlab/mmdetection/blob/main/LICENSE)
-[![evaluation](./docs/imgs/compass_support.svg)](https://github.com/internLM/OpenCompass/)
-[![Documentation Status](https://readthedocs.org/projects/internlm/badge/?version=latest)](https://internlm.readthedocs.io/zh_CN/latest/?badge=latest)
-
-[📘使用文档](./doc/usage.md) |
-[🛠️安装教程](./doc/install.md) |
-[📊训练性能](./doc/train_performance.md) |
-[👀模型库](#model-zoo) |
-[🤗HuggingFace](https://huggingface.co/spaces/internlm/InternLM-Chat-7B) |
-[🆕Update News](./CHANGE_LOG.md) |
-[🤔Reporting Issues](https://github.com/InternLM/InternLM/issues/new)
-
-[English](./README.md) |
-[简体中文](./README-zh-Hans.md) |
-[日本語](./README-ja-JP.md)
-
-</div>
-
-<p align="center">
-    👋 加入我们的 <a href="https://discord.gg/xa29JuW87d" target="_blank">Discord</a> 和 <a href="https://github.com/InternLM/InternLM/assets/25839884/a6aad896-7232-4220-ac84-9e070c2633ce" target="_blank">微信社区</a>
-</p>
-
-## 简介
-
-InternLM 是一个开源的轻量级训练框架，旨在支持大模型训练而无需大量的依赖。通过单一的代码库，它支持在拥有数千个 GPU 的大型集群上进行预训练，并在单个 GPU 上进行微调，同时实现了卓越的性能优化。在1024个 GPU 上训练时，InternLM 可以实现近90%的加速效率。
-
-基于InternLM训练框架，我们已经发布了两个开源的预训练模型：InternLM-7B 和 InternLM-20B。
-
-## 更新
-
-[20231213] 我们更新了 InternLM-7B-Chat 和 InternLM-20B-Chat 模型权重。通过改进微调数据和训练策略，新版对话模型生成的回复质量更高、语言风格更加多元。
-[20230920] InternLM-20B 已发布，包括基础版和对话版。
-
-## Model Zoo
-
-我们的模型在三个平台上发布：Transformers、ModelScope 和 OpenXLab。
-
-| Model                     | Transformers                        | ModelScope                                                                                                                        | OpenXLab                                                                              |发布日期 |
-|---------------------------|------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------|
-| **InternLM Chat 20B**     | [🤗internlm/internlm-chat-20b](https://huggingface.co/internlm/internlm-20b-chat)         | [<img src="./docs/imgs/modelscope_logo.png" width="20px" /> Shanghai_AI_Laboratory/internlm-chat-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-20b-chat/summary)         | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM-chat-20b)     | 2023-12-12   |
-| **InternLM 20B**          | [🤗internlm/internlm-20b](https://huggingface.co/internlm/internlm-20b)                   | [<img src="./docs/imgs/modelscope_logo.png" width="20px" /> Shanghai_AI_Laboratory/internlm-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-20b/summary)                   | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM-20b)          | 2023-09-20   |
-| **InternLM Chat 7B**      | [🤗internlm/internlm-chat-7b](https://huggingface.co/internlm/internlm-chat-7b)           | [<img src="./docs/imgs/modelscope_logo.png" width="20px" /> Shanghai_AI_Laboratory/internlm-chat-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-chat-7b/summary)           | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM-chat-7b)      | 2023-12-12   |
-| **InternLM 7B**           | [🤗internlm/internlm-7b](https://huggingface.co/internlm/internlm-7b)                     | [<img src="./docs/imgs/modelscope_logo.png" width="20px" /> Shanghai_AI_Laboratory/internlm-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-7b/summary)                     | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM-7b)           | 2023-07-06   |
-
-<details>
-<summary> InternLM-20B </summary>
-
-#### 简介
-
-InternLM-20B 在超过 **2.3T** Tokens 包含高质量英文、中文和代码的数据上进行预训练，其中 Chat 版本还经过了 SFT 和 RLHF 训练，使其能够更好、更安全地满足用户的需求。
-
-InternLM 20B 在模型结构上选择了深结构，InternLM-20B 的层数设定为60层，超过常规7B和13B模型所使用的32层或者40层。在参数受限的情况下，提高层数有利于提高模型的综合能力。此外，相较于InternLM-7B，InternLM-20B使用的预训练数据经过了更高质量的清洗，并补充了高知识密度和用于强化理解和推理能力的训练数据。因此，它在理解能力、推理能力、数学能力、编程能力等考验语言模型技术水平的方面都得到了显著提升。总体而言，InternLM-20B具有以下的特点：
-
- 优异的综合性能
- 很强的工具调用功能
- 支持16k语境长度（通过推理时外推）
- 更好的价值对齐
-
-#### 性能对比
-
-在OpenCompass提出的5个能力维度上，InternLM-20B都取得很好的效果（粗体为13B-33B这个量级范围内，各项最佳成绩）
-
-| 能力维度 | Llama-13B | Llama2-13B | Baichuan2-13B | InternLM-20B | Llama-33B | Llama-65B | Llama2-70B |
-|----------|-----------|------------|---------------|--------------|-----------|-----------|------------|
-| 语言     | 42.5      | 47         | 47.5          | **55**           | 44.6      | 47.1      | 51.6       |
-| 知识     | 58.2      | 58.3       | 48.9          | 60.1         | **64**        | 66        | 67.7       |
-| 理解     | 45.5      | 50.9       | 58.1          | **67.3**         | 50.6      | 54.2      | 60.8       |
-| 推理     | 42.7      | 43.6       | 44.2          | **54.9**         | 46.4      | 49.8      | 55         |
-| 学科     | 37.3      | 45.2       | 51.8          | **62.5**         | 47.4      | 49.7      | 57.3       |
-| 总平均   | 43.8      | 47.3       | 49.4          | **59.2**         | 48.9      | 51.9      | 57.4       |
-
-下表在一些有重要影响力的典型数据集上比较了主流开源模型的表现
-
-|      | 评测集           | Llama-13B | Llama2-13B | Baichuan2-13B | InternLM-20B | Llama-33B | Llama-65B | Llama2-70B |
-|------|------------------|-----------|------------|---------------|--------------|-----------|-----------|------------|
-| 学科 | MMLU             | 47.73     | 54.99      | 59.55         | **62.05**        | 58.73     | 63.71     | 69.75      |
-|      | C-Eval (val)     | 31.83     | 41.4       | **59.01**         | 58.8         | 37.47     | 40.36     | 50.13      |
-|      | AGI-Eval         | 22.03     | 30.93      | 37.37         | **44.58**        | 33.53     | 33.92     | 40.02      |
-| 知识 | BoolQ            | 78.75     | 82.42      | 67            | **87.46**        | 84.43     | 86.61     | 87.74      |
-|      | TriviaQA         | 52.47     | 59.36      | 46.61         | 57.26        | **66.24**     | 69.79     | 70.71      |
-|      | NaturalQuestions | 20.17     | 24.85      | 16.32         | 25.15        | **30.89**     | 33.41     | 34.16      |
-| 理解 | CMRC             | 9.26      | 31.59      | 29.85         | **68.78**        | 14.17     | 34.73     | 43.74      |
-|      | CSL              | 55        | 58.75      | 63.12         | **65.62**        | 57.5      | 59.38     | 60         |
-|      | RACE (middle)    | 53.41     | 63.02      | 68.94         | **86.35**        | 64.55     | 72.35     | 81.55      |
-|      | RACE (high)      | 47.63     | 58.86      | 67.18         | **83.28**        | 62.61     | 68.01     | 79.93      |
-|      | XSum             | 20.37     | 23.37      | 25.23         | **35.54**        | 20.55     | 19.91     | 25.38      |
-| 推理 | WinoGrande       | 64.64     | 64.01      | 67.32         | **69.38**        | 66.85     | 69.38     | 69.77      |
-|      | BBH              | 37.93     | 45.62      | 48.98         | **52.51**        | 49.98     | 58.38     | 64.91      |
-|      | GSM8K            | 20.32     | 29.57      | **52.62**         | **52.62**        | 42.3      | 54.44     | 63.31      |
-|      | PIQA             | 79.71     | 79.76      | 78.07         | 80.25        | **81.34**     | 82.15     | 82.54      |
-| 编程 | HumanEval        | 14.02     | 18.9       | 17.07         | **25.61**        | 17.68     | 18.9      | 26.22      |
-|      | MBPP             | 20.6      | 26.8       | 30.8          | **35.6**         | 28.4      | 33.6      | 39.6       |
-
-总体而言，InternLM-20B 在综合能力上全面领先于13B量级的开源模型，同时在推理评测集上接近甚至超越Llama-65B的性能。
-
- 评估结果来自 [OpenCompass 20230920](https://github.com/internLM/OpenCompass/)。
- 由于 [OpenCompass](https://github.com/internLM/OpenCompass/) 的版本迭代，评估数据可能存在数值上的差异，所以请参考 [OpenCompass](https://github.com/internLM/OpenCompass/) 的最新评估结果。
-
-</details>
-
-<details>
-<summary> InternLM-7B </summary>
-
-#### 模型更新
-
-#### 简介
-
-InternLM-7B 包含了一个拥有70亿参数的基础模型和一个为实际场景量身定制的对话模型。该模型具有以下特点：
-
- 它利用数万亿的高质量令牌进行训练，建立了一个强大的知识库。
- 它支持8k的上下文窗口长度，使得输入序列更长并增强了推理能力。
- 它为用户提供了一个多功能的工具集，使用户能够灵活地构建自己的工作流程。
-
-#### 性能对比
-
-我们使用开源评测工具 [OpenCompass](https://github.com/internLM/OpenCompass/) 从学科综合能力、语言能力、知识能力、推理能力、理解能力五大能力维度对InternLM开展全面评测，部分评测结果如下表所示，欢迎访问[OpenCompass 榜单](https://opencompass.org.cn/rank)获取更多的评测结果。
-
-| 数据集\模型 | **InternLM-Chat-7B** | **InternLM-7B** | LLaMA-7B | Baichuan-7B | ChatGLM2-6B | Alpaca-7B | Vicuna-7B |
-| --------------- | -------------------------- | --------------------- | -------- | ----------- | ----------- | --------- | --------- |
-| C-Eval(Val)     | 52.0                       | 53.4                  | 24.2     | 42.7        | 50.9        | 28.9      | 31.2      |
-| MMLU            | 52.6                       | 51.0                  | 35.2*    | 41.5        | 46.0        | 39.7      | 47.3      |
-| AGIEval         | 46.4                       | 37.6                  | 20.8     | 24.6        | 39.0        | 24.1      | 26.4      |
-| CommonSenseQA   | 80.8                       | 59.5                  | 65.0     | 58.8        | 60.0        | 68.7      | 66.7      |
-| BUSTM           | 80.6                       | 50.6                  | 48.5     | 51.3        | 55.0        | 48.8      | 62.5      |
-| CLUEWSC         | 81.8                       | 59.1                  | 50.3     | 52.8        | 59.8        | 50.3      | 52.2      |
-| MATH            | 5.0                        | 7.1                   | 2.8      | 3.0         | 6.6         | 2.2       | 2.8       |
-| GSM8K           | 36.2                       | 31.2                  | 10.1     | 9.7         | 29.2        | 6.0       | 15.3      |
-| HumanEval       | 15.9                       | 10.4                  | 14.0     | 9.2         | 9.2         | 9.2       | 11.0      |
-| RACE(High)      | 80.3                       | 57.4                  | 46.9*    | 28.1        | 66.3        | 40.7      | 54.0      |
-
- 以上评测结果基于 [OpenCompass 20230706](https://github.com/internLM/OpenCompass/) 获得（部分数据标注`*`代表数据来自原始论文），具体测试细节可参见 [OpenCompass](https://github.com/internLM/OpenCompass/) 中提供的配置文件。
- 评测数据会因 [OpenCompass](https://github.com/internLM/OpenCompass/) 的版本迭代而存在数值差异，请以 [OpenCompass](https://github.com/internLM/OpenCompass/) 最新版的评测结果为主。
-
-**局限性：** 尽管在训练过程中我们非常注重模型的安全性，尽力促使模型输出符合伦理和法律要求的文本，但受限于模型大小以及概率生成范式，模型可能会产生各种不符合预期的输出，例如回复内容包含偏见、歧视等有害内容，请勿传播这些内容。由于传播不良信息导致的任何后果，本项目不承担责任。
-
-</details>
-
-## 使用案例
-
-### 通过 Transformers 加载
-
-通过以下的代码从 Transformers 加载 InternLM 模型 （可修改模型名称替换不同的模型）
-
-```python
->>> from transformers import AutoTokenizer, AutoModelForCausalLM
->>> tokenizer = AutoTokenizer.from_pretrained("internlm/internlm-chat-7b", trust_remote_code=True)
->>> model = AutoModelForCausalLM.from_pretrained("internlm/internlm-chat-7b", trust_remote_code=True).cuda()
->>> model = model.eval()
->>> response, history = model.chat(tokenizer, "你好", history=[])
->>> print(response)
-你好！有什么我可以帮助你的吗？
->>> response, history = model.chat(tokenizer, "请提供三个管理时间的建议。", history=history)
->>> print(response)
-当然可以！以下是三个管理时间的建议：
-1. 制定计划：制定一个详细的计划，包括每天要完成的任务和活动。这将有助于您更好地组织时间，并确保您能够按时完成任务。
-2. 优先级：将任务按照优先级排序，先完成最重要的任务。这将确保您能够在最短的时间内完成最重要的任务，从而节省时间。
-3. 集中注意力：避免分心，集中注意力完成任务。关闭社交媒体和电子邮件通知，专注于任务，这将帮助您更快地完成任务，并减少错误的可能性。
-```
-
-### 通过 ModelScope 加载
-
-通过以下的代码从 ModelScope 加载 InternLM 模型 （可修改模型名称替换不同的模型）
-
-```python
-from modelscope import snapshot_download, AutoTokenizer, AutoModelForCausalLM
-import torch
-model_dir = snapshot_download('Shanghai_AI_Laboratory/internlm-chat-7b', revision='v1.0.0')
-tokenizer = AutoTokenizer.from_pretrained(model_dir, device_map="auto", trust_remote_code=True,torch_dtype=torch.float16)
-model = AutoModelForCausalLM.from_pretrained(model_dir,device_map="auto",  trust_remote_code=True,torch_dtype=torch.float16)
-model = model.eval()
-response, history = model.chat(tokenizer, "hello", history=[])
-print(response)
-response, history = model.chat(tokenizer, "please provide three suggestions about time management", history=history)
-print(response)
-```
-
-### 通过前端网页对话
-
-可以通过以下代码启动一个前端的界面来与 InternLM Chat 7B 模型进行交互
-
-```bash
-pip install streamlit==1.24.0
-pip install transformers==4.30.2
-streamlit run web_demo.py
-```
-
-效果如下
-
-![效果](https://github.com/InternLM/InternLM/assets/9102141/11b60ee0-47e4-42c0-8278-3051b2f17fe4)
-
-### 基于InternLM高性能部署
-
-我们使用 [LMDeploy](https://github.com/InternLM/LMDeploy) 完成 InternLM 的一键部署。
-
-1. 首先安装 LMDeploy:
-
-  ```shell
-  python3 -m pip install lmdeploy
-  ```
-
-2. 直接在本地，通过命令行，交互式和 InternLM 对话：
-
-  ```shell
-  lmdeploy chat turbomind InternLM/internlm-chat-7b --model-name internlm-chat-7b
-  ```
-
-1. 也可以使用如下命令启动推理服务：
-
-  ```shell
-  lmdeploy serve api_server InternLM/internlm-chat-7b --model-name internlm-chat-7b
-  ```
-
-请参考[此指南](https://github.com/InternLM/lmdeploy/blob/main/docs/en/restful_api.md)获取详细的api_server RESTful API信息，更多部署教程则可在[这里](https://github.com/InternLM/LMDeploy)找到。
-
-## 微调&训练
-
-### 预训练与微调使用教程
-
-请参考[使用教程](./doc/usage.md)开始InternLM的安装、数据处理、预训练与微调。
-
-### 转换为 Transformers 格式使用
-
-通过 InternLM 进行训练的模型可以很轻松地转换为 HuggingFace Transformers 格式，方便与社区各种开源项目无缝对接。借助 `tools/transformers/convert2hf.py` 可以将训练保存的权重一键转换为 transformers 格式
-
-```bash
-python tools/transformers/convert2hf.py --src_folder origin_ckpt/ --tgt_folder hf_ckpt/ --tokenizer ./tools/V7_sft.model
-```
-
-转换之后可以通过以下的代码加载为 transformers
-
-```python
->>> from transformers import AutoTokenizer, AutoModel
->>> model = AutoModel.from_pretrained("hf_ckpt/", trust_remote_code=True).cuda()
-```
-
-## 训练系统
-
-### 系统结构
-
-请参考[系统结构文档](./doc/structure.md)进一步了解。
-
-### 训练性能
-
-InternLM 深度整合了 Flash-Attention, Apex 等高性能模型算子，提高了训练效率。通过构建 Hybrid Zero 技术，实现计算和通信的高效重叠，大幅降低了训练过程中的跨节点通信流量。InternLM 支持 7B 模型从 8 卡扩展到 1024 卡，千卡规模下加速效率可高达 90%，训练吞吐超过 180TFLOPS，平均单卡每秒处理的 token 数量超过3600。下表为 InternLM 在不同配置下的扩展性测试数据：
-
-| GPU Number         | 8   | 16  | 32  | 64  | 128  | 256  | 512  | 1024  |
-| ---------------- | ---- | ---- | ---- | ---- | ----- | ----- | ----- | ------ |
-| TGS | 4078 | 3939 | 3919 | 3944 | 3928  | 3920  | 3835  | 3625   |
-| TFLOPS  | 193 | 191  | 188  | 188  | 187   | 185   | 186   | 184    |
-
-TGS 代表平均每GPU每秒可以处理的 Token 数量。更多的性能测试数据可参考[训练性能文档](./doc/train_performance.md)进一步了解。
-
-## 贡献
-
-我们感谢所有的贡献者为改进和提升 InternLM 所作出的努力。非常欢迎社区用户能参与进项目中来。请参考贡献指南来了解参与项目贡献的相关指引。
-
-## 致谢
-
-InternLM 代码库是一款由上海人工智能实验室和来自不同高校、企业的研发人员共同参与贡献的开源项目。我们感谢所有为项目提供新功能支持的贡献者，以及提供宝贵反馈的用户。 我们希望这个工具箱和基准测试可以为社区提供灵活高效的代码工具，供用户微调 InternLM 并开发自己的新模型，从而不断为开源社区提供贡献。特别鸣谢[flash-attention](https://github.com/HazyResearch/flash-attention) 与 [ColossalAI](https://github.com/hpcaitech/ColossalAI) 两项开源项目。
-
-## 开源许可证
-
-本仓库的代码依照 Apache-2.0 协议开源。模型权重对学术研究完全开放，也可申请免费的商业使用授权（[申请表](https://wj.qq.com/s2/12725412/f7c1/)）。其他问题与合作请联系 <internlm@pjlab.org.cn>。
-
-## 引用
-
-```
-@misc{2023internlm,
-    title={InternLM: A Multilingual Language Model with Progressively Enhanced Capabilities},
-    author={InternLM Team},
-    howpublished = {\url{https://github.com/InternLM/InternLM}},
-    year={2023}
-}
-```
--- a/README.md
+++ b/README.md
@ -17,7 +17,7 @@
 [![license](./assets/license.svg)](./LICENSE)
 [![evaluation](./assets/compass_support.svg)](https://github.com/internLM/OpenCompass/)
 <!-- [![Documentation Status](https://readthedocs.org/projects/internlm/badge/?version=latest)](https://internlm.readthedocs.io/zh_CN/latest/?badge=latest) -->
-[📘Inference](./inference) |
+[📘Chat](./chat) |
 [🛠️Agent](./agent) |
 [📊Evaluation](./evaluation) |
 [👀Model](./model_cards) |
@ -26,8 +26,7 @@
 [🤔Reporting Issues](https://github.com/InternLM/InternLM/issues/new)

 [English](./README.md) |
-[简体中文](./README-zh-Hans.md) |
-[日本語](./README-ja-JP.md)
+[简体中文](./README_zh-CN.md) |

 </div>

@ -37,17 +36,17 @@

 ## Introduction

- **Long (200k) context support**: Both base and chat models can work with more than 200K context after being sufficiently trained on 32K-context data. Try it with [LMDeploy](./inference/) for 200K-context inference.
+- **200K Context window**: Both base and chat models can work with more than 200K context after being sufficiently trained on 32K-context data. Try it with [LMDeploy](./inference/) for 200K-context inference.

- **Math & code capability**: InternLM2 series has significantly better performance on Math and Code as general base and chat models.
+- **Outstanding comprehensive performance**: Significantly better than the last generation in all dimensions including reasoning, code, chat experience, instruction following, and creative writing.

- **Language generation capability**: Better language generation (chat, writing, Chinese poem, etc.) quality due to high-quality training corpus.
+- **Code interpreter & Data analysis**: New state-of-the-art results in using code interpreter for math problems, also good at data analysis.

- **Agent capability**: State-of-the-art tool utilization capabilities, especially zero-shot tool calling and code interpreter for math and data analysis, in both [streaming](docs/chat_format.md##streaming-style) and [ReAct](docs/chat_format.md##react-style) format. Try it with [Lagent](./agent/).
+- **Stronger tool use**: Excellent zero-shot and multi-step tool calling capabilities, better with [streaming](docs/chat_format.md##streaming-style) and also works with [ReAct](docs/chat_format.md##react-style) format. Try it with [Lagent](./agent/).

 ## News

-[2024.01.17] We release InternLM2-7B and InternLM2-20B and their corresponding chat models with stronger capabilities in all dimensions. See [model zoo below](#model-zoo) or [model cards](./model_cards/) for more details.
+[2024.01.17] We release InternLM2-7B and InternLM2-20B and their corresponding chat models with stronger capabilities in all dimensions. See [model zoo below](#model-zoo) for download or [model cards](./model_cards/) for more details.

 [2023.12.13] InternLM-7B-Chat and InternLM-20B-Chat checkpoints are updated. With an improved finetuning strategy, the new chat models can generate higher quality responses with greater stylistic diversity.

@ -57,14 +56,14 @@

 | Model | Transformers(HF) | ModelScope(HF) | OpenXLab(HF) | Release Date |
 |---------------------------|------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------|
-| **InternLM2 Chat 20B**     | [🤗internlm/internlm-chat-20b](https://huggingface.co/internlm/internlm2-chat-20b)         | [<img src="./assets/modelscope_logo.png" width="20px" /> Shanghai_AI_Laboratory/internlm2-chat-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-chat-20b/summary)         | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-chat-20b)     | 2024-01-17   |
-| **InternLM2 20B** | [🤗internlm/internlm2-20b](https://huggingface.co/internlm/internlm2-20b) | [<img src="./assets/modelscope_logo.png" width="20px" /> Shanghai_AI_Laboratory/internlm2-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-20b/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-20b) | 2024-01-17 |
-| **InternLM2 Chat 20B SFT**     | [🤗internlm/internlm-chat-20b-sft](https://huggingface.co/internlm/internlm2-chat-20b-sft)         | [<img src="./assets/modelscope_logo.png" width="20px" /> Shanghai_AI_Laboratory/internlm2-chat-20b-sft](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-chat-20b-sft/summary)         | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-chat-20b-sft)     | 2024-01-17   |
-| **InternLM2 Base 20B** | [🤗internlm/internlm2-base-20b](https://huggingface.co/internlm/internlm2-base-20b) | [<img src="./assets/modelscope_logo.png" width="20px" /> Shanghai_AI_Laboratory/internlm2-base-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-base-20b/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-base-20b) | 2024-01-17 |
-| **InternLM2 Chat 7B**      | [🤗internlm/internlm2-chat-7b](https://huggingface.co/internlm/internlm2-chat-7b)           | [<img src="./assets/modelscope_logo.png" width="20px" /> Shanghai_AI_Laboratory/internlm2-chat-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-chat-7b/summary)           | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-chat-7b)      | 2024-01-17  |
-| **InternLM2 7B**           | [🤗internlm/internlm2-7b](https://huggingface.co/internlm/internlm2-7b)                     | [<img src="./assets/modelscope_logo.png" width="20px" /> Shanghai_AI_Laboratory/internlm2-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-7b/summary)                     | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-7b)           |  2024-01-17   |
-| **InternLM2 Chat 7B SFT**      | [🤗internlm/internlm2-chat-7b-sft](https://huggingface.co/internlm/internlm2-chat-7b-sft)           | [<img src="./assets/modelscope_logo.png" width="20px" /> Shanghai_AI_Laboratory/internlm2-chat-7b-sft](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-chat-7b-sft/summary)           | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-chat-7b-sft)      | 2024-01-17  |
-| **InternLM2 Base 7B**           | [🤗internlm/internlm2-base-7b](https://huggingface.co/internlm/internlm2-base-7b)                     | [<img src="./assets/modelscope_logo.png" width="20px" /> Shanghai_AI_Laboratory/internlm2-base-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-base-7b/summary)                     | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-base-7b)           |  2024-01-17   |
+| **InternLM2 Chat 20B**     | [🤗internlm/internlm-chat-20b](https://huggingface.co/internlm/internlm2-chat-20b)         | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-chat-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-chat-20b/summary)         | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-chat-20b)     | 2024-01-17   |
+| **InternLM2 20B** | [🤗internlm/internlm2-20b](https://huggingface.co/internlm/internlm2-20b) | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-20b/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-20b) | 2024-01-17 |
+| **InternLM2 Chat 20B SFT**     | [🤗internlm/internlm-chat-20b-sft](https://huggingface.co/internlm/internlm2-chat-20b-sft)         | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-chat-20b-sft](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-chat-20b-sft/summary)         | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-chat-20b-sft)     | 2024-01-17   |
+| **InternLM2 Base 20B** | [🤗internlm/internlm2-base-20b](https://huggingface.co/internlm/internlm2-base-20b) | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-base-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-base-20b/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-base-20b) | 2024-01-17 |
+| **InternLM2 Chat 7B**      | [🤗internlm/internlm2-chat-7b](https://huggingface.co/internlm/internlm2-chat-7b)           | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-chat-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-chat-7b/summary)           | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-chat-7b)      | 2024-01-17  |
+| **InternLM2 7B**           | [🤗internlm/internlm2-7b](https://huggingface.co/internlm/internlm2-7b)                     | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-7b/summary)                     | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-7b)           |  2024-01-17   |
+| **InternLM2 Chat 7B SFT**      | [🤗internlm/internlm2-chat-7b-sft](https://huggingface.co/internlm/internlm2-chat-7b-sft)           | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-chat-7b-sft](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-chat-7b-sft/summary)           | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-chat-7b-sft)      | 2024-01-17  |
+| **InternLM2 Base 7B**           | [🤗internlm/internlm2-base-7b](https://huggingface.co/internlm/internlm2-base-7b)                     | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-base-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-base-7b/summary)                     | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-base-7b)           |  2024-01-17   |

 **Note:**

@ -92,13 +91,6 @@ To load the InternLM2 7B Chat model using Transformers, use the following code:
 Hello! How can I help you today?
 >>> response, history = model.chat(tokenizer, "please provide three suggestions about time management", history=history)
 >>> print(response)
-Sure, here are three tips for effective time management:
-
-1. Prioritize tasks based on importance and urgency: Make a list of all your tasks and categorize them into "important and urgent," "important but not urgent," and "not important but urgent." Focus on completing the tasks in the first category before moving on to the others.
-2. Use a calendar or planner: Write down deadlines and appointments in a calendar or planner so you don't forget them. This will also help you schedule your time more effectively and avoid overbooking yourself.
-3. Minimize distractions: Try to eliminate any potential distractions when working on important tasks. Turn off notifications on your phone, close unnecessary tabs on your computer, and find a quiet place to work if possible.
-
-Remember, good time management skills take practice and patience. Start with small steps and gradually incorporate these habits into your daily routine.
 ```

 ### Import from ModelScope
@ -128,7 +120,7 @@ pip install transformers==4.30.2
 streamlit run ./chat/web_demo.py
 ```

-The effect is as follows
+The effect is similar to below:

 ![demo](https://github.com/InternLM/InternLM/assets/9102141/11b60ee0-47e4-42c0-8278-3051b2f17fe4)

@ -151,7 +143,9 @@ InternLM2-Chat models have excellent tool utilization capabilities and can work

 ## Fine-tuning

-Please refer to [finetune docs](./finetune) for fine-tuning with InternLM.
+Please refer to [finetune docs](./finetune/) for fine-tuning with InternLM.
+
+**Note:** We have migrated the whole training functionality in this project to [InternEvo](https://github.com/InternLM/InternEvo) for easier user experience, which provides efficient pre-training and fine-tuning infra for training InternLM.

 ## Contribution

--- a/README_zh-CN.md
+++ b/README_zh-CN.md
@ -0,0 +1,155 @@
+# InternLM
+
+<div align="center">
+
+<img src="./assets//logo.svg" width="200"/>
+  <div>&nbsp;</div>
+  <div align="center">
+    <b><font size="5">书生·浦语 官网</font></b>
+    <sup>
+      <a href="https://internlm.intern-ai.org.cn/">
+        <i><font size="4">HOT</font></i>
+      </a>
+    </sup>
+    <div>&nbsp;</div>
+  </div>
+
+[![license](./assets//license.svg)](https://github.com/open-mmlab/mmdetection/blob/main/LICENSE)
+[![evaluation](./assets//compass_support.svg)](https://github.com/internLM/OpenCompass/)
+<!-- [![Documentation Status](https://readthedocs.org/projects/internlm/badge/?version=latest)](https://internlm.readthedocs.io/zh_CN/latest/?badge=latest) -->
+
+[📘对话教程](./chat) |
+[🛠️智能体教程](./agent) |
+[📊评测](./evaluation) |
+[👀模型库](./model_cards) |
+[🤗HuggingFace](https://huggingface.co/spaces/internlm/internlm2-Chat-7B) |
+[🆕Update News](#news) |
+[🤔提交反馈](https://github.com/InternLM/InternLM/issues/new)
+
+[English](./README.md) |
+[简体中文](./README_zh-CN.md) |
+
+</div>
+
+<p align="center">
+    👋 加入我们的 <a href="https://discord.gg/xa29JuW87d" target="_blank">Discord</a> 和 <a href="https://github.com/InternLM/InternLM/assets/25839884/a6aad896-7232-4220-ac84-9e070c2633ce" target="_blank">微信社区</a>
+</p>
+
+## 简介
+
+InternLM 是一个开源的轻量级训练框架，旨在支持大模型训练而无需大量的依赖。通过单一的代码库，它支持在拥有数千个 GPU 的大型集群上进行预训练，并在单个 GPU 上进行微调，同时实现了卓越的性能优化。在1024个 GPU 上训练时，InternLM 可以实现近90%的加速效率。
+
+基于InternLM训练框架，我们已经发布了两个开源的预训练模型：InternLM-7B 和 InternLM-20B。
+
+## 更新
+
+[2024.01.17] 我们发布了 InternLM2-7B 和 InternLM2-20B 以及相关的对话模型，InternLM2 在数理、代码、对话、创作等各方面能力都获得了长足进步，综合性能达到开源模型的领先水平。可以点击 [下面的模型库](#model-zoo)进行下载或者[查看模型文档](./model_cards/)来了解更多细节.
+
+[2023.12.13] 我们更新了 InternLM-7B-Chat 和 InternLM-20B-Chat 模型权重。通过改进微调数据和训练策略，新版对话模型生成的回复质量更高、语言风格更加多元。
+
+[2023.09.20] InternLM-20B 已发布，包括基础版和对话版。
+
+## Model Zoo
+
+| Model | Transformers(HF) | ModelScope(HF) | OpenXLab(HF) | Release Date |
+|---------------------------|------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------|
+| **InternLM2 Chat 20B**     | [🤗internlm/internlm-chat-20b](https://huggingface.co/internlm/internlm2-chat-20b)         | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-chat-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-chat-20b/summary)         | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-chat-20b)     | 2024-01-17   |
+| **InternLM2 20B** | [🤗internlm/internlm2-20b](https://huggingface.co/internlm/internlm2-20b) | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-20b/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-20b) | 2024-01-17 |
+| **InternLM2 Chat 20B SFT**     | [🤗internlm/internlm-chat-20b-sft](https://huggingface.co/internlm/internlm2-chat-20b-sft)         | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-chat-20b-sft](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-chat-20b-sft/summary)         | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-chat-20b-sft)     | 2024-01-17   |
+| **InternLM2 Base 20B** | [🤗internlm/internlm2-base-20b](https://huggingface.co/internlm/internlm2-base-20b) | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-base-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-base-20b/summary) | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-base-20b) | 2024-01-17 |
+| **InternLM2 Chat 7B**      | [🤗internlm/internlm2-chat-7b](https://huggingface.co/internlm/internlm2-chat-7b)           | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-chat-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-chat-7b/summary)           | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-chat-7b)      | 2024-01-17  |
+| **InternLM2 7B**           | [🤗internlm/internlm2-7b](https://huggingface.co/internlm/internlm2-7b)                     | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-7b/summary)                     | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-7b)           |  2024-01-17   |
+| **InternLM2 Chat 7B SFT**      | [🤗internlm/internlm2-chat-7b-sft](https://huggingface.co/internlm/internlm2-chat-7b-sft)           | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-chat-7b-sft](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-chat-7b-sft/summary)           | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-chat-7b-sft)      | 2024-01-17  |
+| **InternLM2 Base 7B**           | [🤗internlm/internlm2-base-7b](https://huggingface.co/internlm/internlm2-base-7b)                     | [<img src="./assets/modelscope_logo.png" width="20px" /> internlm2-base-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-base-7b/summary)                     | [![Open in OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/internlm2-base-7b)           |  2024-01-17   |
+
+## 使用案例
+
+接下来我们展示使用 [Transformers](#import-from-transformers), [ModelScope](#import-from-modelscope), 和 [Web demo](#dialogue) 进行推理.
+对话模型采用了 [chatml 格式](./chat/chat_format.md) 来支持通用对话和智能体应用。
+
+### 通过 Transformers 加载
+
+通过以下的代码从 Transformers 加载 InternLM 模型 （可修改模型名称替换不同的模型）
+
+```python
+>>> from transformers import AutoTokenizer, AutoModelForCausalLM
+>>> tokenizer = AutoTokenizer.from_pretrained("internlm/internlm2-chat-7b", trust_remote_code=True)
+>>> model = AutoModelForCausalLM.from_pretrained("internlm/internlm2-chat-7b", trust_remote_code=True).cuda()
+>>> model = model.eval()
+>>> response, history = model.chat(tokenizer, "你好", history=[])
+>>> print(response)
+你好！有什么我可以帮助你的吗？
+>>> response, history = model.chat(tokenizer, "请提供三个管理时间的建议。", history=history)
+>>> print(response)
+```
+
+### 通过 ModelScope 加载
+
+通过以下的代码从 ModelScope 加载 InternLM 模型 （可修改模型名称替换不同的模型）
+
+```python
+from modelscope import snapshot_download, AutoTokenizer, AutoModelForCausalLM
+import torch
+model_dir = snapshot_download('Shanghai_AI_Laboratory/internlm2-chat-7b')
+tokenizer = AutoTokenizer.from_pretrained(model_dir, device_map="auto", trust_remote_code=True,torch_dtype=torch.float16)
+model = AutoModelForCausalLM.from_pretrained(model_dir,device_map="auto",  trust_remote_code=True,torch_dtype=torch.float16)
+model = model.eval()
+response, history = model.chat(tokenizer, "hello", history=[])
+print(response)
+response, history = model.chat(tokenizer, "please provide three suggestions about time management", history=history)
+print(response)
+```
+
+### 通过前端网页对话
+
+可以通过以下代码启动一个前端的界面来与 InternLM Chat 7B 模型进行交互
+
+```bash
+pip install streamlit==1.24.0
+pip install transformers==4.30.2
+streamlit run ./chat/web_demo.py
+```
+
+效果如下
+
+![效果](https://github.com/InternLM/InternLM/assets/9102141/11b60ee0-47e4-42c0-8278-3051b2f17fe4)
+
+### 基于InternLM高性能部署
+
+我们使用 [LMDeploy](https://github.com/InternLM/LMDeploy) 完成 InternLM 的一键部署。
+
+```shell
+python3 -m pip install lmdeploy
+lmdeploy chat turbomind InternLM/internlm-chat-7b --model-name internlm-chat-7b
+```
+
+请参考[部署指南](./chat/lmdeploy.md)了解更多使用案例，更多部署教程则可在[这里](https://github.com/InternLM/LMDeploy)找到。
+
+## 微调&训练
+
+请参考[微调教程](./finetune/)尝试续训或微调 InternLM2。
+
+**注意：**本项目中的全量训练功能已经迁移到了[InternEvo](https://github.com/InternLM/InternEvo)以便捷用户的使用。InternEvo 提供了高效的预训练和微调基建用于训练 InternLM 系列模型。
+
+## 贡献
+
+我们感谢所有的贡献者为改进和提升 InternLM 所作出的努力。非常欢迎社区用户能参与进项目中来。请参考贡献指南来了解参与项目贡献的相关指引。
+
+## 致谢
+
+InternLM 代码库是一款由上海人工智能实验室和来自不同高校、企业的研发人员共同参与贡献的开源项目。我们感谢所有为项目提供新功能支持的贡献者，以及提供宝贵反馈的用户。 我们希望这个工具箱和基准测试可以为社区提供灵活高效的代码工具，供用户微调 InternLM 并开发自己的新模型，从而不断为开源社区提供贡献。特别鸣谢[flash-attention](https://github.com/HazyResearch/flash-attention) 与 [ColossalAI](https://github.com/hpcaitech/ColossalAI) 两项开源项目。
+
+## 开源许可证
+
+本仓库的代码依照 Apache-2.0 协议开源。模型权重对学术研究完全开放，也可申请免费的商业使用授权（[申请表](https://wj.qq.com/s2/12725412/f7c1/)）。其他问题与合作请联系 <internlm@pjlab.org.cn>。
+
+## 引用
+
+```
+@misc{2023internlm,
+    title={InternLM: A Multilingual Language Model with Progressively Enhanced Capabilities},
+    author={InternLM Team},
+    howpublished = {\url{https://github.com/InternLM/InternLM}},
+    year={2023}
+}
+```
--- a/chat/README.md
+++ b/chat/README.md
@ -0,0 +1,61 @@
+# Chat
+
+English | [简体中文](lmdeploy_zh_zh-CN.md)
+
+This document briefly shows how to use [Transformers](#import-from-transformers), [ModelScope](#import-from-modelscope), and [Web demos](#dialogue) to conduct inference with InternLM2-Chat.
+
+You can also know more about the [chatml format](./chat_format.md) and how to use [LMDeploy for inference and model serving](./lmdeploy.md).
+
+## Import from Transformers
+
+To load the InternLM2 7B Chat model using Transformers, use the following code:
+
+```python
+>>> from transformers import AutoTokenizer, AutoModelForCausalLM
+>>> tokenizer = AutoTokenizer.from_pretrained("internlm/internlm2-chat-7b", trust_remote_code=True)
+>>> model = AutoModelForCausalLM.from_pretrained("internlm/internlm2-chat-7b", trust_remote_code=True).cuda()
+>>> model = model.eval()
+>>> response, history = model.chat(tokenizer, "hello", history=[])
+>>> print(response)
+Hello! How can I help you today?
+>>> response, history = model.chat(tokenizer, "please provide three suggestions about time management", history=history)
+>>> print(response)
+Sure, here are three tips for effective time management:
+
+1. Prioritize tasks based on importance and urgency: Make a list of all your tasks and categorize them into "important and urgent," "important but not urgent," and "not important but urgent." Focus on completing the tasks in the first category before moving on to the others.
+2. Use a calendar or planner: Write down deadlines and appointments in a calendar or planner so you don't forget them. This will also help you schedule your time more effectively and avoid overbooking yourself.
+3. Minimize distractions: Try to eliminate any potential distractions when working on important tasks. Turn off notifications on your phone, close unnecessary tabs on your computer, and find a quiet place to work if possible.
+
+Remember, good time management skills take practice and patience. Start with small steps and gradually incorporate these habits into your daily routine.
+```
+
+## Import from ModelScope
+
+To load the InternLM model using ModelScope, use the following code:
+
+```python
+from modelscope import snapshot_download, AutoTokenizer, AutoModelForCausalLM
+import torch
+model_dir = snapshot_download('Shanghai_AI_Laboratory/internlm2-chat-7b')
+tokenizer = AutoTokenizer.from_pretrained(model_dir, device_map="auto", trust_remote_code=True,torch_dtype=torch.float16)
+model = AutoModelForCausalLM.from_pretrained(model_dir,device_map="auto",  trust_remote_code=True,torch_dtype=torch.float16)
+model = model.eval()
+response, history = model.chat(tokenizer, "hello", history=[])
+print(response)
+response, history = model.chat(tokenizer, "please provide three suggestions about time management", history=history)
+print(response)
+```
+
+## Dialogue
+
+You can interact with the InternLM Chat 7B model through a frontend interface by running the following code:
+
+```bash
+pip install streamlit==1.24.0
+pip install transformers==4.30.2
+streamlit run ./chat/web_demo.py
+```
+
+The effect is similar to below:
+
+![demo](https://github.com/InternLM/InternLM/assets/9102141/11b60ee0-47e4-42c0-8278-3051b2f17fe4)
--- a/chat/README_zh-CN.md
+++ b/chat/README_zh-CN.md
@ -0,0 +1,51 @@
+# 对话
+
+[English](lmdeploy.md) | 简体中文
+
+本文介绍采用 [Transformers](#import-from-transformers)、[ModelScope](#import-from-modelscope)、[Web demos](#dialogue)
+对 InternLM2-Chat 进行推理。
+
+你还可以进一步了解 InternLM2-Chat 采用的[对话格式](./chat_format_zh-CN.md)，以及如何[用 LMDeploy 进行推理或部署服务](./lmdeploy_zh-CN.md)，或者尝试用 [OpenAOE](./openaoe.md) 与多个模型对话。
+
+## 通过 Transformers 加载
+
+通过以下的代码从 Transformers 加载 InternLM 模型 （可修改模型名称替换不同的模型）
+
+```python
+>>> from transformers import AutoTokenizer, AutoModelForCausalLM
+>>> tokenizer = AutoTokenizer.from_pretrained("internlm/internlm2-chat-7b", trust_remote_code=True)
+>>> model = AutoModelForCausalLM.from_pretrained("internlm/internlm2-chat-7b", trust_remote_code=True).cuda()
+>>> model = model.eval()
+>>> response, history = model.chat(tokenizer, "你好", history=[])
+>>> print(response)
+你好！有什么我可以帮助你的吗？
+>>> response, history = model.chat(tokenizer, "请提供三个管理时间的建议。", history=history)
+>>> print(response)
+```
+
+### 通过 ModelScope 加载
+
+通过以下的代码从 ModelScope 加载 InternLM2-Chat 模型 （可修改模型名称替换不同的模型）
+
+```python
+from modelscope import snapshot_download, AutoTokenizer, AutoModelForCausalLM
+import torch
+model_dir = snapshot_download('Shanghai_AI_Laboratory/internlm2-chat-7b', revision='v1.0.0')
+tokenizer = AutoTokenizer.from_pretrained(model_dir, device_map="auto", trust_remote_code=True,torch_dtype=torch.float16)
+model = AutoModelForCausalLM.from_pretrained(model_dir,device_map="auto",  trust_remote_code=True,torch_dtype=torch.float16)
+model = model.eval()
+response, history = model.chat(tokenizer, "hello", history=[])
+print(response)
+response, history = model.chat(tokenizer, "please provide three suggestions about time management", history=history)
+print(response)
+```
+
+## 通过前端网页对话
+
+可以通过以下代码启动一个前端的界面来与 InternLM2 Chat 7B 模型进行交互
+
+```bash
+pip install streamlit==1.24.0
+pip install transformers==4.30.2
+streamlit run ./web_demo.py
+```
--- a/chat/chat_format.md
+++ b/chat/chat_format.md
--- a/chat/chat_format_zh-CN.md
+++ b/chat/chat_format_zh-CN.md
@ -0,0 +1,99 @@
+# 对话格式
+
+[English](chat_format.md) | 简体中文
+
+InternLM2-Chat 采用了全新的对话格式，以灵活地支持工具调用等更广泛的应用，并避免用户输入的攻击。新的对话格式和 [ChatML](https://github.com/openai/openai-python/blob/release-v0.28.0/chatml.md) 格式类似，但是为了支持通用的智能体应用，在 `system`，`user`，`assistant` 的基础上，引入了 `environment` 角色。
+
+## 基本结构
+
+常规的对话结构一般包含 `system`，`user`，`assistant` 三个角色，采用如下格式进行多轮对话
+
+```
+[UNUSED_TOKEN_146]system
+你是书生浦语2，一个无害的人工智能助手[UNUSED_TOKEN_145]
+[UNUSED_TOKEN_146]user
+你好呀[UNUSED_TOKEN_145]
+[UNUSED_TOKEN_146]assistant
+你好，我是书生浦语，请问有什么可以帮助你的吗[UNUSED_TOKEN_145]
+```
+
+其中 `[UNUSED_TOKEN_146]` 充当了每轮对话开始符，`[UNUSED_TOKEN_145]` 充当了当前轮对话结束符。每轮对话一般以 `[UNUSED_TOKEN_146]role` 开头，以模型输出的 `[UNUSED_TOKEN_145]` 结尾，role 代表 `system`，`user`，`assistant` 和 `environment` 角色。目前，InternLM2-Chat 模型的词表中还维护了如下映射
+
+- `[UNUSED_TOKEN_146]`：每个角色对话的开始符
+- `[UNUSED_TOKEN_145]`：每个角色对话的结束符
+- `[UNUSED_TOKEN_144]`：模型调用外部插件的开始符
+- `[UNUSED_TOKEN_143]`：模型调用外部插件的结束符
+- `[UNUSED_TOKEN_142]`：代码解释器
+- `[UNUSED_TOKEN_141]`：外部插件，常规的 tools
+
+## 完整结构
+
+InternLM2-Chat 的完整对话格式在上述基本结构的基础上还包含了针对通用智能体的设计，其核心目的是采用流式格式，使得同一套格式在支持各种类插件拓展和智能体环境的同时能够和通用对话兼容。通用的智能体对话状态如下所示
+
+```
+[UNUSED_TOKEN_146]system
+你是书生浦语2，一个无害的人工智能助手[UNUSED_TOKEN_145]
+[UNUSED_TOKEN_146]system name=[UNUSED_TOKEN_141]
+[
+    {
+        "name": "get_current_weather",
+        "description": "Get the current weather in a given location",
+        "parameters": {
+            "type": "object",
+            "properties": {
+                "location": {
+                    "type": "string",
+                    "description": "The city and state, e.g. San Francisco, CA",
+                },
+                "unit": {"type": "string"},
+            },
+            "required": ["location"],
+        },
+    }
+]
+[UNUSED_TOKEN_145]
+[UNUSED_TOKEN_146]user
+请帮我对该数据集进行数据处理并可视化。
+[UNUSED_TOKEN_145]
+[UNUSED_TOKEN_146]user name=file
+{"path": "data.csv"}[UNUSED_TOKEN_145]
+[UNUSED_TOKEN_146]assistant
+我已经帮您处理了数据并进行了可视化。
+
+[UNUSED_TOKEN_144][UNUSED_TOKEN_142]
+```python
+import plotly.express as px
+import pandas as pd
+
+# Load the data into a pandas dataframe
+df = pd.read_csv('data.csv')
+
+# Create a scatter plot of rainfall vs wind direction
+fig = px.scatter(df, x='WindDir9am', y='Rainfall', color='WindDir3pm',
+                 labels={'WindDir9am': 'Wind Direction 9am', 'Rainfall': 'Rainfall', 'WindDir3pm': 'Wind Direction 3pm'},
+                 title='Rainfall vs Wind Direction',
+                 template='plotly_dark',
+                 width=600, height=500)
+
+# Add a hover effect to show the date
+fig.update_traces(hovertemplate='<b>Date: %{text}</b><br>Wind Direction 9am: %{x}<br>Rainfall: %{y}<br>Wind Direction 3pm: %{marker.color}')
+
+# Show the plot
+fig.show()
+```[UNUSED_TOKEN_143]
+[UNUSED_TOKEN_145]
+[UNUSED_TOKEN_146]environment name=[UNUSED_TOKEN_142]
+[UNUSED_TOKEN_145]
+[UNUSED_TOKEN_146]assistant
+由于代码输出没有包含在这里，我无法提供具体的图表内容。但是，如果代码执行无误，它会显示一个极坐标图，其中包含两个填充区域，一个代表上午9点的风向与降雨量的关系，另一个代表下午3点的风向与降雨量的关系。每个方向上的值是根据提供的数据集计算得到的平均降雨量。图表应该有一个清晰的标题、图例，并且能够直观地比较不同风向下的降雨量。由于使用了暗色主题，图表的整体外观应该是深色背景上的亮色线条和填充区域。[UNUSED_TOKEN_145]
+[UNUSED_TOKEN_146]user
+你能解释一下代码吗[UNUSED_TOKEN_145]
+[UNUSED_TOKEN_146]assistant
+我使用的是Python的Plotly库来创建一个极坐标图，以展示风向与降雨量之间的关系。代码中首先创建了一个Plotly的Figure对象。接着，我添加了两个`Scatterpolar`追踪。第一个追踪展示的是上午9点的风向与降雨量的平均值的关系，第二个追踪展示的是下午3点的风向与降雨量的平均值的关系。在这两个追踪中，`r`参数代表降雨量的平均值，`theta`参数代表风向。`fill='toself'`表示图形以自身为边界进行填充。然后我更新了图表的布局，隐藏了径向轴，添加了图例，设置了标题和标题字体，选择了暗色主题，并设定了图表的背景色、宽度和高度。最后，使用`fig.show()`展示了图表。
+[UNUSED_TOKEN_145]
+[UNUSED_TOKEN_146]user
+我想了解今天上海的天气[UNUSED_TOKEN_145]
+[UNUSED_TOKEN_144][UNUSED_TOKEN_141]
+{"name": "get_current_weather", "parameters": {"location": "上海"}}[UNUSED_TOKEN_143]
+[UNUSED_TOKEN_145]
+```
--- a/inference/lmdeploy.md
+++ b/inference/lmdeploy.md
@ -1,12 +1,11 @@
 # Inference by LMDeploy

-English | [简体中文](lmdeploy_zh_cn.md)
+English | [简体中文](lmdeploy_zh_zh-CN.md)

 [LMDeploy](https://github.com/InternLM/lmdeploy) is an efficient, user-friendly toolkit designed for compressing, deploying, and serving LLM models.

 This article primarily highlights the basic usage of LMDeploy. For a comprehensive understanding of the toolkit, we invite you to refer to [the tutorials](https://lmdeploy.readthedocs.io/en/latest/).

-
 ## Installation

 Install lmdeploy with pip (python 3.8+)
--- a/inference/lmdeploy_zh_cn.md
+++ b/inference/lmdeploy_zh_cn.md