diff --git a/README.md b/README.md
index 3ce5fee..d6624b1 100644
--- a/README.md
+++ b/README.md
@@ -213,7 +213,7 @@ We utilize [OpenCompass](https://github.com/open-compass/opencompass) for model
 
 ### Objective Evaluation
 
-To evaluate the InternLM model, please follow the guidelines in the [OpenCompass tutorial](https://github.com/open-compass/opencompass). Typically, we use `ppl` for multiple-choice questions on the **Base** model and `gen` for all questions on the **Chat** model.
+To evaluate the InternLM model, please follow the guidelines in the [OpenCompass tutorial](https://opencompass.readthedocs.io/en/latest/get_started/installation.html). Typically, we use `ppl` for multiple-choice questions on the **Base** model and `gen` for all questions on the **Chat** model.
 
 ### Long-Context Evaluation (Needle in a Haystack)
 
diff --git a/README_zh-CN.md b/README_zh-CN.md
index aa0a80e..228878d 100644
--- a/README_zh-CN.md
+++ b/README_zh-CN.md
@@ -201,14 +201,13 @@ print(response)
 
 **注意：**本项目中的全量训练功能已经迁移到了[InternEvo](https://github.com/InternLM/InternEvo)以便捷用户的使用。InternEvo 提供了高效的预训练和微调基建用于训练 InternLM 系列模型。
 
-
 ## 评测
 
 我们使用 [OpenCompass](https://github.com/open-compass/opencompass) 进行模型评估。在 InternLM-2 中，我们主要标准客观评估、长文评估（大海捞针）、数据污染评估、智能体评估和主观评估。
 
 ### 标准客观评测
 
-请按照 [OpenCompass 教程](https://github.com/open-compass/opencompass) 进行客观评测。我们通常在 **Base** 模型上使用 `ppl` 进行多项选择题，在 **Chat** 模型上使用 `gen` 进行所有问题。
+请按照 [OpenCompass 教程](https://opencompass.readthedocs.io/zh-cn/latest/get_started/installation.html) 进行客观评测。我们通常在 Base 模型上使用 ppl 进行多项选择题评测，在 Chat 模型上使用 gen 进行所有问题的答案生成和评测。
 
 ### 长文评估（大海捞针）