fix Performance Evaluation of internlm2-1.8b

pull/703/head
Shuo Zhang 2024-02-21 17:54:50 +08:00
parent 33dae04941
commit ab9ae22031
2 changed files with 8 additions and 8 deletions

@@ -47,7 +47,7 @@ InternLM2 series are released with the following features:
## News
-\[2024.01.31\] We release InternLM2-1.8B, along with the associated chat model. This model provides a cheaper deployment option while maintaining leading performance.
+\[2024.01.31\] We release InternLM2-1.8B, along with the associated chat model. They provide a cheaper deployment option while maintaining leading performance.
\[2024.01.23\] We release InternLM2-Math-7B and InternLM2-Math-20B with pretraining and SFT checkpoints. They surpass ChatGPT with small sizes. See [InternLM-Math](https://github.com/InternLM/internlm-math) for details and download.

@@ -27,13 +27,13 @@ We have evaluated InternLM2 on several important benchmarks using the open-sourc
| Dataset\Models | InternLM2-1.8B | InternLM2-Chat-1.8B-SFT | InternLM2-Chat-1.8B | InternLM2-7B | InternLM2-Chat-7B |
| :---: | :---: | :---: | :---: | :---: | :---: |
-| MMLU | 46.9 | 47.1 | 47.1 | 65.8 | 63.7 |
-| AGIEval | 33.4 | 38.8 | 38.7 | 49.9 | 47.2 |
-| BBH | 37.5 | 35.2 | 36.1 | 65.0 | 61.2 |
-| GSM8K | 31.2 | 39.7 | 40.9 | 70.8 | 70.7 |
-| MATH | 5.6 | 11.8 | 12.1 | 20.2 | 23.0 |
-| HumanEval | 25.0 | 32.9 | 34.2 | 43.3 | 59.8 |
-| MBPP(Sanitized) | 22.2 | 23.2 | 26.6 | 51.8 | 51.4 |
+| MMLU | 46.9 | 47.1 | 44.1 | 65.8 | 63.7 |
+| AGIEval | 33.4 | 38.8 | 34.6 | 49.9 | 47.2 |
+| BBH | 37.5 | 35.2 | 34.3 | 65.0 | 61.2 |
+| GSM8K | 31.2 | 39.7 | 34.3 | 70.8 | 70.7 |
+| MATH | 5.6 | 11.8 | 10.7 | 20.2 | 23.0 |
+| HumanEval | 25.0 | 32.9 | 29.3 | 43.3 | 59.8 |
+| MBPP(Sanitized) | 22.2 | 23.2 | 27.0 | 51.8 | 51.4 |
- The evaluation results were obtained from [OpenCompass](https://github.com/open-compass/opencompass), and the evaluation configuration can be found in the configuration files provided by [OpenCompass](https://github.com/open-compass/opencompass).
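
For context, OpenCompass evaluations like the ones above are driven by Python config files that pair dataset configs with model configs. The sketch below is a minimal, hypothetical example of such a config; the module paths (`mmlu_gen`, `gsm8k_gen`, `hf_internlm2_chat_1_8b`) and file names are assumptions based on OpenCompass's documented config style, not taken from this commit.

```python
# Hypothetical OpenCompass config sketch: evaluate InternLM2-Chat-1.8B on MMLU and GSM8K.
# Module paths below are assumptions; check the configs shipped with OpenCompass.
from mmengine.config import read_base

with read_base():
    # Dataset configs bundled with OpenCompass.
    from .datasets.mmlu.mmlu_gen import mmlu_datasets
    from .datasets.gsm8k.gsm8k_gen import gsm8k_datasets
    # HuggingFace model config for internlm2-chat-1_8b (name assumed).
    from .models.hf_internlm.hf_internlm2_chat_1_8b import models

# OpenCompass picks up the top-level `datasets` and `models` lists from this config.
datasets = [*mmlu_datasets, *gsm8k_datasets]
```

Such a config would then be launched with something like `python run.py configs/eval_internlm2_chat_1_8b.py -w outputs/internlm2_1_8b` (paths assumed), with results collected under the chosen working directory.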